Search | BVS Regional Portal (test)

Comparing the prediction performance of item response theory and machine learning methods on item responses for educational assessments.

Park, Jung Yeon; Dedja, Klest; Pliakos, Konstantinos; Kim, Jinho; Joo, Sean; Cornillie, Frederik; Vens, Celine; Van den Noortgate, Wim.

Behav Res Methods ; 55(4): 2109-2124, 2023 Jun.

Article in English | MEDLINE | ID: mdl-35819719

ABSTRACT

To obtain more accurate and robust feedback information from the students' assessment outcomes and to communicate it to students and optimize teaching and learning strategies, educational researchers and practitioners must critically reflect on whether the existing methods of data analytics are capable of retrieving the information provided in the database. This study compared and contrasted the prediction performance of an item response theory method, particularly the use of an explanatory item response model (EIRM), and six supervised machine learning (ML) methods for predicting students' item responses in educational assessments, considering student- and item-related background information. Each of the seven prediction methods was evaluated through cross-validation approaches under three prediction scenarios: (a) unrealized responses of new students to existing items, (b) unrealized responses of existing students to new items, and (c) missing responses of existing students to existing items. The results of a simulation study and two real-life assessment data examples showed that employing student- and item-related background information in addition to the item response data substantially increases the prediction accuracy for new students or items. We also found that the EIRM is as competitive as the best performing ML methods in predicting the student performance outcomes for the educational assessment datasets.
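The three prediction scenarios listed in this abstract differ only in which cells of the student-by-item response matrix are held out. The sketch below is a minimal illustration of that idea and is not taken from the paper: the function name `split_responses`, its parameters, and the single hold-out split (rather than full cross-validation) are all assumptions made for the example.

```python
import random


def split_responses(n_students, n_items, scenario, test_fraction=0.2, seed=0):
    """Split (student, item) index pairs into train and test lists.

    Hypothetical helper, not from the paper. Scenarios:
      "new_students": every response of the held-out students is test data.
      "new_items":    every response to the held-out items is test data.
      "missing":      individual responses are held out at random.
    """
    rng = random.Random(seed)
    pairs = [(s, i) for s in range(n_students) for i in range(n_items)]

    if scenario == "new_students":
        k = max(1, int(test_fraction * n_students))
        held = set(rng.sample(range(n_students), k))
        test = [p for p in pairs if p[0] in held]
    elif scenario == "new_items":
        k = max(1, int(test_fraction * n_items))
        held = set(rng.sample(range(n_items), k))
        test = [p for p in pairs if p[1] in held]
    elif scenario == "missing":
        k = max(1, int(test_fraction * len(pairs)))
        test = rng.sample(pairs, k)
    else:
        raise ValueError(f"unknown scenario: {scenario!r}")

    test_set = set(test)
    train = [p for p in pairs if p not in test_set]
    return train, test
```

In the first two scenarios the held-out students or items never appear in the training pairs, so a model has only background covariates to rely on for them; this is the setting in which the abstract reports that background information helps most.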

Subjects

Educational Measurement, Students, Humans, Computer Simulation, Educational Status, Machine Learning

Predicting Drug-Target Interactions With Multi-Label Classification and Label Partitioning.

Pliakos, Konstantinos; Vens, Celine; Tsoumakas, Grigorios.

IEEE/ACM Trans Comput Biol Bioinform ; 18(4): 1596-1607, 2021.

Article in English | MEDLINE | ID: mdl-31689203

ABSTRACT

Identifying drug-target interactions is crucial for drug discovery. Despite modern technologies used in drug screening, experimental identification of drug-target interactions is an extremely demanding task. Predicting drug-target interactions in silico can thereby facilitate drug discovery as well as drug repositioning. Various machine learning models have been developed over the years to predict such interactions. Multi-output learning models in particular have drawn the attention of the scientific community due to their high predictive performance and computational efficiency. These models are based on the assumption that all the labels are correlated with each other. However, this assumption is too optimistic. Here, we address drug-target interaction prediction as a multi-label classification task that is combined with label partitioning. We show that building multi-output learning models over groups (clusters) of labels often leads to superior results. The performed experiments confirm the efficiency of the proposed framework.
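To make the idea of label partitioning concrete, the sketch below groups the columns of a binary label matrix so that a separate multi-output model could be trained per group. It is an assumption-laden illustration, not the partitioning method of the paper: the greedy Jaccard-based grouping, the function names `jaccard` and `partition_labels`, and the `threshold` parameter are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def partition_labels(label_matrix, threshold=0.5):
    """Greedily group label columns whose positive rows overlap.

    Hypothetical helper: a label joins the first existing group whose
    first member has Jaccard similarity >= threshold with it; otherwise
    it starts a new group. Returns a list of lists of column indices.
    """
    n_labels = len(label_matrix[0])
    # For each label, the set of rows (e.g. drugs) where it is positive.
    positives = [
        {r for r, row in enumerate(label_matrix) if row[j]}
        for j in range(n_labels)
    ]
    groups = []
    for j in range(n_labels):
        for group in groups:
            if jaccard(positives[j], positives[group[0]]) >= threshold:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups
```

Each returned group would then get its own multi-output learner, so that only labels that actually co-occur are modelled jointly.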

Subjects

Computational Biology/methods, Drug Development/methods, Drug Discovery/methods, Machine Learning

Drug-target interaction prediction with tree-ensemble learning and output space reconstruction.

Pliakos, Konstantinos; Vens, Celine.

BMC Bioinformatics ; 21(1): 49, 2020 Feb 07.

Article in English | MEDLINE | ID: mdl-32033537

ABSTRACT

BACKGROUND: Computational prediction of drug-target interactions (DTI) is vital for drug discovery. The experimental identification of interactions between drugs and target proteins is very onerous. Modern technologies have mitigated the problem, leveraging the development of new drugs. However, drug development remains extremely expensive and time consuming. Therefore, in silico DTI predictions based on machine learning can alleviate the burdensome task of drug development. Many machine learning approaches have been proposed over the years for DTI prediction. Nevertheless, prediction accuracy and efficiency are persisting problems that still need to be tackled. Here, we propose a new learning method which addresses DTI prediction as a multi-output prediction task by learning ensembles of multi-output bi-clustering trees (eBICT) on reconstructed networks. In our setting, the nodes of a DTI network (drugs and proteins) are represented by features (background information). The interactions between the nodes of a DTI network are modeled as an interaction matrix and compose the output space in our problem. The proposed approach integrates background information from both drug and target protein spaces into the same global network framework. RESULTS: We performed an empirical evaluation, comparing the proposed approach to state of the art DTI prediction methods and demonstrated the effectiveness of the proposed approach in different prediction settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein networks. We show that output space reconstruction can boost the predictive performance of tree-ensemble learning methods, yielding more accurate DTI predictions. CONCLUSIONS: We proposed a new DTI prediction method where bi-clustering trees are built on reconstructed networks. Building tree-ensemble learning models with output space reconstruction leads to superior prediction results, while preserving the advantages of tree-ensembles, such as scalability, interpretability and inductive setting.

Subjects

Drug Discovery/methods, Machine Learning, Proteins/drug effects, Cluster Analysis, Computer Simulation, Drug Development

Network inference with ensembles of bi-clustering trees.

Pliakos, Konstantinos; Vens, Celine.

BMC Bioinformatics ; 20(1): 525, 2019 Oct 28.

Article in English | MEDLINE | ID: mdl-31660848

ABSTRACT

BACKGROUND: Network inference is crucial for biomedicine and systems biology. Biological entities and their associations are often modeled as interaction networks. Examples include drug-protein interaction or gene regulatory networks. Studying and elucidating such networks can lead to the comprehension of complex biological processes. However, usually we have only partial knowledge of those networks and the experimental identification of all the existing associations between biological entities is very time consuming and particularly expensive. Many computational approaches have been proposed over the years for network inference; nonetheless, efficiency and accuracy are still persisting open problems. Here, we propose bi-clustering tree ensembles as a new machine learning method for network inference, extending the traditional tree-ensemble models to the global network setting. The proposed approach addresses the network inference problem as a multi-label classification task. More specifically, the nodes of a network (e.g., drugs or proteins in a drug-protein interaction network) are modelled as samples described by features (e.g., chemical structure similarities or protein sequence similarities). The labels in our setting represent the presence or absence of links connecting the nodes of the interaction network (e.g., drug-protein interactions in a drug-protein interaction network). RESULTS: We extended traditional tree-ensemble methods, such as extremely randomized trees (ERT) and random forests (RF) to ensembles of bi-clustering trees, integrating background information from both node sets of a heterogeneous network into the same learning framework. We performed an empirical evaluation, comparing the proposed approach to currently used tree-ensemble based approaches as well as other approaches from the literature. We demonstrated the effectiveness of our approach in different interaction prediction (network inference) settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein and gene regulatory networks. We also applied our proposed method to two versions of a chemical-protein association network extracted from the STITCH database, demonstrating the potential of our model in predicting non-reported interactions. CONCLUSIONS: Bi-clustering trees outperform existing tree-based strategies as well as machine learning methods based on other algorithms. Since our approach is based on tree-ensembles it inherits the advantages of tree-ensemble learning, such as handling of missing values, scalability and interpretability.
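The data layout described in this abstract (two node sets with their own features, plus an interaction matrix of link labels) can be flattened into one pairwise dataset, which is one common way to feed both feature spaces to a single learner. The sketch below shows only that generic pairing step; it is not the bi-clustering tree algorithm itself, and the function name `build_pair_dataset` and the toy values are assumptions.

```python
def build_pair_dataset(row_features, col_features, interactions):
    """Flatten a two-sided network into (features, label) pairs.

    Hypothetical helper. row_features[r] describes node r of the first
    set (e.g. a drug), col_features[c] describes node c of the second
    set (e.g. a protein), and interactions[r][c] is 1 if a link is known.
    Each pair (r, c) becomes one sample whose feature vector is the
    concatenation of both nodes' features.
    """
    X, y = [], []
    for r, r_feat in enumerate(row_features):
        for c, c_feat in enumerate(col_features):
            X.append(list(r_feat) + list(c_feat))
            y.append(interactions[r][c])
    return X, y


# Toy example: 2 drugs with 2 features each, 3 proteins with 1 feature each.
drugs = [[0.1, 0.9], [0.8, 0.2]]
proteins = [[1.0], [0.5], [0.0]]
links = [[1, 0, 0], [0, 1, 1]]
X, y = build_pair_dataset(drugs, proteins, links)
```

The pairwise table grows as the product of the two node-set sizes, which is one practical reason methods in this area care about scalability.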

Subjects

Cluster Analysis, Algorithms, Factual Databases, Gene Regulatory Networks, Machine Learning, Protein Interaction Maps, Proteins/metabolism

Mining features for biomedical data using clustering tree ensembles.

Pliakos, Konstantinos; Vens, Celine.

J Biomed Inform ; 85: 40-48, 2018 Sep.

Article in English | MEDLINE | ID: mdl-30012356

ABSTRACT

The volume of biomedical data available to the machine learning community grows very rapidly. A reasonable question is how informative these data really are or how discriminant the features describing the data instances are. Several biomedical datasets suffer from lack of variance in the instance representation, or even worse, contain instances with identical features and different class labels. Indisputably, this directly affects the performance of machine learning algorithms, as well as the ability to interpret their results. In this article, we focus on the aforementioned problem and propose a target-informed feature induction method based on tree ensemble learning. The method brings more variance into the data representation, thereby potentially increasing predictive performance of a learner applied to the induced features. The contribution of this article is twofold. Firstly, a problem affecting the quality of biomedical data is highlighted, and secondly, a method to handle that problem is proposed. The efficiency of the presented approach is validated on multi-target prediction tasks. The obtained results indicate that the proposed approach is able to boost the discrimination between the data instances and increase the predictive performance.

Subjects

Cluster Analysis, Data Mining/methods, Decision Trees, Machine Learning, Algorithms, Computational Biology, Factual Databases/statistics & numerical data, Escherichia coli/genetics, Escherichia coli/metabolism, Gene Regulatory Networks, Humans, Metabolic Networks and Pathways, Protein Interaction Maps, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism

Erratum to: Making sense of big data in health research: towards an EU action plan.

Auffray, Charles; Balling, Rudi; Barroso, Inês; Bencze, László; Benson, Mikael; Bergeron, Jay; Bernal-Delgado, Enrique; Blomberg, Niklas; Bock, Christoph; Conesa, Ana; Del Signore, Susanna; Delogne, Christophe; Devilee, Peter; Di Meglio, Alberto; Eijkemans, Marinus; Flicek, Paul; Graf, Norbert; Grimm, Vera; Guchelaar, Henk-Jan; Guo, Yi-Ke; Gut, Ivo Glynne; Hanbury, Allan; Hanif, Shahid; Hilgers, Ralf-Dieter; Honrado, Ángel; Hose, D Rod; Houwing-Duistermaat, Jeanine; Hubbard, Tim; Janacek, Sophie Helen; Karanikas, Haralampos; Kievits, Tim; Kohler, Manfred; Kremer, Andreas; Lanfear, Jerry; Lengauer, Thomas; Maes, Edith; Meert, Theo; Müller, Werner; Nickel, Dörthe; Oledzki, Peter; Pedersen, Bertrand; Petkovic, Milan; Pliakos, Konstantinos; Rattray, Magnus; I Màs, Josep Redón; Schneider, Reinhard; Sengstag, Thierry; Serra-Picamal, Xavier; Spek, Wouter; Vaas, Lea A I.

Genome Med ; 8(1): 118, 2016 Nov 07.

Article in English | MEDLINE | ID: mdl-27821178

Making sense of big data in health research: Towards an EU action plan.

Genome Med ; 8(1): 71, 2016 Jun 23.

Article in English | MEDLINE | ID: mdl-27338147

ABSTRACT

Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of "big data" for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.

Subjects

Biomedical Research/legislation & jurisprudence, Factual Databases/standards, European Union/organization & administration, Biomedical Research/standards, Factual Databases/legislation & jurisprudence, Health Plan Implementation, Humans, Information Dissemination/legislation & jurisprudence

Decreased CD3+CD16+ natural killer-like T-cell percentage and zeta-chain expression accompany chronic inflammation in haemodialysis patients.

Eleftheriadis, Theodoros; Kartsios, Charalambos; Yiannaki, Efi; Antoniadi, Georgia; Kazila, Polizo; Pliakos, Konstantinos; Liakopoulos, Vassilios; Markala, Dimitra.

Nephrology (Carlton) ; 14(5): 471-5, 2009 Aug.

Article in English | MEDLINE | ID: mdl-19486472

ABSTRACT

AIM: Clinical and experimental data indicate a deficient immune response in haemodialysis (HD) patients. Natural killer-like (NKL) T cells express on their surface both the T-cell antigen receptor (TCR) and a diverse set of NK-cell receptors (NKR) and share properties of both T cells and NK cells. Zeta-chain phosphorylation is an early event that follows TCR activation or some NKR activation. The zeta-chain of both T cells and NK cells is downregulated in many chronic inflammatory states, HD included. In the present study, NKL T-cell percentage and zeta-chain expression in HD patients were evaluated. METHODS: Thirty-three stable HD patients and 30 healthy volunteers were enrolled into the study. NKL T-cell percentage and NKL T-cell zeta-chain mean fluorescence intensity (MFI) were evaluated with flow cytometry. The inflammatory markers C-reactive protein, interleukin-6 and tumour necrosis factor-alpha were measured in the serum by means of enzyme-linked immunosorbent assay. RESULTS: All the evaluated markers of inflammation were increased in HD patients. In these patients, NKL T-cell percentage (1.71 ± 1.69% vs 3.94 ± 3.86%) and zeta-chain MFI (3.66 ± 2.79 vs 7.03 ± 7.91) were decreased. CONCLUSIONS: NKL T-cell percentage and zeta-chain expression are decreased in HD patients. Taking into consideration the continuously increasing age of the HD patients and that normally NKL T-cell numbers increase with age counteracting the impaired T-cell and NK-cell function accompanying advancing age, the above NKL T-cell disturbances could contribute to the impaired immune response in this population. Measures towards alleviating chronic inflammation could partially restore NKL T-cell impairment.

Subjects

CD3 Complex/analysis, Inflammation/etiology, Chronic Kidney Failure/immunology, Natural Killer Cells/immunology, T-Cell Antigen Receptors/analysis, IgG Receptors/analysis, Renal Dialysis, Adult, Aged, Aging/immunology, C-Reactive Protein/analysis, Chronic Disease, Female, Humans, Interleukin-6/blood, Male, Middle Aged
