Pesquisa | Portal Regional da BVS

A hybrid approach to extract protein-protein interactions.

Bui, Quoc-Chinh; Katrenko, Sophia; Sloot, Peter M A.

Bioinformatics ; 27(2): 259-65, 2011 Jan 15.

Artigo em Inglês | MEDLINE | ID: mdl-21062765

RESUMO

MOTIVATION: Protein-protein interactions (PPIs) play an important role in understanding biological processes. Although recent research in text mining has achieved a significant progress in automatic PPI extraction from literature, performance of existing systems still needs to be improved. RESULTS: In this study, we propose a novel algorithm for extracting PPIs from literature which consists of two phases. First, we automatically categorize the data into subsets based on its semantic properties and extract candidate PPI pairs from these subsets. Second, we apply support vector machines (SVMs) to classify candidate PPI pairs using features specific for each subset. We obtain promising results on five benchmark datasets: AIMed, BioInfer, HPRD50, IEPA and LLL with F-scores ranging from 60% to 84%, which are comparable with the state-of-the-art PPI extraction systems. Furthermore, our system achieves the best performance on cross-corpora evaluation and comparative performance in terms of computational efficiency. AVAILABILITY: The source code and scripts used in this article are available for academic use at http://staff.science.uva.nl/~bui/PPIs.zip CONTACT: bqchinh@gmail.com.

Assuntos

Mineração de Dados/métodos , Mapeamento de Interação de Proteínas , Algoritmos , Animais , Inteligência Artificial

Structuring and extracting knowledge for the support of hypothesis generation in molecular biology.

Roos, Marco; Marshall, M Scott; Gibson, Andrew P; Schuemie, Martijn; Meij, Edgar; Katrenko, Sophia; van Hage, Willem Robert; Krommydas, Konstantinos; Adriaans, Pieter W.

BMC Bioinformatics ; 10 Suppl 10: S9, 2009 Oct 01.

Artigo em Inglês | MEDLINE | ID: mdl-19796406

RESUMO

BACKGROUND: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. RESULTS: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. CONCLUSION: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.

Assuntos

Armazenamento e Recuperação da Informação/métodos , Biologia Molecular , Biologia Computacional/métodos , Bases de Dados Factuais , Internet , PubMed , Software , Vocabulário Controlado

Overview of BioCreative II gene mention recognition.

Smith, Larry; Tanabe, Lorraine K; Ando, Rie Johnson nee; Kuo, Cheng-Ju; Chung, I-Fang; Hsu, Chun-Nan; Lin, Yu-Shi; Klinger, Roman; Friedrich, Christoph M; Ganchev, Kuzman; Torii, Manabu; Liu, Hongfang; Haddow, Barry; Struble, Craig A; Povinelli, Richard J; Vlachos, Andreas; Baumgartner, William A; Hunter, Lawrence; Carpenter, Bob; Tsai, Richard Tzong-Han; Dai, Hong-Jie; Liu, Feng; Chen, Yifei; Sun, Chengjie; Katrenko, Sophia; Adriaans, Pieter; Blaschke, Christian; Torres, Rafael; Neves, Mariana; Nakov, Preslav; Divoli, Anna; Maña-López, Manuel; Mata, Jacinto; Wilbur, W John.

Genome Biol ; 9 Suppl 2: S2, 2008.

Artigo em Inglês | MEDLINE | ID: mdl-18834493

RESUMO

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.

Assuntos

Biologia Computacional/métodos , Genes , Sociedades Científicas , Congressos como Assunto

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA