Search | VHL Regional Portal (BVS)

Automating curation using a natural language processing pipeline.

Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Tobin, Richard; Wang, Xinglong.

Genome Biol ; 9 Suppl 2: S10, 2008.

Article in English | MEDLINE | ID: mdl-18834488

ABSTRACT

BACKGROUND: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general. RESULTS: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. CONCLUSION: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems.

Subjects

Automation, Natural Language Processing, Genes, Reproducibility of Results

Overview of BioCreative II gene mention recognition.

Smith, Larry; Tanabe, Lorraine K; Ando, Rie Johnson nee; Kuo, Cheng-Ju; Chung, I-Fang; Hsu, Chun-Nan; Lin, Yu-Shi; Klinger, Roman; Friedrich, Christoph M; Ganchev, Kuzman; Torii, Manabu; Liu, Hongfang; Haddow, Barry; Struble, Craig A; Povinelli, Richard J; Vlachos, Andreas; Baumgartner, William A; Hunter, Lawrence; Carpenter, Bob; Tsai, Richard Tzong-Han; Dai, Hong-Jie; Liu, Feng; Chen, Yifei; Sun, Chengjie; Katrenko, Sophia; Adriaans, Pieter; Blaschke, Christian; Torres, Rafael; Neves, Mariana; Nakov, Preslav; Divoli, Anna; Maña-López, Manuel; Mata, Jacinto; Wilbur, W John.

Genome Biol ; 9 Suppl 2: S2, 2008.

Article in English | MEDLINE | ID: mdl-18834493

ABSTRACT

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
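As an illustration of the F1 score quoted above, the following is a minimal sketch of how precision, recall, and F1 could be computed for gene mention spans. It assumes exact-match scoring over `(sentence_id, start, end)` tuples; the `Span` type, the function name, and the sample data are hypothetical and are not taken from the BioCreative II evaluation software, whose matching rules may differ.

```python
from typing import Set, Tuple

# Hypothetical span representation: (sentence_id, start_offset, end_offset).
Span = Tuple[int, int, int]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted spans against gold spans, assuming exact-match scoring."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example data, for illustration only.
gold_spans = {(0, 0, 4), (0, 10, 15), (1, 3, 8)}
predicted_spans = {(0, 0, 4), (1, 3, 8), (1, 20, 25)}
precision, recall, f1 = precision_recall_f1(gold_spans, predicted_spans)
```

Here two of the three predicted spans match gold spans, so precision, recall, and F1 are all 2/3.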

Subjects

Computational Biology/methods, Genes, Scientific Societies, Congresses as Topic

Introducing meta-services for biomedical information extraction.

Leitner, Florian; Krallinger, Martin; Rodriguez-Penagos, Carlos; Hakenberg, Jörg; Plake, Conrad; Kuo, Cheng-Ju; Hsu, Chun-Nan; Tsai, Richard Tzong-Han; Hung, Hsi-Chuan; Lau, William W; Johnson, Calvin A; Saetre, Rune; Yoshida, Kazuhiro; Chen, Yan Hua; Kim, Sun; Shin, Soo-Yong; Zhang, Byoung-Tak; Baumgartner, William A; Hunter, Lawrence; Haddow, Barry; Matthews, Michael; Wang, Xinglong; Ruch, Patrick; Ehrler, Frédéric; Ozgür, Arzucan; Erkan, Günes; Radev, Dragomir R; Krauthammer, Michael; Luong, ThaiBinh; Hoffmann, Robert; Sander, Chris; Valencia, Alfonso.

Genome Biol ; 9 Suppl 2: S6, 2008.

Article in English | MEDLINE | ID: mdl-18834497

ABSTRACT

We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human and machine readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.
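One simple way to picture result aggregation across annotation servers is majority voting. The sketch below assumes each server returns a set of gene IDs for one abstract and keeps the IDs proposed by at least `min_votes` servers. The function name, the server names, and the sample IDs are hypothetical; this is not the BCMS API and not necessarily the aggregation scheme the platform uses.

```python
from collections import Counter
from typing import Dict, List, Set


def aggregate_by_vote(annotations_by_server: Dict[str, Set[str]], min_votes: int = 2) -> List[str]:
    """Keep annotations proposed by at least `min_votes` servers (assumed voting scheme)."""
    votes: Counter = Counter()
    for annotations in annotations_by_server.values():
        # Each server contributes at most one vote per annotation because it holds a set.
        votes.update(annotations)
    return sorted(a for a, count in votes.items() if count >= min_votes)


# Made-up gene IDs from three hypothetical annotation servers for one abstract.
example = {
    "server_a": {"GENE:1", "GENE:2"},
    "server_b": {"GENE:2", "GENE:3"},
    "server_c": {"GENE:2", "GENE:1"},
}
consensus = aggregate_by_vote(example, min_votes=2)
```

Raising `min_votes` trades recall for precision, since fewer but more widely supported annotations survive.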

Subjects

Biomedical Research/methods, Computational Biology/methods, Information Storage and Retrieval, Internet, Humans

Assisted curation: does text mining really help?

Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Roebuck, Stuart; Tobin, Richard; Wang, Xinglong.

Pac Symp Biocomput ; : 556-67, 2008.

Article in English | MEDLINE | ID: mdl-18229715

ABSTRACT

Although text mining shows considerable promise as a tool for supporting the curation of biomedical text, there is little concrete evidence as to its effectiveness. We report on three experiments measuring the extent to which curation can be speeded up with assistance from Natural Language Processing (NLP), together with subjective feedback from curators on the usability of a curation tool that integrates NLP hypotheses for protein-protein interactions (PPIs). In our curation scenario, we found that a maximum speed-up of 1/3 in curation time can be expected if NLP output is perfectly accurate. The preference of one curator for consistent NLP output and output with high recall needs to be confirmed in a larger study with several curators.

Subjects

Factual Databases, Information Storage and Retrieval, Natural Language Processing, Artificial Intelligence, Computational Biology, Protein Interaction Mapping/statistics & numerical data
