Results 1 - 2 of 2
1.
Bioinformatics ; 21 Suppl 2: ii268-9, 2005 Sep 01.
Article in English | MEDLINE | ID: mdl-16204117

ABSTRACT

UNLABELLED: Biologists routinely use Microsoft Office applications for standard analysis tasks. Despite ubiquitous internet resources, information needed for everyday work is often not directly and seamlessly available. Here we describe a very simple and easily extendable mechanism using Web Services to enrich standard MS Office applications with internet resources. We demonstrate its capabilities by providing a Web-based thesaurus for biological objects, which maps names to database identifiers and vice versa via an appropriate synonym list. The client application ProTag makes these features available in MS Office applications using Smart Tags and Add-Ins. AVAILABILITY: http://services.bio.ifi.lmu.de/prothesaurus/
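The thesaurus idea in the abstract, mapping names to database identifiers and back through a synonym list, can be illustrated with a minimal sketch. This is not the ProTag or ProThesaurus implementation: the `Thesaurus` class, its method names, the case-insensitive lookup, and the identifiers and names in `SYNONYMS` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Thesaurus:
    """Bidirectional lookup between identifiers and synonym names (illustrative only)."""

    def __init__(self, synonym_list):
        # synonym_list: dict mapping one identifier to a list of synonym names
        self._names_by_id = defaultdict(set)
        self._ids_by_name = defaultdict(set)
        for identifier, names in synonym_list.items():
            for name in names:
                self._names_by_id[identifier].add(name)
                # Assumption: names are compared case-insensitively
                self._ids_by_name[name.lower()].add(identifier)

    def ids_for(self, name):
        """Return all identifiers that list `name` as a synonym, sorted."""
        return sorted(self._ids_by_name.get(name.lower(), ()))

    def names_for(self, identifier):
        """Return all synonym names recorded for `identifier`, sorted."""
        return sorted(self._names_by_id.get(identifier, ()))


# Hypothetical entries; real synonym lists would come from public databases.
SYNONYMS = {
    "ID:0001": ["abc1", "alpha binding component 1"],
    "ID:0002": ["xyz", "abc1"],
}

thesaurus = Thesaurus(SYNONYMS)
print(thesaurus.ids_for("ABC1"))
print(thesaurus.names_for("ID:0001"))
```

A name may map to several identifiers, which is why both directions return lists rather than a single value.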


Subjects
Computational Biology/methods ; Databases, Factual ; Internet ; Natural Language Processing ; User-Computer Interface ; Vocabulary, Controlled ; Word Processing/methods ; Database Management Systems ; Information Storage and Retrieval/methods ; Terminology as Topic
2.
BMC Bioinformatics ; 6 Suppl 1: S15, 2005.
Article in English | MEDLINE | ID: mdl-15960827

ABSTRACT

BACKGROUND: Significant parts of biological knowledge are available only as unstructured text in articles of biomedical journals. By automatically identifying gene and gene product (protein) names and mapping these to unique database identifiers, it becomes possible to extract and integrate information from articles and various data sources. We present a simple and efficient approach that identifies gene and protein names in texts and returns database identifiers for matches. It has been evaluated in the recent BioCreAtIvE entity extraction and mention normalization task by an independent jury. METHODS: Our approach is based on the use of synonym lists that map the unique database identifier for each gene/protein to its different synonym names. For yeast and mouse, synonym lists were used as provided by the organizers, who generated them from public model organism databases. The synonym list for fly was generated directly from the corresponding organism database. The lists were then extensively curated in a largely automated procedure and matched against MEDLINE abstracts by exact text matching. Rule-based and support vector machine-based post-filters were designed and applied to improve precision. RESULTS: Our procedure showed high recall and precision, with F-measures of 0.897 for yeast and 0.764/0.773 for mouse in the BioCreAtIvE assessment (Task 1B) and 0.768 for fly in a post-evaluation. CONCLUSION: The results were close to the best across all submissions. Depending on the synonym properties, it can be crucial to consider context and to filter out erroneous matches. This is especially important for fly, which has a very challenging nomenclature for the protein name identification task. Here, the support vector machine-based post-filter proved to be very effective.
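The core of the METHODS step, exact text matching of a synonym list against an abstract, followed by scoring with an F-measure, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' system: the curation step and the rule-based and support vector machine-based post-filters are omitted, the function names are hypothetical, matching is assumed to be case-sensitive on word boundaries, the F-measure is assumed to be the balanced harmonic mean of precision and recall, and the identifiers, names, and text are invented.

```python
import re


def build_matcher(synonyms_by_id):
    """Compile one regex over all synonym names and map each name to its identifiers."""
    name_to_ids = {}
    for identifier, names in synonyms_by_id.items():
        for name in names:
            name_to_ids.setdefault(name, set()).add(identifier)
    # Longest names first, so a multi-word synonym wins over a shorter prefix.
    alternatives = sorted(name_to_ids, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(n) for n in alternatives) + r")(?!\w)"
    )
    return pattern, name_to_ids


def find_identifiers(text, pattern, name_to_ids):
    """Return the identifiers of every synonym that occurs verbatim in the text."""
    found = set()
    for match in pattern.finditer(text):
        found |= name_to_ids[match.group(0)]
    return found


def f_measure(predicted, gold):
    """Balanced F-measure: harmonic mean of precision and recall over identifier sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example data, for illustration only.
synonyms = {"G1": ["foo1", "foo protein 1"], "G2": ["bar2"]}
pattern, name_to_ids = build_matcher(synonyms)
predicted = find_identifiers(
    "Deletion of foo1 did not alter bar2 expression.", pattern, name_to_ids
)
print(sorted(predicted))
print(f_measure(predicted, {"G1", "G3"}))
```

Exact matching of this kind tends to favor recall; ambiguous synonyms that coincide with ordinary words are the cases a post-filter would need to remove to recover precision.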


Subjects
Pattern Recognition, Automated/methods ; Pattern Recognition, Automated/standards ; Proteins/classification ; Terminology as Topic ; Animals ; Databases, Factual/classification ; Drosophila ; Mice ; Proteins/genetics ; Saccharomyces cerevisiae/chemistry ; Saccharomyces cerevisiae/genetics