1.
Bioinformatics ; 21(9): 2049-58, 2005 May 01.
Article in English | MEDLINE | ID: mdl-15657104

ABSTRACT

MOTIVATION: The advent of high-throughput experiments in molecular biology creates a need for methods to efficiently extract and use information for large numbers of genes. Recently, the associative concept space (ACS) has been developed for the representation of information extracted from biomedical literature. The ACS is a Euclidean space in which thesaurus concepts are positioned and the distances between concepts indicate their relatedness. The ACS uses co-occurrence of concepts as a source of information. In this paper, we evaluate how well the system can retrieve functionally related genes and we compare its performance with a simple gene co-occurrence method. RESULTS: To assess the performance of the ACS, we composed a test set of five groups of functionally related genes. With the ACS, good scores were obtained for four of the five groups. When compared to the gene co-occurrence method, the ACS is capable of revealing more functional biological relations and can achieve results with less literature available per gene. Hierarchical clustering was performed on the ACS output, as a potential aid to users, and was found to provide useful clusters. Our results suggest that the algorithm can be of value for researchers studying large numbers of genes. AVAILABILITY: The ACS program is available upon request from the authors.
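
The abstract does not say how the "simple gene co-occurrence method" is computed. As an illustration only, the sketch below scores gene pairs by the Jaccard overlap of the documents that mention them. The scoring choice, the function name `cooccurrence_scores`, and the toy corpus are assumptions and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_scores(doc_genes):
    """Score gene pairs by Jaccard overlap of the documents mentioning them.

    doc_genes: iterable of sets of gene identifiers, one set per document.
    Returns a dict mapping (gene_a, gene_b) to a score in (0, 1].
    Jaccard is an assumed scoring choice, not the paper's stated method.
    """
    # Map each gene to the set of document indices in which it occurs.
    docs_per_gene = defaultdict(set)
    for doc_id, genes in enumerate(doc_genes):
        for gene in genes:
            docs_per_gene[gene].add(doc_id)

    scores = {}
    # Pairs are formed in alphabetical order, so keys are (smaller, larger).
    for a, b in combinations(sorted(docs_per_gene), 2):
        shared = docs_per_gene[a] & docs_per_gene[b]
        if shared:  # keep only pairs that co-occur at least once
            union = docs_per_gene[a] | docs_per_gene[b]
            scores[(a, b)] = len(shared) / len(union)
    return scores


# Made-up corpus: each set holds the genes mentioned in one document.
docs = [{"TP53", "MDM2"}, {"TP53", "MDM2", "BRCA1"}, {"BRCA1"}]
scores = cooccurrence_scores(docs)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A score of this kind depends on two genes being mentioned in the same documents, which is consistent with the abstract's observation that the co-occurrence baseline needs more literature per gene than the ACS.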


Subject(s)
Database Management Systems , Information Storage and Retrieval/methods , Natural Language Processing , Periodicals as Topic , Protein Interaction Mapping/methods , Proteins/classification , Proteins/metabolism , PubMed , Artificial Intelligence , Meta-Analysis as Topic , Vocabulary, Controlled
2.
Bioinformatics ; 20(16): 2597-604, 2004 Nov 01.
Article in English | MEDLINE | ID: mdl-15130936

ABSTRACT

MOTIVATION: Full-text documents potentially hold more information than their abstracts, but require more resources for processing. We investigated the added value of full text over abstracts in terms of information content and occurrences of gene symbol–gene name combinations that can resolve gene-symbol ambiguity. RESULTS: We analyzed a set of 3902 biomedical full-text articles. Different keyword measures indicate that information density is highest in abstracts, but that the information coverage in full texts is much greater than in abstracts. Analysis of five different standard sections of articles shows that the highest information coverage is located in the results section. Still, 30-40% of the information mentioned in each section is unique to that section. Only 30% of the gene symbols in the abstract are accompanied by their corresponding names, and a further 8% of the gene names are found in the full text. In the full text, only 18% of the gene symbols are accompanied by their gene names.


Subject(s)
Abstracting and Indexing/methods , Abstracting and Indexing/standards , Biomedical Research/statistics & numerical data , Genes , Information Storage and Retrieval/methods , Natural Language Processing , Periodicals as Topic/statistics & numerical data , Bibliometrics , Information Dissemination/methods , MEDLINE/statistics & numerical data , Terminology as Topic