Results 1 - 8 of 8
1.
BMC Bioinformatics ; 13: 16, 2012 Jan 26.
Article in English | MEDLINE | ID: mdl-22280404

ABSTRACT

BACKGROUND: Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify, from all published literature, the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest, and is thus usually time-consuming. We developed an automatic method for identifying papers containing these curation data types among a large pool of published scientific papers, based on the machine learning method Support Vector Machine (SVM). This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in production use for automatic categorization of 10 different experimental data types in the biocuration process at WormBase for the past two years, and it is being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community, thereby greatly reducing the time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure to utilize training papers of similar data types from different bodies of literature, such as C. elegans and D. melanogaster, to identify papers with any of these data types for a single database. This approach has great significance because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance.

RESULTS: We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genome Informatics (MGI). It is being used in the curation workflow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction.

CONCLUSIONS: Our methods are applicable to a variety of data types with training sets containing several hundred to a few thousand documents. The approach is completely automatic and thus can be readily incorporated into different workflows at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort.
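As a rough illustration of the idea in this abstract (not the authors' implementation), the sketch below trains a linear SVM on bag-of-words features and pools training papers from two bodies of literature before classifying new text. Everything specific here is an assumption made for illustration: the toy one-sentence "papers", the names `WORM_CORPUS`, `FLY_CORPUS`, `train_classifier` and `classify`, the whitespace tokenizer, and the choice of plain subgradient descent on the hinge loss with no bias term.

```python
from collections import Counter

# Hypothetical toy "papers" (one sentence each), NOT data from the study.
# Label +1 = reports the data type of interest (here RNAi), -1 = does not.
WORM_CORPUS = [
    ("rnai knockdown caused embryonic lethality", 1),
    ("crystal structure of the protein was solved", -1),
]
FLY_CORPUS = [
    ("rnai screen identified wing defects", 1),
    ("protein structure determined by crystallography", -1),
]


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM (no bias term) via subgradient descent on the
    L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Regularization step: shrink all weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only examples inside the margin push the weights.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def train_classifier(corpus):
    """Build the vocabulary from the training texts and fit the weights."""
    vocab = sorted({tok for text, _ in corpus for tok in text.lower().split()})
    X = [featurize(text, vocab) for text, _ in corpus]
    y = [label for _, label in corpus]
    return vocab, train_linear_svm(X, y)


def classify(text, vocab, w):
    """Return +1 if the text is predicted to contain the data type, else -1."""
    score = sum(wj * xj for wj, xj in zip(w, featurize(text, vocab)))
    return 1 if score > 0 else -1


# Pool training papers from both bodies of literature, then classify new text.
vocab, weights = train_classifier(WORM_CORPUS + FLY_CORPUS)
print(classify("rnai screen in larvae", vocab, weights))
print(classify("structure of the protein", vocab, weights))
```

Pooling is simply list concatenation here, which mirrors the abstract's point that a single corpus may hold too few training papers for a low-occurrence data type. A production system would use full-text documents, a tuned SVM library, and held-out evaluation.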


Subject(s)
Artificial Intelligence , Databases, Factual , Databases, Genetic , Animals , Automation , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Genomics , Mice/genetics , Publications , Support Vector Machine
2.
BMC Bioinformatics ; 12: 32, 2011 Jan 24.
Article in English | MEDLINE | ID: mdl-21261995

ABSTRACT

BACKGROUND: Caenorhabditis elegans gene-based phenotype information dates back to the 1970s, beginning with Sydney Brenner and the characterization of behavioral and morphological mutant alleles via classical genetics in order to understand nervous system function. Since then, C. elegans has become an important genetic model system for the study of basic biological and biomedical principles, largely through the use of phenotype analysis. Because of the growth of C. elegans as a genetically tractable model organism and the development of large-scale analyses, there has been a significant increase in phenotype data that need to be managed and made accessible to the research community. To do so, a standardized vocabulary is necessary to integrate phenotype data from diverse sources, permit integration with other data types and render the data in a computable form.

RESULTS: We describe a hierarchically structured, controlled vocabulary of terms that can be used to standardize phenotype descriptions in C. elegans, namely the Worm Phenotype Ontology (WPO). The WPO currently comprises 1,880 phenotype terms, 74% of which have been used in the annotation of phenotypes associated with more than 18,000 C. elegans genes. The scope of the WPO is not limited to C. elegans biology; rather, it is designed to also incorporate phenotypes observed in related nematode species. We have enriched the value of the WPO by integrating it with other ontologies, thereby increasing the accessibility of worm phenotypes to non-nematode biologists. We are actively developing the WPO to continue to fulfill the evolving needs of the scientific community and hope to engage researchers in this crucial endeavor.

CONCLUSIONS: We provide a phenotype ontology (WPO) that will help to facilitate data retrieval and cross-species comparisons within the nematode community. In the larger scientific community, the WPO will permit data integration and interoperability across the different Model Organism Databases (MODs) and other biological databases. This standardized phenotype ontology will therefore allow for more complex data queries and enhance bioinformatic analyses.
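To make concrete why a hierarchically structured vocabulary allows more complex queries, the following minimal sketch stores parent links between terms and retrieves every gene annotated to a term or to any of its more specific descendants. The term names, gene names, and the functions `ancestors` and `genes_for` are hypothetical placeholders chosen for illustration; they are not actual WPO terms or WormBase annotations.

```python
# Hypothetical toy hierarchy: each term maps to its parent terms (is_a links).
PARENTS = {
    "locomotion variant": ["behavior variant"],
    "egg laying variant": ["behavior variant"],
    "behavior variant": ["phenotype"],
    "body morphology variant": ["phenotype"],
    "phenotype": [],
}

# Hypothetical (gene, term) annotation pairs.
ANNOTATIONS = [
    ("gene-a", "locomotion variant"),
    ("gene-b", "egg laying variant"),
    ("gene-c", "body morphology variant"),
]


def ancestors(term, parents=PARENTS):
    """All terms reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def genes_for(term, annotations=ANNOTATIONS, parents=PARENTS):
    """Genes annotated to `term` itself or to any more specific term."""
    return {
        gene
        for gene, annotated in annotations
        if annotated == term or term in ancestors(annotated, parents)
    }


print(sorted(genes_for("behavior variant")))
```

A query on a general term picks up genes that curators annotated only with specific child terms, which a flat keyword list could not do.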


Subject(s)
Caenorhabditis elegans/genetics , Information Storage and Retrieval/standards , Phenotype , Terminology as Topic , Animals , Vocabulary, Controlled
3.
Nucleic Acids Res ; 38(Database issue): D463-7, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19910365

ABSTRACT

WormBase (http://www.wormbase.org) is a central data repository for nematode biology. Initially created as a service to the Caenorhabditis elegans research field, WormBase has evolved into a powerful research tool in its own right. In the past 2 years, we expanded WormBase to include the complete genomic sequence, gene predictions and orthology assignments from a range of related nematodes. These comparative data enrich the C. elegans data with improved gene predictions and a better understanding of gene function. In turn, they bring the wealth of experimental knowledge of C. elegans to other systems of medical and agricultural importance. Here, we describe new species and data types now available at WormBase. In addition, we detail enhancements to our curatorial pipeline and website infrastructure to accommodate new genomes and an extensive user base.


Subject(s)
Caenorhabditis elegans/genetics , Caenorhabditis/genetics , Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Alleles , Animals , Computational Biology/trends , Databases, Protein , Information Storage and Retrieval/methods , Internet , Phenotype , Protein Structure, Tertiary , Software , Transcription Factors
4.
Proc Natl Acad Sci U S A ; 105(51): 20095-9, 2008 Dec 23.
Article in English | MEDLINE | ID: mdl-19104047

ABSTRACT

The Caenorhabditis elegans vulva is an elegant model for dissecting a gene regulatory network (GRN) that directs postembryonic organogenesis. The mature vulva comprises seven cell types (vulA, vulB1, vulB2, vulC, vulD, vulE, and vulF), each with its own unique pattern of spatial and temporal gene expression. The mechanisms that specify these cell types in a precise spatial pattern are not well understood. Using reverse genetic screens, we identified novel components of the vulval GRN, including nhr-113 in vulA. Several transcription factors (lin-11, lin-29, cog-1, egl-38, and nhr-67) interact with each other and act in concert to regulate target gene expression in the diverse vulval cell types. For example, egl-38 (Pax2/5/8) stabilizes the vulF fate by positively regulating vulF characteristics and by inhibiting characteristics associated with the neighboring vulE cells. nhr-67 and egl-38 regulate cog-1, helping restrict its expression to vulE. Computational approaches have been successfully used to identify functional cis-regulatory motifs in the zmp-1 (zinc metalloproteinase) promoter. These results provide an overview of the regulatory network architecture for each vulval cell type.


Subject(s)
Caenorhabditis elegans/embryology , Gene Regulatory Networks/physiology , Organogenesis/genetics , Vulva/growth & development , Animals , Caenorhabditis elegans/genetics , Embryo, Nonmammalian , Embryonic Induction , Enhancer Elements, Genetic , Female , Genes, Helminth , Metalloproteases/genetics , Transcription Factors , Vulva/embryology
5.
Nucleic Acids Res ; 36(Database issue): D612-7, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17991679

ABSTRACT

WormBase (www.wormbase.org) is the major publicly available database of information about Caenorhabditis elegans, an important system for basic biological and biomedical research. Derived from the initial ACeDB database of C. elegans genetic and sequence information, WormBase now includes genomic, anatomical and functional information about C. elegans, other Caenorhabditis species and other nematodes. As such, it is a crucial resource not only for C. elegans biologists but also for the larger biomedical and bioinformatics communities. Coverage of core areas of C. elegans biology will allow the biomedical community to make full use of the results of intensive molecular genetic analysis and functional genomic studies of this organism. Improved search and display tools, wider cross-species comparisons and extended ontologies are some of the features that will help scientists extend their research and take advantage of other nematode species genome sequences.


Subject(s)
Caenorhabditis elegans/genetics , Databases, Genetic , Genome, Helminth , Animals , Caenorhabditis elegans/metabolism , Chromosome Mapping , Gene Expression , Gene Regulatory Networks , Genes, Helminth , Genomics , Internet , Mass Spectrometry , Peptides/chemistry , Phenotype , User-Computer Interface
6.
PLoS Genet ; 3(4): e69, 2007 Apr 27.
Article in English | MEDLINE | ID: mdl-17465684

ABSTRACT

Regulation of spatio-temporal gene expression in diverse cell and tissue types is a critical aspect of development. Progression through Caenorhabditis elegans vulval development leads to the generation of seven distinct vulval cell types (vulA, vulB1, vulB2, vulC, vulD, vulE, and vulF), each with its own unique gene expression profile. The mechanisms that establish the precise spatial patterning of these mature cell types are largely unknown. Dissection of the gene regulatory networks involved in vulval patterning and differentiation would help us understand how cells generate a spatially defined pattern of cell fates during organogenesis. We disrupted the activity of 508 transcription factors via RNAi and assayed the expression of ceh-2, a marker for vulB fate during the L4 stage. From this screen, we identified the tailless ortholog nhr-67 as a novel regulator of gene expression in multiple vulval cell types. We find that one way in which nhr-67 maintains cell identity is by restricting inappropriate cell fusion events in specific vulval cells, namely vulE and vulF. nhr-67 exhibits a dynamic expression pattern in the vulval cells and interacts with three other transcriptional regulators cog-1 (Nkx6.1/6.2), lin-11 (LIM), and egl-38 (Pax2/5/8) to generate the composite expression patterns of their downstream targets. We provide evidence that egl-38 regulates gene expression in vulB1, vulC, vulD, vulE, as well as vulF cells. We demonstrate that the pairwise interactions between these regulatory genes are complex and vary among the seven cell types. We also discovered a striking regulatory circuit that affects a subset of the vulval lineages: cog-1 and nhr-67 inhibit both one another and themselves. We postulate that the differential levels and combinatorial patterns of lin-11, cog-1, and nhr-67 expression are a part of a regulatory code for the mature vulval cell types.


Subject(s)
Caenorhabditis elegans Proteins/physiology , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Gene Expression Regulation, Developmental , Receptors, Cytoplasmic and Nuclear/physiology , Vulva/embryology , Animals , Animals, Genetically Modified , Caenorhabditis elegans/embryology , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Cell Fusion , Cell Lineage/genetics , Embryo, Nonmammalian , Female , Gene Regulatory Networks/physiology , Homeodomain Proteins/genetics , Intercellular Signaling Peptides and Proteins/genetics , Morphogenesis/genetics , Receptors, Cytoplasmic and Nuclear/genetics , Receptors, Cytoplasmic and Nuclear/metabolism , Vulva/cytology , Vulva/metabolism
7.
Proc Natl Acad Sci U S A ; 102(14): 4972-7, 2005 Apr 05.
Article in English | MEDLINE | ID: mdl-15749820

ABSTRACT

The vulval development of Caenorhabditis elegans provides an opportunity to investigate genetic networks that control gene expression during organogenesis. During the fourth larval stage (L4), seven vulval cell types are produced, each of which executes a distinct gene expression program. We analyze how the expression of cell-type-specific genes is regulated. Ras and Wnt signaling pathways play major roles in generating the spatial pattern of cell types and regulate gene expression through a network of transcription factors. One transcription factor (lin-29) primarily controls the temporal expression pattern. Other transcription factors (lin-11, cog-1, and egl-38) act in combination to control cell-type-specific gene expression. The complexity of the network arises in part because of the dynamic nature of gene expression, in part because of the presence of seven cell types, and also because there are multiple regulatory paths for gene expression within each cell type.


Subject(s)
Caenorhabditis elegans/growth & development , Caenorhabditis elegans/genetics , Vulva/growth & development , Animals , Body Patterning/genetics , Caenorhabditis elegans/cytology , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Cell Communication , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Female , Gene Expression Regulation, Developmental , Genes, Helminth , Genes, Homeobox , Genetic Complementation Test , Intercellular Signaling Peptides and Proteins/genetics , Intercellular Signaling Peptides and Proteins/metabolism , Signal Transduction , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription, Genetic , Vulva/cytology , Vulva/metabolism
8.
Hum Mol Genet ; 12(20): 2669-78, 2003 Oct 15.
Article in English | MEDLINE | ID: mdl-12944419

ABSTRACT

A key event in the pathogenesis of Alzheimer's disease (AD) is the deposition of senile plaques consisting largely of a peptide known as beta-amyloid (Abeta) that is derived from the amyloid precursor protein (APP). A proteolytic activity called gamma-secretase cleaves APP in the transmembrane domain and is required for Abeta generation. Aberrant gamma-secretase cleavage of APP underlies the majority of early onset, familial AD. gamma-Secretase resides in a large multi-protein complex, of which Presenilin, Nicastrin, APH-1 and PEN-2 are four essential components. Thus, identifying components and pathways by which the gamma-secretase activity is regulated is crucial to understanding the mechanisms underlying AD pathogenesis, and may provide new diagnostic tools and therapeutic targets. Here we describe the generation of Drosophila that act as living reporters of gamma-secretase activity in the fly eye. In these reporter flies the size of the eye correlates with the level of endogenous gamma-secretase activity, and is very sensitive to the levels of three genes required for APP gamma-secretase activity, presenilin, nicastrin and aph-1. Thus, these flies provide a sensitized system with which to identify other components of the gamma-secretase complex and regulators of its activity. We have used these flies to carry out a screen for mutations that suppress gamma-secretase activity and have identified a small chromosomal region that contains a gene or genes whose products may promote gamma-secretase activity.


Subject(s)
Amyloid beta-Protein Precursor/genetics , Drosophila/genetics , Genes, Reporter , Amyloid Precursor Protein Secretases , Animals , Drosophila/enzymology , Drosophila/metabolism , Endopeptidases/metabolism , Microscopy, Electron, Scanning , Models, Biological , Models, Genetic , Mutation , Phenotype , Photoreceptor Cells, Invertebrate/pathology , Protein Binding , Protein Structure, Tertiary , Transgenes