1.
BMC Genomics ; 10: 38, 2009 Jan 21.
Article in English | MEDLINE | ID: mdl-19159464

ABSTRACT

BACKGROUND: High throughput proteomics experiments are useful for analyzing the protein expression of an organism, identifying the correct gene structure of a genome, or locating possible post-translational modifications within proteins. High throughput methods necessitate publicly accessible and easily queried databases for efficiently and logically storing, displaying, and analyzing the large volume of data. DESCRIPTION: EPICDB is a publicly accessible, queryable, relational database that organizes and displays experimental, high throughput proteomics data for Toxoplasma gondii and Cryptosporidium parvum. Along with detailed information on mass spectrometry experiments, the database also provides antibody experimental results and analysis of functional annotations, comparative genomics, and aligned expressed sequence tag (EST) and genomic open reading frame (ORF) sequences. The database contains all available alternative gene datasets for each organism, which together comprise a complete theoretical proteome for the respective organism, and all data are referenced to these sequences. The database is structured around clusters of protein sequences, which allows for the evaluation of redundancy, protein prediction discrepancies, and possible splice variants. The database can be expanded to include genomes of other organisms for which proteome-wide experimental data are available. CONCLUSION: EPICDB is a comprehensive database of genome-wide T. gondii and C. parvum proteomics data and incorporates many features that allow for the analysis of the entire proteomes and/or annotation of specific protein sequences. EPICDB is complementary to other genomics databases of these organisms by offering complete mass spectrometry analysis on a comprehensive set of all available protein sequences.


Subject(s)
Cryptosporidium parvum/genetics , Databases, Protein , Proteomics , Toxoplasma/genetics , Animals , Database Management Systems , Expressed Sequence Tags , Genome, Protozoan , Open Reading Frames , Proteome/genetics , User-Computer Interface
2.
J Struct Funct Genomics ; 10(1): 95-9, 2009 Mar.
Article in English | MEDLINE | ID: mdl-18985440

ABSTRACT

Improvements in comparative protein structure modeling for cases of remote target-template sequence similarity are possible through the optimal combination of multiple template structures and by improving the quality of the target-template alignment. The recently developed MMM and M4T methods were designed to address these problems. Here we describe new developments in both the alignment generation and the template selection parts of the modeling algorithms. We set up a new scoring function in MMM to deliver more accurate target-template alignments. This was achieved by developing and incorporating into the composite scoring function a novel statistical pairwise potential that combines local and non-local terms. The non-local term of the statistical potential utilizes a shuffled reference state definition that helped to eliminate most of the false positive signal from the background distribution of pairwise contacts. The accuracy of the scoring function was further increased by using BLOSUM mutation table scores.


Subject(s)
Algorithms , Proteins/chemistry , Sequence Alignment/methods , Computational Biology/methods , Databases, Protein , Models, Molecular , Protein Conformation , Sequence Analysis, Protein/methods
3.
PLoS One ; 3(12): e3899, 2008.
Article in English | MEDLINE | ID: mdl-19065262

ABSTRACT

BACKGROUND: Toxoplasma gondii is an obligate intracellular protozoan that infects 20 to 90% of the population. It can cause both acute and chronic infections, many of which are asymptomatic, and, in immunocompromised hosts, can cause fatal infection due to reactivation from an asymptomatic chronic infection. An essential step towards understanding molecular mechanisms controlling transitions between the various life stages and identifying candidate drug targets is to accurately characterize the T. gondii proteome. METHODOLOGY/PRINCIPAL FINDINGS: We have explored the proteome of T. gondii tachyzoites with high throughput proteomics experiments and by comparison to publicly available cDNA sequence data. Mass spectrometry analysis validated 2,477 gene coding regions with 6,438 possible alternative gene predictions; approximately one third of the T. gondii proteome. The proteomics survey identified 609 proteins that are unique to Toxoplasma as compared to any known species including other Apicomplexan. Computational analysis identified 787 cases of possible gene duplication events and located at least 6,089 gene coding regions. Commonly used gene prediction algorithms produce very disparate sets of protein sequences, with pairwise overlaps ranging from 1.4% to 12%. Through this experimental and computational exercise we benchmarked gene prediction methods and observed false negative rates of 31 to 43%. CONCLUSIONS/SIGNIFICANCE: This study not only provides the largest proteomics exploration of the T. gondii proteome, but illustrates how high throughput proteomics experiments can elucidate correct gene structures in genomes.


Subject(s)
Computational Biology , Genes, Protozoan/genetics , Toxoplasma/genetics , Algorithms , Amino Acid Sequence , Animals , Cluster Analysis , Databases, Genetic , Expressed Sequence Tags , Genome/genetics , Molecular Sequence Data , Peptides/analysis , Peptides/chemistry , Proteome , Proteomics , Protozoan Proteins/analysis , Protozoan Proteins/chemistry , Protozoan Proteins/genetics , Reproducibility of Results , Sequence Homology, Amino Acid
4.
Bioinformatics ; 23(19): 2558-65, 2007 Oct 01.
Article in English | MEDLINE | ID: mdl-17823132

ABSTRACT

MOTIVATION: Two major bottlenecks in advancing comparative protein structure modeling are the efficient combination of multiple template structures and the generation of a correct input target-template alignment. RESULTS: A novel method, Multiple Mapping Method with Multiple Templates (M4T), is introduced that implements an algorithm to automatically select and combine Multiple Template structures (MT) and an alignment optimization protocol (Multiple Mapping Method, MMM). The MT module of M4T selects and combines multiple template structures through an iterative clustering approach that takes into account the 'unique' contribution of each template, their sequence similarity among themselves and to the target sequence, and their experimental resolution. MMM is a sequence-to-structure alignment method that optimally combines alternatively aligned regions according to their fit in the structural environment of the template structure. The resulting M4T alignment is used as input to a comparative modeling module. The performance of M4T has been benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set, and compared favorably with current state-of-the-art methods.


Subject(s)
Algorithms , Cluster Analysis , Models, Chemical , Pattern Recognition, Automated/methods , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Computer Simulation , Models, Molecular , Proteins/ultrastructure , Sequence Homology, Amino Acid
5.
Nucleic Acids Res ; 35(Web Server issue): W363-8, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17517764

ABSTRACT

Multiple Mapping Method with Multiple Templates (M4T) (http://www.fiserlab.org/servers/m4t) is a fully automated comparative protein structure modeling server. The novelty of M4T resides in two of its major modules, Multiple Templates (MT) and Multiple Mapping Method (MMM). The MT module of M4T selects and optimally combines the sequences of multiple template structures through an iterative clustering approach that takes into account the 'unique' contribution of each template, its sequence similarity to other template sequences and to the target sequences, and the quality of its experimental resolution. The MMM module is a sequence-to-structure alignment method that is aimed at improving the alignment accuracy, especially at lower sequence identity levels. The current implementation of MMM takes inputs from three profile-to-profile-based alignment methods and iteratively compares and ranks alternatively aligned regions according to their fit in the structural environment of the template structure. The performance of M4T was benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set and compared favorably with current state-of-the-art methods.


Subject(s)
Algorithms , Computational Biology/methods , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Computer Simulation , Internet , Models, Biological , Models, Theoretical , Molecular Sequence Data , Protein Structure, Secondary , Sequence Alignment
6.
Bioinformatics ; 22(21): 2691-2, 2006 Nov 01.
Article in English | MEDLINE | ID: mdl-16928737

ABSTRACT

MOTIVATION: Accurate alignment of a target sequence to a template structure continues to be a bottleneck in producing good quality comparative protein structure models. RESULTS: Multiple Mapping Method (MMM) is a comparative protein structure modeling server with an emphasis on a novel alignment optimization protocol. MMM takes inputs from five profile-to-profile based alignment methods. The alternatively aligned regions from the input alignment set are combined according to their fit in the structural environment of the template structure. The resulting, optimally spliced MMM alignment is used as input to an automated comparative modeling module to produce a full atom model. AVAILABILITY: The MMM server is freely accessible at http://www.fiserlab.org/servers/mmm


Subject(s)
Algorithms , Models, Molecular , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Computer Simulation , Internet , Molecular Sequence Data , Protein Structure, Secondary
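
Illustrative note: the MMM abstracts above describe combining alternatively aligned regions from several input alignments according to their fit to the template. The Python sketch below shows only that splicing idea in its simplest form. It is not the MMM implementation: the `Region` representation, the `splice_alignment` helper, and the `toy_fit` score (identities minus gaps) are hypothetical stand-ins, and the real method scores regions by their fit in the structural environment of the template rather than by sequence identity.

```python
from typing import Callable, Sequence

# One aligned region: (target_segment, template_segment), gaps written as '-'.
# This representation is an assumption made for the sketch.
Region = tuple[str, str]


def splice_alignment(
    regions: Sequence[Sequence[Region]],
    fit_score: Callable[[Region], float],
) -> Region:
    """For each region keep the candidate with the highest fit score, then join them."""
    target_parts, template_parts = [], []
    for candidates in regions:
        best = max(candidates, key=fit_score)
        target_parts.append(best[0])
        template_parts.append(best[1])
    return "".join(target_parts), "".join(template_parts)


def toy_fit(region: Region) -> float:
    """Hypothetical placeholder score: identical positions minus gap characters."""
    target, template = region
    matches = sum(a == b and a != "-" for a, b in zip(target, template))
    gaps = target.count("-") + template.count("-")
    return matches - gaps


regions = [
    [("MKV", "MKV")],                      # all input alignments agree here
    [("LAG-", "LAAG"), ("LA-G", "LAAG")],  # alternatively aligned region
    [("TT", "TT")],                        # all input alignments agree here
]
spliced = splice_alignment(regions, toy_fit)
print(spliced[0])
print(spliced[1])
```

In this toy input the second candidate of the middle region scores higher, so the spliced alignment keeps it and leaves the agreed regions unchanged. Any scoring function with the same signature could be passed in place of `toy_fit`.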