Search | BVS Regional Portal

The proteogenomic mapping tool.

Sanders, William S; Wang, Nan; Bridges, Susan M; Malone, Brandon M; Dandass, Yoginder S; McCarthy, Fiona M; Nanduri, Bindu; Lawrence, Mark L; Burgess, Shane C.

BMC Bioinformatics ; 12: 115, 2011 Apr 22.

Article in English | MEDLINE | ID: mdl-21513508

ABSTRACT

BACKGROUND: High-throughput mass spectrometry (MS) proteomics data is increasingly being used to complement traditional structural genome annotation methods. To keep pace with the high speed of experimental data generation and to aid in structural genome annotation, experimentally observed peptides need to be mapped back to their source genome location quickly and exactly. Until now, the tools for this have been limited to custom scripts that individual research groups designed to analyze their own data; such scripts are generally not widely available and do not scale well to large eukaryotic genomes. RESULTS: The Proteogenomic Mapping Tool includes a Java implementation of the Aho-Corasick string searching algorithm that takes standardized file types as input and rapidly searches experimentally observed peptides against a given genome translated in all 6 reading frames for exact matches. The Java implementation allows the application to scale well with larger eukaryotic genomes while providing cross-platform functionality. CONCLUSIONS: The Proteogenomic Mapping Tool provides a standalone application for mapping peptides back to their source genome on a number of operating system platforms with standard desktop computer hardware and executes very rapidly for a variety of datasets. The option to select different genetic codes for different organisms lets researchers easily customize the tool to their own research interests. The tool is recommended for anyone working to structurally annotate genomes using MS-derived proteomics data.
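The mapping step described in this abstract can be illustrated with a short Python sketch. It is not the tool's Java code: the function names (`six_frame_translations`, `build_automaton`, `find_all`, `map_peptides`) are hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and file parsing is omitted. The sketch translates a DNA string in all six reading frames and uses an Aho-Corasick automaton to find exact peptide matches, reporting 0-based, end-exclusive forward-strand coordinates.

```python
from collections import deque
from itertools import product

# Assumption: standard genetic code (NCBI table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(dna):
    """Yield (frame, protein) for frames +1..+3 and -1..-3; unknown codons become 'X'."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    for sign, strand in ((1, dna), (-1, reverse)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            yield sign * (offset + 1), "".join(CODON_TABLE.get(c, "X") for c in codons)


def build_automaton(peptides):
    """Build Aho-Corasick goto, failure, and output tables for non-empty peptides."""
    goto, fail, out = [{}], [0], [[]]
    for pep in peptides:
        state = 0
        for ch in pep:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pep)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending at a suffix
    return goto, fail, out


def find_all(text, automaton):
    """Yield (start_index, peptide) for every exact occurrence in text."""
    goto, fail, out = automaton
    state = 0
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pep in out[state]:
            yield pos - len(pep) + 1, pep


def map_peptides(dna, peptides):
    """Return (peptide, frame, start, end) tuples in forward-strand coordinates."""
    automaton = build_automaton(peptides)
    hits = []
    for frame, protein in six_frame_translations(dna):
        for aa_start, pep in find_all(protein, automaton):
            offset = abs(frame) - 1 + 3 * aa_start  # offset on the translated strand
            span = 3 * len(pep)
            start = offset if frame > 0 else len(dna) - offset - span
            hits.append((pep, frame, start, start + span))
    return hits


if __name__ == "__main__":
    for hit in map_peptides("ATGGCCAAATAG", ["MAK", "FGH"]):
        print(hit)
```

Because the automaton is built once and each translated frame is scanned in a single pass, search time grows with genome length plus the number of matches rather than with the product of genome length and peptide count.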

Subjects

Molecular Sequence Annotation/methods; Peptides/genetics; Algorithms; Codon; Genomics/methods; Mass Spectrometry/methods; Protein Biosynthesis; Proteomics/methods; Software

Complete genome and comparative analysis of the chemolithoautotrophic bacterium Oligotropha carboxidovorans OM5.

Paul, Debarati; Bridges, Susan M; Burgess, Shane C; Dandass, Yoginder S; Lawrence, Mark L.

BMC Genomics ; 11: 511, 2010 Sep 23.

Article in English | MEDLINE | ID: mdl-20863402

ABSTRACT

BACKGROUND: Oligotropha carboxidovorans OM5(T) (DSM 1227, ATCC 49405) is a chemolithoautotrophic bacterium capable of utilizing CO (carbon monoxide) and fixing CO2 (carbon dioxide). We previously published the draft genome of this organism and recently submitted the complete genome sequence to GenBank. RESULTS: The genome sequence of the chemolithoautotrophic bacterium Oligotropha carboxidovorans OM5 consists of a 3.74-Mb chromosome and a 133-kb megaplasmid that contains the genes responsible for utilization of carbon monoxide, carbon dioxide, and hydrogen. To our knowledge, this strain is the first one to be sequenced in the genus Oligotropha, the closest fully sequenced relatives being Bradyrhizobium sp. BTAi and USDA110 and Nitrobacter hamburgensis X14. Analysis of the O. carboxidovorans genome reveals potential links between plasmid-encoded chemolithoautotrophy and chromosomally encoded lipid metabolism. Comparative analysis of O. carboxidovorans with closely related species revealed differences in metabolic pathways, particularly in carbohydrate and lipid metabolism, as well as transport pathways. CONCLUSION: Oligotropha, Bradyrhizobium sp., and Nitrobacter hamburgensis X14 are phylogenetically proximal. Although there is significant conservation of genome organization between the species, there are major differences in many metabolic pathways that reflect the adaptive strategies unique to each species.

Subjects

Bradyrhizobiaceae/genetics; Chemoautotrophic Growth/genetics; Genome, Bacterial/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Bradyrhizobiaceae/enzymology; Carbohydrate Metabolism/genetics; DNA, Circular/genetics; Energy Metabolism/genetics; Extrachromosomal Inheritance/genetics; Fatty Acids/biosynthesis; Genomics; Interspersed Repetitive Sequences/genetics; Metabolic Networks and Pathways/genetics; Oxidation-Reduction; Phylogeny; Protein Binding; Protein Transport; RNA, Ribosomal, 16S/genetics; RNA, Untranslated/genetics; Sequence Homology, Amino Acid; Synteny/genetics

Genome sequence of the solvent-producing bacterium Clostridium carboxidivorans strain P7T.

Paul, Debarati; Austin, Frank W; Arick, Tony; Bridges, Susan M; Burgess, Shane C; Dandass, Yoginder S; Lawrence, Mark L.

J Bacteriol ; 192(20): 5554-5, 2010 Oct.

Article in English | MEDLINE | ID: mdl-20729368

ABSTRACT

Clostridium carboxidivorans strain P7(T) is a strictly anaerobic acetogenic bacterium that produces acetate, ethanol, butanol, and butyrate. The C. carboxidivorans genome contains all the genes for the carbonyl branch of the Wood-Ljungdahl pathway for CO2 fixation, and it encodes enzymes for conversion of acetyl coenzyme A into butanol and butyrate.

Subjects

Clostridium/genetics; Genome, Bacterial; Clostridium/classification; DNA, Bacterial/genetics; Molecular Sequence Data

Accelerating string set matching in FPGA hardware for bioinformatics research.

Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M.

BMC Bioinformatics ; 9: 197, 2008 Apr 15.

Article in English | MEDLINE | ID: mdl-18412963

ABSTRACT

BACKGROUND: This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case study. The Aho-Corasick algorithm is adapted for execution in field-programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). RESULTS: This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place-and-route delays associated with complex digital designs. CONCLUSION: Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation.
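The bit-split idea in this abstract can be sketched in software. This is a Python illustration under stated assumptions, not the authors' FPGA design: the 5-bit symbol encoding in `ALPHABET`, the function names, and the use of an integer bitmask as the partial match vector are all hypothetical choices. Five binary state machines each read one bit of every encoded amino acid, and a peptide is reported only where all five machines agree.

```python
from collections import deque

# Assumption: an arbitrary 5-bit code for 20 amino acids plus stop '*' and unknown 'X'.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY*X"
CODE = {aa: i for i, aa in enumerate(ALPHABET)}


def bit_slice(seq, bit):
    """Project a protein string onto one bit of its 5-bit codes (unknown symbols map to 'X')."""
    return [(CODE.get(aa, CODE["X"]) >> bit) & 1 for aa in seq]


def build_binary_fsm(slices):
    """Aho-Corasick DFA over 0/1 input; vec[state] is a bitmask of candidate peptide ids."""
    goto, fail, vec = [[-1, -1]], [0], [0]
    for pid, bits in enumerate(slices):
        state = 0
        for b in bits:
            if goto[state][b] < 0:
                goto.append([-1, -1])
                fail.append(0)
                vec.append(0)
                goto[state][b] = len(goto) - 1
            state = goto[state][b]
        vec[state] |= 1 << pid
    queue = deque()
    for b in (0, 1):
        if goto[0][b] < 0:
            goto[0][b] = 0  # missing root transitions loop back to the root
        else:
            queue.append(goto[0][b])
    while queue:
        state = queue.popleft()
        for b in (0, 1):
            nxt = goto[state][b]
            if nxt < 0:
                goto[state][b] = goto[fail[state]][b]  # complete the DFA via the failure link
            else:
                fail[nxt] = goto[fail[state]][b]
                vec[nxt] |= vec[fail[nxt]]  # inherit candidates that end at a suffix
                queue.append(nxt)
    return goto, vec


def bit_split_search(text, peptides):
    """Return (start_index, peptide) for each exact match, using five one-bit FSMs."""
    fsms = [build_binary_fsm([bit_slice(p, bit) for p in peptides]) for bit in range(5)]
    streams = [bit_slice(text, bit) for bit in range(5)]
    states = [0] * 5
    hits = []
    for pos in range(len(text)):
        matched = (1 << len(peptides)) - 1
        for k, (goto, vec) in enumerate(fsms):
            states[k] = goto[states[k]][streams[k][pos]]
            matched &= vec[states[k]]  # a peptide survives only if every bit FSM accepts it
        for pid, pep in enumerate(peptides):
            if (matched >> pid) & 1:
                hits.append((pos - len(pep) + 1, pep))
    return hits


if __name__ == "__main__":
    print(bit_split_search("MAKWMAKE", ["MAK", "AKW", "KE"]))
```

Each binary machine has only two outgoing transitions per state, which is what makes a compact table layout in on-chip RAM plausible. The sketch does not model the limit of 20 peptides per FSM group or any hardware timing.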

Subjects

Computers; Neural Networks, Computer; Proteomics/instrumentation; Algorithms; Equipment Design; Gene Expression Profiling/instrumentation; Gene Expression Profiling/methods; Information Storage and Retrieval/methods; Logic; Models, Theoretical; Open Reading Frames; Proteome/analysis; Proteomics/methods; Sequence Alignment/instrumentation; Sequence Alignment/methods; Sequence Analysis, Protein/instrumentation; Sequence Analysis, Protein/methods
