Results 1 - 20 of 72
1.
Database (Oxford) ; 2023, 2023 02 16.
Article in English | MEDLINE | ID: mdl-36794865

ABSTRACT

The power of next-generation sequencing has resulted in an explosive growth in the number of projects aiming to understand the metagenomic diversity of complex microbial environments. The interdisciplinary nature of this microbiome research community, along with the absence of reporting standards for microbiome data and samples, poses a significant challenge for follow-up studies. Commonly used names of metagenomes and metatranscriptomes in public databases currently lack the essential information necessary to accurately describe and classify the underlying samples, which makes a comparative analysis difficult to conduct and often results in misclassified sequences in data repositories. The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute has been at the forefront of addressing this challenge by developing a standardized nomenclature system for naming microbiome samples. GOLD, currently marking its twenty-fifth anniversary, continues to enrich the research community with hundreds of thousands of metagenomes and metatranscriptomes with well-curated and easy-to-understand names. Through this manuscript, we describe the overall naming process that can be easily adopted by researchers worldwide. Additionally, we propose the use of this naming system as a best practice for the scientific community to facilitate better interoperability and reusability of microbiome data.


Subjects
Microbiota , Software , Microbiota/genetics , Metagenome/genetics , Metagenomics/methods , Data Management
2.
Nucleic Acids Res ; 51(D1): D957-D963, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36318257

ABSTRACT

The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute (DOE-JGI) continues to maintain its role as one of the flagship genomic metadata repositories of the world. The ever-increasing number of projects and metadata are freely available to the user community world-wide. GOLD's metadata is consumed by scientists and remains an important source for large-scale comparative genomics analysis initiatives. Encouraged by this active user engagement and growth, GOLD has continued to add new components and capabilities. The new features such as a public Application Programming Interface (API) and Ecosystem landing page as well as the growth of different entities in this current GOLD v.9 edition are described in detail in this manuscript.


Subjects
Databases, Genetic , Genomics , Genome , Software
3.
Nucleic Acids Res ; 51(D1): D723-D732, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382399

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI Genbank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. In the past couple of years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).


Subjects
Data Management , Genomics , Genome, Bacterial , Software , Genome, Archaeal , Databases, Genetic , Metagenome
4.
Nucleic Acids Res ; 49(D1): D723-D733, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33152092

ABSTRACT

The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, which include 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, is described in this manuscript.


Subjects
Databases, Genetic , Genome , Ecosystem , Gene Ontology , Search Engine , Sequence Analysis, DNA
5.
Microbiol Resour Announc ; 8(14), 2019 Apr 04.
Article in English | MEDLINE | ID: mdl-30948472

ABSTRACT

Asinibacterium sp. strains OR43 and OR53 belong to the phylum Bacteroidetes and were isolated from subsurface sediments in Oak Ridge, TN. Both strains grow at elevated levels of heavy metals. Here, we present the closed genome sequence of Asinibacterium sp. strain OR53 and the draft genome sequence of Asinibacterium sp. strain OR43.

6.
Nucleic Acids Res ; 47(D1): D649-D659, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357420

ABSTRACT

The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is an open online resource, which maintains an up-to-date catalog of genome and metagenome projects in the context of a comprehensive list of associated metadata. Information in GOLD is organized into four levels: Study, Biosample/Organism, Sequencing Project and Analysis Project. Currently GOLD hosts information on 33 415 Studies, 49 826 Biosamples, 313 324 Organisms, 215 881 Sequencing Projects and 174 454 Analysis Projects with a total of 541 metadata fields, of which 80 are based on controlled vocabulary (CV) terms. GOLD provides a user-friendly web interface to browse sequencing projects and launch advanced search tools across four classification levels. Users submit metadata on a wide range of Sequencing and Analysis Projects in GOLD before depositing sequence data to the Integrated Microbial Genomes (IMG) system for analysis. GOLD conforms with and supports the rules set by the Genomic Standards Consortium (GSC) Minimum Information standards. The current version of GOLD (v.7) has seen the number of projects and associated metadata increase exponentially over the years. This paper provides an update on the current status of GOLD and highlights the new features added over the last two years.


Subjects
Databases, Genetic/standards , Genomics/methods , Software/standards , Gene Ontology
7.
Nucleic Acids Res ; 45(D1): D446-D456, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794040

ABSTRACT

The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is a manually curated data management system that catalogs sequencing projects with associated metadata from around the world. In the current version of GOLD (v.6), all projects are organized based on a four-level classification system in the form of a Study, Organism (for isolates) or Biosample (for environmental samples), Sequencing Project and Analysis Project. Currently, GOLD provides information for 26 117 Studies, 239 100 Organisms, 15 887 Biosamples, 97 212 Sequencing Projects and 78 579 Analysis Projects. These are integrated with over 312 metadata fields, of which 58 are controlled vocabularies with 2067 terms. The web interface facilitates submission of a diverse range of Sequencing Projects (such as isolate genome, single-cell genome, metagenome, metatranscriptome) and complex Analysis Projects (such as genome from metagenome, or combined assembly from multiple Sequencing Projects). GOLD provides a seamless interface with the Integrated Microbial Genomes (IMG) system and supports and promotes the Genomic Standards Consortium (GSC) Minimum Information standards. This paper describes the data updates and additional features added during the last two years.


Subjects
Computational Biology/methods , Databases, Nucleic Acid , Genome , Genomics/methods , Data Mining , Metagenome , Metagenomics/methods , Software , User-Computer Interface
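
The four-level organization described in the abstract above (Study; Organism or Biosample; Sequencing Project; Analysis Project) can be pictured with a small data model. This is only an illustrative sketch: the class and field names below are assumptions made for the example and do not reflect GOLD's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Organism:
    """Source of an isolate sequencing project (hypothetical model)."""
    name: str


@dataclass
class Biosample:
    """Source of an environmental sequencing project (hypothetical model)."""
    name: str


@dataclass
class SequencingProject:
    name: str
    source: Union[Organism, Biosample]


@dataclass
class AnalysisProject:
    name: str
    # A combined assembly may draw on several Sequencing Projects.
    sequencing_projects: List[SequencingProject]


@dataclass
class Study:
    name: str
    sequencing_projects: List[SequencingProject] = field(default_factory=list)
    analysis_projects: List[AnalysisProject] = field(default_factory=list)

    def sources(self) -> List[Union[Organism, Biosample]]:
        """Return the distinct Organisms/Biosamples used in this study."""
        seen: List[Union[Organism, Biosample]] = []
        for sp in self.sequencing_projects:
            if sp.source not in seen:
                seen.append(sp.source)
        return seen


# Invented example data: two metagenome runs from one biosample,
# combined into a single analysis project.
soil = Biosample("hypothetical soil sample")
sp1 = SequencingProject("metagenome run 1", soil)
sp2 = SequencingProject("metagenome run 2", soil)
combined = AnalysisProject("combined assembly", [sp1, sp2])
study = Study("hypothetical study", [sp1, sp2], [combined])
```

Modeling the source as either an Organism or a Biosample mirrors the abstract's distinction between isolates and environmental samples; everything else here is a simplification.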
8.
Genome Announc ; 4(5), 2016 Sep 08.
Article in English | MEDLINE | ID: mdl-27609918

ABSTRACT

Marinobacter sp. strain MCTG268 was isolated from the cosmopolitan marine diatom Skeletonema costatum and can degrade oil hydrocarbons as sole sources of carbon and energy. Here, we present the genome sequence of this strain, which is 4,449,396 bp with 4,157 genes and an average G+C content of 57.0%.

9.
Genome Announc ; 4(4), 2016 Aug 04.
Article in English | MEDLINE | ID: mdl-27491994

ABSTRACT

Arenibacter algicola strain TG409 was isolated from Skeletonema costatum and exhibits the ability to utilize polycyclic aromatic hydrocarbons as sole sources of carbon and energy. Here, we present the genome sequence of this strain, which is 5,550,230 bp with 4,722 genes and an average G+C content of 39.7%.

10.
BMC Genomics ; 17: 307, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27118214

ABSTRACT

BACKGROUND: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. RESULTS: Here, we present a different approach, relying on a tightly integrated method rather than a "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use the existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems, thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. CONCLUSION: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.


Subjects
Computational Biology/methods , Genome, Microbial , Genomics/methods , Molecular Sequence Annotation/methods , Software , Cooperative Behavior , Data Accuracy , Information Dissemination , Internet , User-Computer Interface
11.
Stand Genomic Sci ; 10: 45, 2015.
Article in English | MEDLINE | ID: mdl-26380633

ABSTRACT

BACKGROUND: In an effort to identify the best practice for finding genes in prokaryotic genomes and propose it as a standard for automated annotation pipelines, 1,004,576 peptides were collected from various publicly available resources, and were used as a basis to evaluate various gene-calling methods. The peptides came from 45 bacterial replicons with an average GC content from 31% to 74%, biased toward higher GC content genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene calling methods, as evidenced by peptides mapped outside the boundaries of called genes. RESULTS: We found that the consensus set of identical genes predicted by the three methods constitutes only about 70% of the genes predicted by each individual method (with start and stop required to coincide). Peptide data was useful for evaluating some of the differences between gene callers, but not reliable enough to make the results conclusive, due to limitations inherent in any proteogenomic study. CONCLUSIONS: A single, unambiguous, unanimous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of differences in the accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of a sufficient amount of experimental data to achieve a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.
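
The consensus measure mentioned in the abstract above (genes counted as identical only when start and stop coincide across all methods) can be illustrated with a short sketch. The tuple layout, function names, and coordinates below are assumptions made for the example; they are not taken from the study's software or data.

```python
from typing import Iterable, Set, Tuple

# A gene call is modeled as (replicon, strand, start, stop).
# This layout is an assumption made for illustration.
GeneCall = Tuple[str, str, int, int]


def consensus_calls(*caller_outputs: Iterable[GeneCall]) -> Set[GeneCall]:
    """Gene calls reported identically (same start and stop) by every caller."""
    sets = [set(calls) for calls in caller_outputs]
    if not sets:
        return set()
    return set.intersection(*sets)


def consensus_fraction(calls: Iterable[GeneCall], consensus: Set[GeneCall]) -> float:
    """Share of one caller's predictions that fall in the consensus set."""
    unique_calls = set(calls)
    return len(consensus) / len(unique_calls) if unique_calls else 0.0


# Invented coordinates for three hypothetical gene callers.
caller_a = [("rep1", "+", 100, 400), ("rep1", "+", 500, 900), ("rep1", "-", 1000, 1300)]
caller_b = [("rep1", "+", 100, 400), ("rep1", "+", 530, 900), ("rep1", "-", 1000, 1300)]
caller_c = [("rep1", "+", 100, 400), ("rep1", "+", 500, 900),
            ("rep1", "-", 1000, 1300), ("rep1", "+", 1500, 1800)]

shared = consensus_calls(caller_a, caller_b, caller_c)
```

Here the second gene drops out of the consensus because one caller picked a different start (530 instead of 500), even though all three agree on the stop. Exact-coordinate matching of this kind is strict, which is one reason a consensus can be much smaller than any single caller's output.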

12.
Genome Announc ; 3(4), 2015 Aug 06.
Article in English | MEDLINE | ID: mdl-26251504

ABSTRACT

Frankia sp. strain DC12, isolated from root nodules of Datisca cannabina, is a member of the fourth lineage of Frankia, which is unable to reinfect actinorhizal plants. Here, we report its 6.88-Mbp high-quality draft genome sequence, with a G+C content of 71.92% and 5,858 candidate protein-coding genes.

13.
Genome Announc ; 3(4), 2015 Jul 16.
Article in English | MEDLINE | ID: mdl-26184945

ABSTRACT

Halomonas sp. strain MCTG39a was isolated from coastal sea surface water based on its ability to utilize n-hexadecane. During growth in marine medium the strain produces an amphiphilic exopolymeric substance (EPS) amended with glucose, which emulsifies a variety of oil hydrocarbon substrates. Here, we present the genome sequence of this strain, which is 4,979,193 bp with 4,614 genes and an average G+C content of 55.0%.

14.
Genome Announc ; 3(3), 2015 Jun 18.
Article in English | MEDLINE | ID: mdl-26089431

ABSTRACT

Porticoccus hydrocarbonoclasticus strain MCTG13d is a recently discovered bacterium that is associated with marine eukaryotic phytoplankton and that almost exclusively utilizes polycyclic aromatic hydrocarbons (PAHs) as the sole source of carbon and energy. Here, we present the genome sequence of this strain, which is 2,474,654 bp with 2,385 genes and has an average G+C content of 53.1%.

15.
Genome Announc ; 3(3), 2015 May 14.
Article in English | MEDLINE | ID: mdl-25977428

ABSTRACT

The genus Caldicellulosiruptor contains extremely thermophilic, cellulolytic bacteria capable of lignocellulose deconstruction. Currently, complete genome sequences for eleven Caldicellulosiruptor species are available. Here, we report genome sequences for three additional Caldicellulosiruptor species: Rt8.B8 DSM 8990 (New Zealand), Wai35.B1 DSM 8977 (New Zealand), and "Thermoanaerobacter cellulolyticus" strain NA10 DSM 8991 (Japan).

16.
Genome Announc ; 3(2), 2015 Mar 26.
Article in English | MEDLINE | ID: mdl-25814607

ABSTRACT

Polycyclovorans algicola strain TG408 is a recently discovered bacterium associated with marine eukaryotic phytoplankton and exhibits the ability to utilize polycyclic aromatic hydrocarbons (PAHs) almost exclusively as sole sources of carbon and energy. Here, we present the genome sequence of this strain, which is 3,653,213 bp, with 3,477 genes and an average G+C content of 63.8%.

17.
Genome Announc ; 2(5), 2014 Oct 16.
Article in English | MEDLINE | ID: mdl-25323723

ABSTRACT

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes.

18.
Stand Genomic Sci ; 9(3): 1089-104, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197485

ABSTRACT

Clostridium indolis DSM 755(T) is a bacterium commonly found in soils and the feces of birds and mammals. Despite its prevalence, little is known about the ecology or physiology of this species. However, close relatives, C. saccharolyticum and C. hathewayi, have demonstrated interesting metabolic potentials related to plant degradation and human health. The genome of C. indolis DSM 755(T) reveals an abundance of genes in functional groups associated with the transport and utilization of carbohydrates, as well as citrate, lactate, and aromatics. Ecologically relevant gene clusters related to nitrogen fixation and a unique type of bacterial microcompartment, the CoAT BMC, are also detected. Our genome analysis suggests hypotheses to be tested in future culture based work to better understand the physiology of this poorly described species.

19.
Genome Announc ; 1(5), 2013 Oct 24.
Article in English | MEDLINE | ID: mdl-24158554

ABSTRACT

Variovorax paradoxus is a ubiquitous betaproteobacterium involved in plant growth promotion, the degradation of xenobiotics, and quorum-quenching activity. The genome of V. paradoxus strain EPS consists of a single circular chromosome of 6,550,056 bp, with a 66.48% G+C content.

20.
Stand Genomic Sci ; 7(3): 469-82, 2013.
Article in English | MEDLINE | ID: mdl-24019993

ABSTRACT

Nitrosomonas sp. Is79 is a chemolithoautotrophic ammonia-oxidizing bacterium that belongs to the family Nitrosomonadaceae within the phylum Proteobacteria. Ammonia oxidation is the first step of nitrification, an important process in the global nitrogen cycle ultimately resulting in the production of nitrate. Nitrosomonas sp. Is79 is an ammonia oxidizer of high interest because it is adapted to low ammonium and can be found in freshwater environments around the world. The 3,783,444-bp chromosome with a total of 3,553 protein coding genes and 44 RNA genes was sequenced by the DOE-Joint Genome Institute Program CSP 2006.

...