Results 1 - 4 of 4
1.
Nat Commun ; 14(1): 919, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36808136

ABSTRACT

Cohort-wide sequencing studies have revealed that the largest category of variants is those deemed 'rare', even for the subset located in coding regions (99% of known coding variants are seen in less than 1% of the population). Associative methods give some understanding of how rare genetic variants influence disease and organism-level phenotypes. But here we show that additional discoveries can be made through a knowledge-based approach using protein domains and ontologies (function and phenotype) that considers all coding variants regardless of allele frequency. We describe an ab initio, genetics-first method making molecular knowledge-based interpretations for exome-wide non-synonymous variants for phenotypes at the organism and cellular level. By using this reverse approach, we identify plausible genetic causes for developmental disorders that have eluded other established methods and present molecular hypotheses for the causal genetics of 40 phenotypes generated from a direct-to-consumer genotype cohort. This system offers a chance to extract further discovery from genetic data after standard tools have been applied.
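
The "rare" category mentioned in the abstract can be illustrated with a minimal sketch. It assumes the 1% figure is applied as a simple frequency cut-off; the `Variant` class, its field names, and the toy frequencies are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Assumption: "rare" means a frequency below 1%, following the abstract's wording.
RARE_THRESHOLD = 0.01


@dataclass(frozen=True)
class Variant:
    variant_id: str          # hypothetical identifier
    allele_frequency: float  # fraction between 0 and 1


def is_rare(variant: Variant, threshold: float = RARE_THRESHOLD) -> bool:
    """Return True when the variant's frequency is strictly below the threshold."""
    return variant.allele_frequency < threshold


def rare_fraction(variants: list[Variant]) -> float:
    """Fraction of the given variants that are classed as rare."""
    if not variants:
        return 0.0
    return sum(is_rare(v) for v in variants) / len(variants)


# Toy data, invented for illustration only.
variants = [
    Variant("v1", 0.0002),
    Variant("v2", 0.004),
    Variant("v3", 0.15),
    Variant("v4", 0.009),
]
print(rare_fraction(variants))  # 0.75
```

A knowledge-based approach of the kind the abstract describes would not discard either class; the cut-off here only shows how the rare/common split is usually drawn.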


Subjects
Exome; Genetic Predisposition to Disease; Humans; Phenotype; Genotype; Gene Frequency
2.
Nucleic Acids Res ; 43(Database issue): D227-33, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414345

ABSTRACT

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as: UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and--in the case of whole genomes--with enrichment analysis against a taxonomically defined background.
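
Enrichment analysis against a background, as mentioned at the end of the abstract, is commonly done with a one-sided hypergeometric test. The abstract does not say which statistic SUPERFAMILY uses, so the sketch below is an assumption, and the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    N: sequences in the background (e.g. a taxonomically defined set)
    K: background sequences annotated with the term
    n: sequences in the genome of interest
    k: sequences in the genome of interest annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts, invented for illustration only.
p = hypergeom_enrichment_p(k=3, n=3, K=3, N=10)
print(p)
```

With these toy counts the only qualifying draw is the one that picks all three annotated sequences, so the p-value equals 1/120. A real analysis would also correct for testing many terms at once.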


Subjects
Databases, Protein; Protein Structure, Tertiary; Gene Ontology; Molecular Sequence Annotation; Phylogeny; Proteins/classification; Proteins/genetics; Proteome/chemistry; Sequence Analysis, Protein
3.
Mol Biol Evol ; 31(6): 1364-74, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24692656

ABSTRACT

Humans are composed of hundreds of cell types. As the genomic DNA of each somatic cell is identical, cell type is determined by what is expressed and when. Until recently, little had been reported about the determinants of human cell identity, particularly from the joint perspective of gene evolution and expression. Here, we chart the evolutionary past of all documented human cell types via the collective histories of proteins, the principal product of gene expression. FANTOM5 data provide cell-type-specific digital expression of human protein-coding genes and the SUPERFAMILY resource is used to provide protein domain annotation. The evolutionary epoch in which each protein was created is inferred by comparison with domain annotation of all other completely sequenced genomes. Studying the distribution across epochs of genes expressed in each cell type reveals insights into human cellular evolution in terms of protein innovation. For each cell type, its history of protein innovation is charted based on the genes it expresses. Combining the histories of all cell types enables us to create a timeline of cell evolution. This timeline identifies the possibility that our common ancestor Coelomata (cavity-forming animals) provided the innovation required for the innate immune system, whereas cells which now form the human brain have followed a trajectory of continually accumulating novel proteins since Opisthokonta (boundary of animals and fungi). We conclude that exaptation of existing domain architectures into new contexts is the dominant source of cell-type-specific domain architectures.
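
The epoch inference described in the abstract can be sketched as follows. This is a toy reading of the idea, not the authors' pipeline: it assumes a domain architecture dates to the deepest point on the human lineage shared with any other genome that also contains it. The lineages are heavily simplified, and the architecture labels "A" to "D" are hypothetical.

```python
from collections import Counter

# Simplified human lineage, root first (assumption: many ranks omitted).
HUMAN_LINEAGE = [
    "cellular organisms", "Eukaryota", "Opisthokonta",
    "Metazoa", "Coelomata", "Homo sapiens",
]

# Toy genomes: (simplified lineage, set of hypothetical domain architectures).
GENOMES = {
    "Escherichia coli": (
        ["cellular organisms", "Bacteria", "Escherichia coli"],
        {"A"},
    ),
    "Saccharomyces cerevisiae": (
        ["cellular organisms", "Eukaryota", "Opisthokonta", "Fungi",
         "Saccharomyces cerevisiae"],
        {"A", "B"},
    ),
    "Drosophila melanogaster": (
        ["cellular organisms", "Eukaryota", "Opisthokonta", "Metazoa",
         "Coelomata", "Drosophila melanogaster"],
        {"A", "B", "C"},
    ),
}


def last_shared_index(lineage, reference):
    """Index in `reference` of the last taxon shared with `lineage`."""
    shared = 0
    for taxon, ref_taxon in zip(lineage, reference):
        if taxon != ref_taxon:
            break
        shared += 1
    if shared == 0:
        raise ValueError("lineages share no common root")
    return shared - 1


def infer_epoch(architecture, genomes=GENOMES, reference=HUMAN_LINEAGE):
    """Oldest point on the reference lineage at which the architecture is seen."""
    depths = [
        last_shared_index(lineage, reference)
        for lineage, architectures in genomes.values()
        if architecture in architectures
    ]
    # Found in no other genome: treat it as specific to the reference species.
    return reference[min(depths, default=len(reference) - 1)]


def epoch_distribution(expressed_architectures):
    """Count how many expressed architectures fall in each epoch."""
    return Counter(infer_epoch(a) for a in expressed_architectures)


print(infer_epoch("B"))  # Opisthokonta
print(epoch_distribution(["A", "B", "C", "C", "D"]))
```

Running `epoch_distribution` on the architectures expressed in each cell type gives one histogram per cell type, which is the kind of per-cell-type history the abstract combines into a timeline.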


Subjects
Evolution, Molecular; Phylogeny; Proteins/chemistry; Proteins/genetics; Eukaryotic Cells; Humans; Immunity, Innate; Protein Structure, Tertiary; Sequence Analysis, Protein; Transcriptome
4.
Sci Rep ; 3: 2015, 2013.
Article in English | MEDLINE | ID: mdl-23778980

ABSTRACT

We report a daily-updated sequenced/species Tree Of Life (sTOL) as a reference for the increasing number of cellular organisms with their genomes sequenced. The sTOL builds on a likelihood-based weight calibration algorithm to consolidate NCBI taxonomy information in concert with unbiased sampling of molecular characters from whole genomes of all sequenced organisms. Via quantifying the extent of agreement between taxonomic and molecular data, we observe there are many potential improvements that can be made to the status quo classification, particularly in the Fungi kingdom; we also see that the current state of many animal genomes is rather poor. To augment the use of sTOL in providing evolutionary contexts, we integrate an ontology infrastructure and demonstrate its utility for evolutionary understanding on: nuclear receptors, stem cells and eukaryotic genomes. The sTOL (http://supfam.org/SUPERFAMILY/sTOL) provides a binary tree of (sequenced) life, and contributes to an analytical platform linking genome evolution, function and phenotype.


Subjects
Databases, Genetic; Genome; Genomics; Phylogeny; Animals; Computational Biology/methods; Databases, Genetic/standards; Genomics/methods; Genomics/standards; Internet