1.
Genome Res ; 13(4): 693-702, 2003 Apr.
Article in English | MEDLINE | ID: mdl-12671005

ABSTRACT

With the increasing amount of available genome sequences, novel tools are needed for comprehensive analysis of species-specific sequence characteristics for a wide variety of genomes. We used an unsupervised neural network algorithm, a self-organizing map (SOM), to analyze di-, tri-, and tetranucleotide frequencies in a wide variety of prokaryotic and eukaryotic genomes. The SOM, which can cluster complex data efficiently, was shown to be an excellent tool for analyzing global characteristics of genome sequences and for revealing key combinations of oligonucleotides representing individual genomes. From analysis of 1- and 10-kb genomic sequences derived from 65 bacteria (a total of 170 Mb) and from 6 eukaryotes (460 Mb), clear species-specific separations of major portions of the sequences were obtained with the di-, tri-, and tetranucleotide SOMs. The unsupervised algorithm could recognize, in most 10-kb sequences, the species-specific characteristics (key combinations of oligonucleotide frequencies) that are signature features of each genome. We were able to classify DNA sequences within one and between many species into subgroups that corresponded generally to biological categories. Because the classification power is very high, the SOM is an efficient and fundamental bioinformatic strategy for extracting a wide range of genomic information from a vast amount of sequences.
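The clustering step described above can be illustrated with a minimal self-organizing map in plain Python. This is only a sketch under stated assumptions: the abstract does not give the map size, initialization, or learning schedule, so the grid dimensions, the linear decay of the learning rate and neighborhood radius, and the function names `best_matching_unit` and `train_som` are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the grid node whose vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, x))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length vectors (illustrative settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # weights[r][c] is the reference vector attached to grid node (r, c)
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1 - frac), 0.5)   # neighborhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighborhood
                    w = weights[r][c]
                    for j in range(dim):
                        w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights
```

After training, each input vector (for example, the oligonucleotide frequencies of one 10-kb sequence) is assigned to its best-matching node, and nodes can then be inspected for which species' sequences they attract.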


Subjects
Computational Biology/methods, Computational Biology/statistics & numerical data, Genome, Animals, Base Composition/genetics, Chromosome Mapping/statistics & numerical data, Cluster Analysis, Molecular Evolution, Archaeal Genome, Bacterial Genome, Fungal Genome, Human Genome, Plant Genome, Genomics/methods, Genomics/statistics & numerical data, Humans, Interspersed Repetitive Sequences/genetics, Oligonucleotides/genetics, Phylogeny, Species Specificity
2.
Hum Mol Genet ; 11(1): 13-21, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11772995

ABSTRACT

The completion of the human genome sequence will greatly accelerate development of a new branch of bioscience and provide fundamental knowledge to biomedical research. We used the sequence information to measure replication timing of the entire lengths of human chromosomes 11q and 21q. Megabase-sized zones that replicate early or late in S phase (thus early/late transition) were defined at the sequence level. Early zones were more GC-rich and gene-rich than were late zones, and early/late transitions occurred primarily at positions identical to or near GC% transitions. We also found the single nucleotide polymorphism (SNP) frequency was high in the late-replicating and replication-transition regions. In the early/late transition regions, concentrated occurrence of cancer-related genes that include CCND1 encoding cyclin D1 (BCL1), FGF4 (KFGF), TIAM1 and FLI1, was observed. The transition regions contained other disease-related genes including APP associated with familial Alzheimer's disease (AD1), SOD1 associated with familial amyotrophic lateral sclerosis (ALS1) and PTS associated with phenylketonuria. These findings are discussed with respect to the prediction that increased DNA damage occurs in replication-transition regions. We propose that genome-wide assessment of replication timing serves as an efficient strategy for identifying disease-related genes.
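The GC% transitions mentioned above can be made concrete with a small windowed GC-content profile. The sketch below is an illustration only: the abstract does not state the window size or how transitions were located, so the function names (`gc_percent`, `gc_profile`, `largest_gc_shift`) and the "largest change between adjacent windows" heuristic are assumptions, not the authors' procedure.

```python
def gc_percent(seq):
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt if acgt else 0.0


def gc_profile(seq, window, step):
    """List of (start, GC%) pairs for windows of the given size and step."""
    return [(start, gc_percent(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


def largest_gc_shift(profile):
    """Return (position, delta) for the biggest GC% change between adjacent windows."""
    best = None
    for (_, prev), (pos, cur) in zip(profile, profile[1:]):
        delta = cur - prev
        if best is None or abs(delta) > abs(best[1]):
            best = (pos, delta)
    return best
```

For megabase-scale zones, `window` and `step` would be set far larger than in a toy example; positions returned by `largest_gc_shift` could then be compared against measured early/late replication boundaries.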


Subjects
Alzheimer Disease/genetics, Human Chromosomes Pair 11/genetics, Human Chromosomes Pair 21/genetics, DNA Replication/genetics, Neoplasms/genetics, S Phase/genetics, Chromosome Mapping, Cytosine, Tumor Suppressor Genes, Human Genome, Guanosine, Humans, Oncogenes, Polymerase Chain Reaction, DNA Sequence Analysis, Time Factors
3.
Genome Inform ; 13: 12-20, 2002.
Article in English | MEDLINE | ID: mdl-14571370

ABSTRACT

With the increasing amount of available genome sequences, novel tools are needed for comprehensive analysis of species-specific sequence characteristics for a wide variety of genomes. We used an unsupervised neural network algorithm, Kohonen's self-organizing map (SOM), to analyze di- and trinucleotide frequencies in 9 eukaryotic genomes of known sequences (a total of 1.2 Gb); S. cerevisiae, S. pombe, C. elegans, A. thaliana, D. melanogaster, Fugu, and rice, as well as P. falciparum chromosomes 2 and 3, and human chromosomes 14, 20, 21, and 22, that have been almost completely sequenced. Each genomic sequence with different window sizes was encoded as a 16- and 64-dimensional vector giving relative frequencies of di- and trinucleotides, respectively. From analysis of a total of 120,000 nonoverlapping 10-kb sequences and overlapping 100-kb sequences with a moving step size of 10 kb, derived from a total of the 1.2 Gb genomic sequences, clear species-specific separations of most sequences were obtained with the SOMs. The unsupervised algorithm could recognize, in most of the 120,000 10-kb sequences, the species-specific characteristics (key combinations of oligonucleotide frequencies) that are signature representations of each genome. Because the classification power is very high, the SOMs can provide fundamental bioinformatic strategies for extracting a wide range of genomic information that could not otherwise be obtained.
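The encoding described above, in which each window becomes a 16- or 64-dimensional vector of relative di- or trinucleotide frequencies, can be sketched as follows. This is a minimal illustration with assumed details: the function names `kmer_frequencies` and `windows` are hypothetical, words containing ambiguous bases are simply skipped, and only one strand is counted, none of which is specified in the abstract.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k k-mers, in lexicographic ACGT order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other ambiguity codes
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def windows(seq, size, step):
    """Yield windows of `size` bases; step == size gives nonoverlapping windows."""
    for start in range(0, len(seq) - size + 1, step):
        yield seq[start:start + size]
```

With `size=10_000, step=10_000` the generator yields nonoverlapping 10-kb sequences, and with `size=100_000, step=10_000` it yields overlapping 100-kb sequences moved in 10-kb steps; `k=2` and `k=3` give the 16- and 64-dimensional vectors, respectively.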


Subjects
Computational Biology/methods, DNA Sequence Analysis/methods, Animals, Chromosome Mapping/methods, Statistical Data Interpretation, Humans, Oligonucleotides/genetics