Results 1 - 20 of 21
1.
Genome Res ; 11(11): 1952-7, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11691860

ABSTRACT

We have developed a computer program that aligns spliced sequences to genomic sequences, using local alignment algorithms and heuristics to put together a global spliced alignment. Spidey can produce reliable alignments quickly, even when confronted with noise from alternative splicing, polymorphisms, sequencing errors, or evolutionary divergence. We show how Spidey was used to align reference sequences to known genomic sequences and then to the draft human genome, to align mRNAs to gene clusters, and to align mouse mRNAs to human genomic sequence. We compared Spidey to two other spliced alignment programs; Spidey generally performed quite well in a very reasonable amount of time.


Subject(s)
Algorithms, DNA/genetics, RNA Splicing, Messenger RNA/genetics, Sequence Alignment/methods, Software, Animals, Human Genome, Humans, Mice, Multigene Family/genetics, Species Specificity
3.
Nucleic Acids Res ; 28(1): 15-8, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592170

ABSTRACT

The GenBank(R) sequence database incorporates publicly available DNA sequences of >55 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (Web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping and protein structure information, plus the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of WWW retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov


Subject(s)
Factual Databases, Animals, DNA/chemistry, DNA/genetics, Expressed Sequence Tags, Genome, Humans, National Library of Medicine (U.S.), Sequence Tagged Sites, United States
4.
Bioinformatics ; 15(7-8): 536-43, 1999.
Article in English | MEDLINE | ID: mdl-10487861

ABSTRACT

MOTIVATION: The large amount of genome sequence data now publicly available can be accessed through the National Center for Biotechnology Information (NCBI) Entrez search and retrieval system, making it possible to explore data of a breadth and scope exceeding traditional flatfile views. RESULTS: Here we report recent improvements for completely sequenced genomes from viruses, bacteria, and yeast. Flexible web based views, precomputed relationships, and immediate access to analytical tools provide scientists with a portal into the new insights to be gained from completed genome sequences. AVAILABILITY: Entrez Genomes can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/Entrez/Genome/org.html.


Subject(s)
Factual Databases, Genome, Internet, Amino Acid Sequence, Chromosomes/genetics, Archaeal Genome, Bacterial Genome, Fungal Genome, Viral Genome, Molecular Sequence Data, National Library of Medicine (U.S.), Open Reading Frames, Proteins/classification, Proteins/genetics, Sequence Alignment, United States
5.
Genome Res ; 9(1): 91-8, 1999 Jan.
Article in English | MEDLINE | ID: mdl-9927488

ABSTRACT

KARIBIN is a karyotypic region-based integrated information resource that provides a comprehensive view of the integrated mapping and sequencing data for the human genome. A cytogenetic band is linked to a genetic or physical location using fluorescence in situ hybridization (FISH) mapping data. The genetic, physical mapping data and the sequencing data are integrated using STS markers positioned on multiple maps. For each cytogenetic band, the user can obtain the most up-to-date information that includes genetic and physical maps, human transcript gene map, YAC and PAC/BAC clone coverage, disease gene phenotype, and high throughput genomic sequences from the major human genome sequencing centers. This information provides a framework for future experiments and may accelerate the process of disease gene hunting. It is envisioned that other cytogenetic-based information such as chromosome aberrations can be linked to this framework.


Subject(s)
Chromosome Banding, Factual Databases, Genomic Library, Online Systems, Human Chromosome Pair 7, Molecular Cloning, Human Genome, Humans, Internet, Karyotyping, Sequence Tagged Sites
6.
Nucleic Acids Res ; 27(1): 12-7, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847132

ABSTRACT

The GenBank(R) sequence database incorporates DNA sequences from all available public sources, primarily through the direct submission of sequence data from individual laboratories and from large-scale sequencing projects. Most submitters use the BankIt (Web) or Sequin programs to format and send sequence data. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome and protein structure information. MEDLINE(R) abstracts from published articles describing the sequences are included as an additional source of biological annotation through the PubMed search system. Sequence similarity searching is offered through the BLAST series of database search programs. In addition to FTP, Email, and server/client versions of Entrez and BLAST, NCBI offers a wide range of World Wide Web retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the URL: http://www.ncbi.nlm.nih.gov


Subject(s)
Factual Databases, Genome, Information Storage and Retrieval, Amino Acid Sequence, Animals, Base Sequence, Classification, Expressed Sequence Tags, Gene Library, Humans, Internet, National Library of Medicine (U.S.), Proteins/genetics, Sequence Homology, Sequence Tagged Sites, United States
9.
Nucleic Acids Res ; 26(1): 1-7, 1998 Jan 01.
Article in English | MEDLINE | ID: mdl-9399790

ABSTRACT

The GenBank(R) sequence database (http://www.ncbi.nlm.nih.gov/) incorporates DNA sequences from all available public sources, primarily through the direct submission of sequence data from individual laboratories and from large-scale sequencing projects. Most submitters use the BankIt (WWW) or Sequin programs to send their sequence data. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome and protein structure information. MEDLINE(R) abstracts from published articles describing the sequences are also included as an additional source of biological annotation. Sequence similarity searching is offered through the BLAST series of database search programs. In addition to FTP, e-mail and server/client versions of Entrez and BLAST, NCBI offers a wide range of World Wide Web retrieval and analysis services of interest to biologists.


Subject(s)
Factual Databases, National Library of Medicine (U.S.), Animals, Base Sequence, Computer Communication Networks, DNA, Humans, Information Storage and Retrieval, United States
10.
Comput Appl Biosci ; 13(1): 75-80, 1997 Feb.
Article in English | MEDLINE | ID: mdl-9088712

ABSTRACT

RESULTS: We have produced a computer program, named sim3, that solves the following computational problem. Two DNA sequences are given, where the shorter sequence is very similar to some contiguous region of the longer sequence. Sim3 determines such a similar region of the longer sequence, and then computes an optimal set of single-nucleotide changes (i.e. insertions, deletions or substitutions) that will convert the shorter sequence to that region. Thus, the alignment scoring scheme is designed to model sequencing errors, rather than evolutionary processes. The program can align a 100 kb sequence to a 1 megabase sequence in a few seconds on a workstation, provided that there are very few differences between the shorter sequence and some region in the longer sequence. The program has been used to assemble sequence data for the Genomes Division at the National Center for Biotechnology Information.


Subject(s)
DNA/genetics, Sequence Alignment/methods, Software, Aldehyde Dehydrogenase/genetics, Algorithms, Amino Acid Sequence, Base Sequence, Computer Communication Networks, Evaluation Studies as Topic, Human Genome, Humans, Genetic Models, Molecular Sequence Data, Sequence Alignment/statistics & numerical data, Amino Acid Sequence Homology, Nucleic Acid Sequence Homology
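
The problem stated in this abstract (locate the region of the longer sequence most similar to the shorter one, counting insertions, deletions and substitutions) can be illustrated with a textbook approximate-matching dynamic program. This is a minimal sketch under that assumption, not sim3's own algorithm: its running time is proportional to the product of the two lengths, so it would not reach the speeds the abstract reports. The function name `find_similar_region` is hypothetical.

```python
def find_similar_region(short: str, long: str) -> tuple[int, int]:
    """Return (edits, end): the minimum number of single-nucleotide edits
    needed to turn `short` into some contiguous region of `long`, and the
    exclusive end index of the first region that reaches that minimum."""
    m = len(short)
    # col[i] = fewest edits aligning short[:i] to a region of `long`
    # ending at the current position (initially: the empty region).
    col = list(range(m + 1))
    best_edits, best_end = col[m], 0
    for j, base in enumerate(long, start=1):
        new = [0] * (m + 1)  # new[0] = 0: a region may start anywhere
        for i in range(1, m + 1):
            substitution = col[i - 1] + (0 if short[i - 1] == base else 1)
            skip_long = new[i] if False else col[i] + 1   # base of `long` left unmatched
            skip_short = new[i - 1] + 1                   # base of `short` left unmatched
            new[i] = min(substitution, skip_long, skip_short)
        col = new
        if col[m] < best_edits:
            best_edits, best_end = col[m], j
    return best_edits, best_end


if __name__ == "__main__":
    print(find_similar_region("ACGT", "TTACGTTT"))
```

A score that counts unit-cost edits matches the abstract's stated aim of modelling sequencing errors rather than evolutionary change; recovering the actual edit list would additionally require a traceback, which is omitted here.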
11.
Nucleic Acids Res ; 25(1): 1-6, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016491

ABSTRACT

The GenBank sequence database incorporates DNA sequences from all available public sources, primarily through the direct submission of sequence data from authors and from large-scale sequencing projects. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive coverage. GenBank continues to focus on quality control and annotation while expanding data coverage and retrieval services. An integrated retrieval system, known as Entrez, incorporates data from the major DNA and protein sequence databases, along with genome maps and protein structure information. MEDLINE abstracts from published articles describing the sequences are also included as an additional source of biological annotation. Sequence similarity searching is offered through the BLAST family of programs. All of NCBI's services are offered through the World Wide Web. In addition, there are specialized server/client versions as well as FTP and e-mail server access.


Subject(s)
Base Sequence, Factual Databases, Amino Acid Sequence, Animals, Computer Communication Networks, Humans, National Library of Medicine (U.S.), Software, United States
12.
Nucleic Acids Res ; 24(1): 1-5, 1996 Jan 01.
Article in English | MEDLINE | ID: mdl-8594554

ABSTRACT

The GenBank sequence database continues to expand its data coverage, quality control, annotation content and retrieval services. GenBank is comprised of DNA sequences submitted directly by authors as well as sequences from the other major public databases. An integrated retrieval system, known as Entrez, contains data from GenBank and from the major protein sequence and structural databases, as well as related MEDLINE abstracts. Users may access GenBank over the Internet through the World Wide Web and through special client-server programs for text and sequence similarity searching. FTP, CD-ROM and e-mail servers are alternate means of access.


Subject(s)
Base Sequence, Factual Databases, CD-ROM, Computer Communication Networks, Molecular Models, Systems Integration
13.
Comput Appl Biosci ; 11(2): 147-53, 1995 Apr.
Article in English | MEDLINE | ID: mdl-7620986

ABSTRACT

This paper presents a practical program, called sim2, for building local alignments of two sequences, each of which may be hundreds of kilobases long. sim2 first constructs n best non-intersecting chains of 'fragments', such as all occurrences of identical 5-tuples in each of two DNA sequences, for any specified n ≥ 1. Each chain is then refined by delivering an optimal alignment in a region delimited by the chain. sim2 requires only space proportional to the size of the input sequences and the output alignments, and the same source code runs on Unix machines, on Macintoshes, on PCs, and on DEC Alpha PCs. We also describe an application of sim2 for aligning long DNA sequences from Escherichia coli. sim2 facilitates contig-building by providing a complete view of the related sequences, so differences can be analyzed and inconsistencies resolved. Examples are shown using the alignment display and editing functions from the software tool ChromoScope.


Subject(s)
DNA/genetics, Sequence Alignment, Software, Amino Acid Sequence, Base Sequence, Molecular Sequence Data
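
The fragment-chaining idea in this abstract can be sketched as follows. This is an illustration under simplifying assumptions, not sim2's implementation: it finds only the single best chain rather than the n best non-intersecting chains, scores a chain simply by the number of bases its fragments cover, uses a quadratic-time search over fragments, and omits the refinement step that turns a chain into an optimal alignment. The names `shared_ktuples` and `best_chain` are hypothetical.

```python
from collections import defaultdict


def shared_ktuples(a: str, b: str, k: int = 5) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs where a[i:i+k] == b[j:j+k]."""
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    fragments = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            fragments.append((i, j))
    return sorted(fragments)


def best_chain(fragments: list[tuple[int, int]], k: int = 5) -> list[tuple[int, int]]:
    """Pick the highest-scoring chain of fragments in which each fragment
    starts at or after the end of the previous one in both sequences.
    `fragments` must be sorted, as returned by shared_ktuples."""
    if not fragments:
        return []
    score = [k] * len(fragments)   # best chain score ending at each fragment
    back = [-1] * len(fragments)   # predecessor index, -1 for chain start
    for t, (i, j) in enumerate(fragments):
        for s in range(t):
            pi, pj = fragments[s]
            if pi + k <= i and pj + k <= j and score[s] + k > score[t]:
                score[t] = score[s] + k
                back[t] = s
    t = max(range(len(fragments)), key=score.__getitem__)
    chain = []
    while t != -1:
        chain.append(fragments[t])
        t = back[t]
    return chain[::-1]


if __name__ == "__main__":
    a = "AAAAACCCCCGGGGG"
    b = "AAAAATTCCCCCTGGGGG"
    print(best_chain(shared_ktuples(a, b)))
```

In this toy input the three shared 5-tuples are separated in `b` by short insertions, so the chain links them across the gaps; a refinement step would then align only the regions the chain delimits.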
14.
Article in English | MEDLINE | ID: mdl-7584445

ABSTRACT

We present an exchange specification for data describing the three-dimensional structure of biological macromolecules. The specification was designed for MMDB, a Molecular Modeling Database supported by the National Center for Biotechnology Information (NCBI), based on information from the Protein Data Bank (PDB). In the MMDB specification, the chemical structures of molecules are described hierarchically as connectivity graphs, to directly support comparison by subgraph isomorphism or assignment algorithms. Three-dimensional coordinates are linked unambiguously to nodes in the chemical graph, so that homology-derived structures may be generated directly from alignment of chemically similar groups. In conversion to this form, data from PDB are extensively validated, so as to provide a description of chemical and spatial structure that is as accurate as possible. These changes in format and content of the known structure data are intended to support development of intelligent molecular modeling applications that make use of this invaluable information resource.


Subject(s)
Factual Databases, Molecular Models, Proteins/chemistry, Amino Acid Sequence, Protein Conformation, Amino Acid Sequence Homology, Software
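
The central idea of this abstract, a chemical connectivity graph whose nodes carry unambiguous links to three-dimensional coordinates, can be pictured with a small data structure. This is a minimal sketch and not the MMDB exchange specification itself: the class and field names are hypothetical, the hierarchy is flattened to a single molecule level, and the coordinates in the usage example are illustrative values only.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """A chemical graph: atoms are nodes, bonds are undirected edges."""
    name: str
    elements: dict[int, str] = field(default_factory=dict)   # atom id -> element
    bonds: set[frozenset[int]] = field(default_factory=set)  # pairs of atom ids
    # Coordinates are keyed by the same atom ids as the graph nodes,
    # so every (x, y, z) triple maps to exactly one node.
    coords: dict[int, tuple[float, float, float]] = field(default_factory=dict)

    def add_atom(self, atom_id: int, element: str, xyz=None) -> None:
        self.elements[atom_id] = element
        if xyz is not None:
            self.coords[atom_id] = xyz

    def add_bond(self, a: int, b: int) -> None:
        if a == b or a not in self.elements or b not in self.elements:
            raise ValueError("a bond needs two distinct, existing atoms")
        self.bonds.add(frozenset((a, b)))

    def neighbors(self, atom_id: int) -> list[int]:
        """Atom ids bonded to `atom_id`, in ascending order."""
        return sorted(
            other
            for bond in self.bonds if atom_id in bond
            for other in bond if other != atom_id
        )


if __name__ == "__main__":
    water = Molecule("water")
    water.add_atom(1, "O", (0.0, 0.0, 0.0))
    water.add_atom(2, "H", (0.96, 0.0, 0.0))
    water.add_atom(3, "H", (-0.24, 0.93, 0.0))
    water.add_bond(1, 2)
    water.add_bond(1, 3)
    print(water.neighbors(1), water.coords[2])
```

Keeping the graph and the coordinates in separate maps that share atom ids means the chemistry can be compared (for example by subgraph matching) without touching the geometry, and several coordinate sets could in principle be attached to one graph.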
15.
Nucleic Acids Res ; 22(17): 3441-4, 1994 Sep.
Article in English | MEDLINE | ID: mdl-7937042

ABSTRACT

The GenBank sequence database continues to expand its data coverage, quality control, annotation content and retrieval services for the scientific community. Besides handling direct submissions of sequence data from authors, GenBank also incorporates DNA sequences from all available public sources; an integrated retrieval system, known as Entrez, also makes available data from the major protein sequence and structural databases, and from U.S. and European patents. MEDLINE abstracts from published articles describing the sequences are also included as an additional source of biological annotation for sequence entries. GenBank supports distribution of the data via FTP, CD-ROM, and E-mail servers. Network server-client programs provide access to an integrated database for literature retrieval and sequence similarity searching.


Subject(s)
Amino Acid Sequence, Base Sequence, Factual Databases, CD-ROM, Computer Communication Networks, National Library of Medicine (U.S.), Software, United States
16.
Nucleic Acids Res ; 21(13): 2963-5, 1993 Jul 01.
Article in English | MEDLINE | ID: mdl-8332518

ABSTRACT

The GenBank sequence database has undergone an expansion in data coverage, annotation content and the development of new services for the scientific community. In addition to nucleotide sequences, data from the major protein sequence and structural databases, and from U.S. and European patents is now included in an integrated system. MEDLINE abstracts from published articles describing the sequences provide an important new source of biological annotation for sequence entries. In addition to the continued support of existing services, new CD-ROM and network-based systems have been implemented for literature retrieval and sequence similarity searching. Major releases of GenBank are now more frequent and the data are distributed in several new forms for both end users and software developers.


Subject(s)
Amino Acid Sequence, Base Sequence, DNA/genetics, Factual Databases, Information Storage and Retrieval, National Library of Medicine (U.S.), United States
17.
Comput Appl Biosci ; 8(6): 563-7, 1992 Dec.
Article in English | MEDLINE | ID: mdl-1468012

ABSTRACT

We present a relational database program developed in FoxBase+/Mac for the viewing and manipulation of ordered restriction maps and associated features of the Escherichia coli genome including sequenced genes and the Kohara miniset of bacteriophage lambda clones. Use of this program allows easy access to the wealth of information being collected in a dataset of DNA sequences, maps and genetic data known as EcoSeq, EcoMap and EcoGene respectively.


Subject(s)
Escherichia coli/genetics, Bacterial Genome, Microcomputers, Restriction Mapping, Software, Algorithms, User-Computer Interface
18.
Nucleic Acids Res ; 19(3): 637-47, 1991 Feb 11.
Article in English | MEDLINE | ID: mdl-2011534

ABSTRACT

Methods are presented for organizing and integrating DNA sequence data, restriction maps, and genetic maps for the same organism but from a variety of sources (databases, publications, personal communications). Proper software tools are essential for successful organization of such diverse data into an ordered, cohesive body of information, and a suite of novel software to support this endeavor is described. Though these tools automate much of the task, a variety of strategies is needed to cope with recalcitrant cases. We describe such strategies and illustrate their application with numerous examples. These strategies have allowed us to order, analyze, and display over one megabase of E. coli DNA sequence information. The integration task often exposes inconsistencies in the available data, perhaps caused by strain polymorphisms or human oversight, necessitating the application of sound biological judgment. The examples illustrate both the level of expertise required of the database curator and the knowledge gained as apparent inconsistencies are resolved. The software and mapping methods are applicable to the study of any genome for which a high resolution restriction map is available. They were developed to support a weakly coordinated sequencing effort involving many laboratories, but would also be useful for highly orchestrated sequencing projects.


Subject(s)
Chromosome Mapping, Bacterial DNA/genetics, Factual Databases, Escherichia coli/genetics, Bacterial Genes, Base Sequence, Restriction Mapping, Software
19.
Comput Appl Biosci ; 6(3): 247-52, 1990 Jul.
Article in English | MEDLINE | ID: mdl-2207749

ABSTRACT

This paper presents an algorithm that searches a DNA restriction enzyme map for regions that approximately match a shorter 'probe' map. Both the map and the probe consist of a sequence of address-enzyme pairs denoting restriction sites, and the algorithm penalizes a potential match for undetected or missing sites and for discrepancies in the distance between adjacent sites. The algorithm was designed specifically for comparing relatively short DNA sequences with a long restriction map, a problem that will become increasingly common as large physical maps are generated. The algorithm has been used to extract information from a restriction map of the entire Escherichia coli genome.


Subject(s)
Algorithms, Restriction Mapping, User-Computer Interface
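
The kind of scoring this abstract describes (a penalty for sites left unmatched plus a penalty for disagreement in the spacing of adjacent matched sites) can be sketched with a small dynamic program. This is an illustration only: the penalty values, the linear distance term and the brute-force search are assumptions made here, not the published algorithm, and the name `match_probe` is hypothetical. Sites are `(address, enzyme)` pairs sorted by address.

```python
def match_probe(probe, rmap, skip_penalty=1.0, dist_weight=0.01):
    """Return (penalty, j): the lowest total penalty for matching `probe`
    against `rmap`, and the index of the last matched site in `rmap`.
    Returns (inf, None) if no site of the probe can be matched."""
    m, n = len(probe), len(rmap)
    inf = float("inf")
    # best[i][j] = lowest penalty of a match whose last matched pair is
    # probe site i and map site j.
    best = [[inf] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if probe[i][1] != rmap[j][1]:
                continue  # only sites of the same enzyme can be paired
            cost = skip_penalty * i  # match starts here; earlier probe sites unmatched
            for pi in range(i):
                for pj in range(j):
                    if best[pi][pj] == inf:
                        continue
                    # Disagreement between the two inter-site distances.
                    gap = abs((probe[i][0] - probe[pi][0]) - (rmap[j][0] - rmap[pj][0]))
                    # Sites passed over in either map between the two pairs.
                    skipped = (i - pi - 1) + (j - pj - 1)
                    cost = min(cost, best[pi][pj] + skip_penalty * skipped + dist_weight * gap)
            best[i][j] = cost
    result = (inf, None)
    for i in range(m):
        for j in range(n):
            total = best[i][j] + skip_penalty * (m - 1 - i)  # trailing unmatched probe sites
            if total < result[0]:
                result = (total, j)
    return result


if __name__ == "__main__":
    rmap = [(100, "EcoRI"), (400, "BamHI"), (1000, "EcoRI"),
            (1300, "HindIII"), (1900, "BamHI"), (2500, "EcoRI")]
    probe = [(0, "EcoRI"), (300, "HindIII"), (900, "BamHI")]
    print(match_probe(probe, rmap))
```

Only differences between addresses enter the score, so the probe's own coordinate origin does not matter; the made-up enzyme names and positions above are there just to exercise the function.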