Results 1 - 6 of 6
3.
Nature ; 409(6822): 928-33, 2001 Feb 15.
Article in English | MEDLINE | ID: mdl-11237013

ABSTRACT

We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exons (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.


Subject(s)
Genetic Variation , Genome, Human , Polymorphism, Single Nucleotide , Chromosome Mapping , Genetics, Medical , Genetics, Population , Humans , Nucleotides
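
The density figure in the abstract above can be checked with a line of arithmetic. The sketch below uses only the two numbers quoted in the abstract (1.42 million SNPs, one SNP per 1.9 kb); the implied amount of surveyed sequence is an inference from those numbers, not a value stated in the record, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
snp_count = 1_420_000   # SNPs in the map
spacing_kb = 1.9        # average spacing: one SNP every 1.9 kb

# Implied amount of available sequence (an inference, not a quoted figure).
# 1 Gb = 1,000,000 kb.
implied_sequence_gb = snp_count * spacing_kb / 1_000_000

# The same density expressed as SNPs per kilobase.
snps_per_kb = 1 / spacing_kb

print(f"implied available sequence: {implied_sequence_gb:.2f} Gb")
print(f"density: {snps_per_kb:.3f} SNPs per kb")
```

The result, roughly 2.7 Gb, is consistent with the phrase "on available sequence": the density is averaged over the sequence that had been assembled at the time rather than over the whole genome.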
4.
Nat Genet ; 27(4): 371-2, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11279516

ABSTRACT

There is a concerted effort by a number of public and private groups to identify a large set of human single-nucleotide polymorphisms (SNPs). As of March 2001, 2.84 million SNPs have been deposited in the public database, dbSNP, at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/SNP/). The 2.84 million SNPs can be grouped into 1.65 million non-redundant SNPs. As part of the International SNP Map Working Group, we recently published a high-density SNP map of the human genome consisting of 1.42 million SNPs (ref. 3). In addition, numerous SNPs are maintained in proprietary databases. Our survey of more than 1,200 SNPs indicates that more than 80% of TSC and Washington University candidate SNPs are polymorphic and that approximately 50% of the candidate SNPs from these two sources are common SNPs (with minor allele frequency of ≥20%) in any given population.


Subject(s)
Polymorphism, Single Nucleotide , DNA/genetics , Humans , Polymerase Chain Reaction
5.
Nat Genet ; 23(4): 452-6, 1999 Dec.
Article in English | MEDLINE | ID: mdl-10581034

ABSTRACT

Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.


Subject(s)
Genetic Techniques , Polymorphism, Single Nucleotide , Algorithms , Alleles , Bayes Theorem , Data Interpretation, Statistical , Expressed Sequence Tags , Genetic Variation , Genome, Human , Humans , Sequence Alignment , Software
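
The abstract above describes using base quality values in a Bayesian calculation of the probability that a site is polymorphic. The following is a minimal sketch of that general idea for a single alignment column; it is not the POLYBAYES implementation. The function names, the uniform priors over bases, the equal-thirds error model, and the default prior of 0.001 are all assumptions made for illustration.

```python
from math import prod

BASES = "ACGT"


def error_prob(phred):
    """Convert a Phred quality score to a base-calling error probability."""
    return 10 ** (-phred / 10)


def base_likelihood(observed, true_base, phred):
    """P(observed call | true base), assuming errors hit the 3 other bases equally."""
    e = error_prob(phred)
    return 1 - e if observed == true_base else e / 3


def polymorphism_posterior(calls, prior_poly=0.001):
    """Posterior probability that one alignment column is polymorphic.

    calls      -- list of (observed_base, phred_quality), one entry per read
    prior_poly -- assumed prior probability of a polymorphic site (hypothetical)
    """
    n = len(calls)
    if n < 2:
        raise ValueError("need at least two aligned reads")

    # Sum over the four monomorphic assignments (every read carries base b).
    s_mono = sum(
        prod(base_likelihood(obs, b, q) for obs, q in calls) for b in BASES
    )
    like_mono = s_mono / 4

    # The likelihoods of all 4**n assignments sum to 1 under this error model,
    # so the remaining 4**n - 4 polymorphic assignments share 1 - s_mono.
    like_poly = (1 - s_mono) / (4 ** n - 4)

    numerator = prior_poly * like_poly
    return numerator / (numerator + (1 - prior_poly) * like_mono)


if __name__ == "__main__":
    agree = [("A", 30)] * 4
    split = [("A", 30), ("A", 30), ("G", 30), ("G", 30)]
    weak_mismatch = [("A", 30), ("A", 30), ("A", 30), ("G", 5)]
    for name, column in [("agree", agree), ("split", split), ("weak", weak_mismatch)]:
        print(name, polymorphism_posterior(column))
```

In this sketch a mismatch supported by a low quality score contributes little evidence for a polymorphism, while the same mismatch at high quality contributes much more, which is the role the abstract assigns to base quality values.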
6.
Genome Res ; 8(3): 260-7, 1998 Mar.
Article in English | MEDLINE | ID: mdl-9521929

ABSTRACT

Large-scale genomic sequencing requires a software infrastructure to support and integrate applications that are not directly compatible. We describe a suite of software tools built around the Common Assembly Format (CAF), a comprehensive representation of a sequence assembly as a text file. These tools form the backbone of sequencing informatics at the Sanger Centre and the Genome Sequencing Center. The CAF format is intentionally flexible, and our Perl and C libraries, which parse and manipulate it, provide powerful tools for creating new applications as well as wrappers to incorporate other software. The tools are available free by anonymous FTP from ftp://ftp.sanger.ac.uk/pub/badger/.


Subject(s)
Base Sequence , Genome , Sequence Analysis, DNA/methods , Algorithms , Computational Biology/methods , Databases, Factual , Gene Library , Sequence Alignment