1.
Nature ; 409(6822): 934-41, 2001 Feb 15.
Article in English | MEDLINE | ID: mdl-11237014

ABSTRACT

The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.


Subject(s)
Contig Mapping , Genome, Human , Chromosomes, Artificial, Bacterial , Cloning, Molecular , DNA Fingerprinting , Gene Duplication , Humans , In Situ Hybridization, Fluorescence , Repetitive Sequences, Nucleic Acid
2.
Genome Res ; 8(10): 1074-84, 1998 Oct.
Article in English | MEDLINE | ID: mdl-9799794

ABSTRACT

The currently favored approach for sequencing the human genome involves selecting representative large-insert clones (100-200 kb), randomly shearing this DNA to construct shotgun libraries, and then sequencing many different isolates from the library. This method, entitled directed random shotgun sequencing, requires highly redundant sequencing to obtain a complete and accurate finished consensus sequence. Recently it has been suggested that a rapidly generated lower redundancy sequence might be of use to the scientific community. Low-redundancy sequencing has been examined previously using simulated data sets. Here we utilize trace data from a number of projects submitted to GenBank to perform reconstruction experiments that mimic low-redundancy sequencing. These low-redundancy sequences have been examined for the completeness and quality of the consensus product, information content, and usefulness for interspecies comparisons. The data presented here suggest three different sequencing strategies, each with different utilities. (1) Nearly complete sequence data can be obtained by sequencing a random shotgun library at sixfold redundancy. This may therefore represent a good point to switch from a random to directed approach. (2) Sequencing can be performed with as little as twofold redundancy to find most of the information about exons, EST hits, and putative exon similarity matches. (3) To obtain contiguity of coding regions, sequencing at three- to fourfold redundancy would be appropriate. From these results, we suggest that a useful intermediate product for genome sequencing might be obtained by three- to fourfold redundancy. Such a product would allow a large amount of biologically useful data to be extracted while postponing the majority of work involved in producing a high quality consensus sequence.
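
As a rough illustration of why higher redundancy yields a more complete sequence, the idealized Lander-Waterman model estimates the expected fraction of bases covered by at least one read as 1 - e^(-c) for c-fold redundancy. This is not the method used in the study, which relied on reconstruction experiments with real trace data. The sketch below assumes uniformly random read placement and ignores read length, repeats, and base quality; the function name is hypothetical.

```python
import math


def expected_fraction_covered(redundancy: float) -> float:
    """Idealized Lander-Waterman estimate of the fraction of bases
    covered by at least one read at the given fold redundancy.

    Assumption: reads start at uniformly random positions, so the
    number of reads over any base is approximately Poisson(redundancy).
    """
    if redundancy < 0:
        raise ValueError("redundancy must be non-negative")
    return 1.0 - math.exp(-redundancy)


if __name__ == "__main__":
    # Redundancy levels discussed in the abstract: twofold, three- to fourfold, sixfold.
    for c in (2, 3, 4, 6):
        frac = expected_fraction_covered(c)
        print(f"{c}-fold redundancy: {frac:.2%} of bases expected to be covered")
```

Under these simplifying assumptions, coverage rises steeply up to about fourfold redundancy and then flattens, which is broadly in line with the abstract's suggestion that three- to fourfold redundancy could serve as a useful intermediate product.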


Subject(s)
Gene Library , Sequence Analysis, DNA/methods , Animals , Contig Mapping , Expressed Sequence Tags , Genome, Human , Humans , Mice , Quality Control , Retrospective Studies , Sequence Analysis, DNA/standards
4.
Genome Res ; 8(1): 29-40, 1998 Jan.
Article in English | MEDLINE | ID: mdl-9445485

ABSTRACT

The Human Genome Project has created a formidable challenge: the extraction of biological information from extensive amounts of raw sequence. With the increasing availability of genomic sequence from other species, one approach to extracting coding and regulatory element information is through cross-species sequence comparison. To assess the strengths and weaknesses of this methodology for large-scale sequence analysis, 227 kb of mouse sequence syntenic to a gene-rich cluster on human chromosome 12p13 was obtained. Primarily through percent identity plots (PIPs) of SIM comparative sequence alignments, the sequence of coding regions, putative alternative exons, conserved noncoding regions, and correlation in repetitive element insertions were easily determined. The analysis demonstrated that the number, order, and orientation of all 17 genes are conserved between the two species, whereas two human pseudogenes are absent in mouse. In addition, apart from MIRs, no direct correlation of distribution or position of the majority of repetitive elements between the two species is seen. Finally, in examining the synonymous and nonsynonymous substitution rates in the conserved genes, a large variation in nonsynonymous rates is observed indicating that the genes in this region are diverging at different rates. This study indicates the utility and strength of large-scale cross-species sequence comparisons in the extraction of biological information from raw sequence, especially when combined with other computational tools such as GRAIL and BLAST.
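
The idea behind a percent identity plot can be sketched in a simplified form: given two already aligned sequences, compute the percentage of identical columns in successive windows and report it against position. The study's PIPs were built from SIM alignments, typically scored per gap-free aligned segment; the fixed-window version below is a stand-in for illustration only, and the function name, window size, and toy sequences are all hypothetical.

```python
from typing import List, Tuple


def windowed_percent_identity(
    aligned_a: str, aligned_b: str, window: int = 10
) -> List[Tuple[int, float]]:
    """Return (window_start, percent_identity) pairs for a pairwise alignment.

    Assumptions: both strings come from the same alignment (equal length,
    '-' marks a gap), and a gap column never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0:
        raise ValueError("window must be positive")

    results = []
    for start in range(0, len(aligned_a), window):
        columns = list(zip(aligned_a[start:start + window],
                           aligned_b[start:start + window]))
        matches = sum(1 for x, y in columns if x == y and x != "-")
        results.append((start, 100.0 * matches / len(columns)))
    return results


if __name__ == "__main__":
    # Toy aligned fragments, not real human or mouse sequence.
    human = "ACGTACGTAC--GTTA"
    mouse = "ACGTTCGTACGGGTTA"
    for start, pid in windowed_percent_identity(human, mouse, window=4):
        print(f"columns {start}-{start + 3}: {pid:.1f}% identity")
```

Plotting these values against position in the first sequence gives a picture in which conserved stretches such as exons stand out as runs of high identity.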


Subject(s)
Chromosomes, Human, Pair 12/genetics , Chromosomes/genetics , Multigene Family , Amino Acid Sequence/genetics , Animals , Chromosome Mapping , Conserved Sequence , Humans , Mice , Molecular Sequence Data , Repetitive Sequences, Nucleic Acid , Sequence Alignment , Sequence Analysis, DNA