Results 1 - 20 of 21
1.
Nucleic Acids Res ; 52(7): 3823-3836, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38421639

ABSTRACT

Alternative splicing and multiple transcription start and termination sites can produce a diverse repertoire of mRNA transcript variants from a given gene. While the full picture of the human transcriptome is still incomplete, publicly available RNA datasets have enabled the assembly of transcripts. Using publicly available deep sequencing data from 927 human samples across 48 tissues, we quantified known and new transcript variants, provide an interactive, browser-based application Splice-O-Mat and demonstrate its relevance using adhesion G protein-coupled receptors (aGPCRs) as an example. On average, 24 different transcript variants were detected for each of the 33 human aGPCR genes, and several dominant transcript variants were not yet annotated. Variable transcription starts and complex exon-intron structures encode a flexible protein domain architecture of the N- and C termini and the seven-transmembrane helix domain (7TMD). Notably, we discovered the first GPCR (ADGRG7/GPR128) with eight transmembrane helices. Both the N- and C terminus of this aGPCR were intracellularly oriented, anchoring the N terminus in the plasma membrane. Moreover, the assessment of tissue-specific transcript variants, also for other gene classes, in our application may change the evaluation of disease-causing mutations, as their position in different transcript variants may explain tissue-specific phenotypes.


Subject(s)
Alternative Splicing , High-Throughput Nucleotide Sequencing , Receptors, G-Protein-Coupled , Humans , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/metabolism , Receptors, G-Protein-Coupled/chemistry , Transcriptome/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Messenger/chemistry , Exons/genetics , Protein Domains
2.
Bioinformatics ; 31(5): 770-2, 2015 Mar 01.
Article in English | MEDLINE | ID: mdl-25359895

ABSTRACT

MOTIVATION: Pooling multiple samples increases the efficiency and lowers the cost of DNA sequencing. One approach to multiplexing is to use short DNA indices to uniquely identify each sample. After sequencing, reads must be assigned in silico to the sample of origin, a process referred to as demultiplexing. Demultiplexing software typically identifies the sample of origin using a fixed number of mismatches between the read index and a reference index set. This approach may fail or misassign reads when the sequencing quality of the indices is poor. RESULTS: We introduce deML, a maximum likelihood algorithm that demultiplexes Illumina sequences. deML computes the likelihood of an observed index sequence being derived from a specified sample. A quality score which reflects the probability of the assignment being correct is generated for each read. Using these quality scores, even very problematic datasets can be demultiplexed and an error threshold can be set. AVAILABILITY AND IMPLEMENTATION: deML is freely available for use under the GPL (http://bioinf.eva.mpg.de/deml/).
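The likelihood-based assignment described in this abstract can be illustrated with a minimal sketch. This is not deML's actual model or code: it assumes independent per-base errors derived from Phred scores, a uniform prior over samples, and hypothetical function names (`index_log_likelihood`, `assign_sample`).

```python
import math

def phred_to_error(q):
    # Convert a Phred score to an error probability, clamped so that
    # log() stays defined and a worthless base (e >= 0.75) is uninformative.
    return min(max(10 ** (-q / 10), 1e-10), 0.75)

def index_log_likelihood(observed, quals, reference):
    # Log-likelihood of the observed index read given a reference index,
    # assuming independent errors spread evenly over the three other bases.
    ll = 0.0
    for base, q, ref in zip(observed, quals, reference):
        e = phred_to_error(q)
        ll += math.log(1 - e) if base == ref else math.log(e / 3)
    return ll

def assign_sample(observed, quals, index_set):
    # Pick the most likely sample and report the posterior probability of
    # that assignment under a uniform prior (an assumption of this sketch).
    lls = {name: index_log_likelihood(observed, quals, idx)
           for name, idx in index_set.items()}
    best = max(lls, key=lls.get)
    top = lls[best]
    total = sum(math.exp(v - top) for v in lls.values())
    return best, 1.0 / total
```

A low-quality mismatch in the index costs little likelihood, so a read such as `ACGA` with a poor final base can still be assigned to `ACGT`; a caller could discard reads whose posterior falls below a chosen threshold.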


Subject(s)
Algorithms , Sequence Analysis, DNA/instrumentation , Sequence Analysis, DNA/methods , Software , Humans , Likelihood Functions
3.
Nucleic Acids Res ; 42(18): e141, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25100869

ABSTRACT

The sequencing of libraries containing molecules shorter than the read length, such as in ancient or forensic applications, may result in the production of reads that include the adaptor, and in paired reads that overlap one another. Challenges for the processing of such reads are the accurate identification of the adaptor sequence and accurate reconstruction of the original sequence most likely to have given rise to the observed read(s). We introduce an algorithm that removes the adaptors and reconstructs the original DNA sequences using a Bayesian maximum a posteriori probability approach. Our algorithm is faster, and provides a more accurate reconstruction of the original sequence for both simulated and ancient DNA data sets, than other approaches. leeHom is released under the GPLv3 and is freely available from: https://bioinf.eva.mpg.de/leehom/
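As a rough illustration of the maximum a posteriori idea for overlapping paired reads, the sketch below merges two base calls that cover the same position. It is not leeHom's algorithm: it assumes a uniform prior over the four bases, independent errors, and a hypothetical helper name `map_consensus`; adaptor detection is not covered.

```python
import math

BASES = "ACGT"

def base_likelihood(observed, q, true_base):
    # Probability of observing `observed` if the true base is `true_base`,
    # with the error probability taken from the Phred score.
    e = min(10 ** (-q / 10), 0.75)
    return 1 - e if observed == true_base else e / 3

def map_consensus(b1, q1, b2, q2):
    # Posterior weight of each candidate base given both observations.
    weights = {b: base_likelihood(b1, q1, b) * base_likelihood(b2, q2, b)
               for b in BASES}
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    # Error probability of the chosen base, floored so quality caps at 60.
    p_err = max(1 - weights[best] / total, 1e-6)
    return best, round(-10 * math.log10(p_err))
```

Two agreeing calls yield a consensus quality higher than either input, while a disagreement keeps the better-supported base at a reduced quality.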


Subject(s)
Algorithms , Sequence Analysis, DNA/methods , Bayes Theorem , Likelihood Functions , Software
4.
Proc Natl Acad Sci U S A ; 111(18): 6666-71, 2014 May 06.
Article in English | MEDLINE | ID: mdl-24753607

ABSTRACT

We present the DNA sequence of 17,367 protein-coding genes in two Neandertals from Spain and Croatia and analyze them together with the genome sequence recently determined from a Neandertal from southern Siberia. Comparisons with present-day humans from Africa, Europe, and Asia reveal that genetic diversity among Neandertals was remarkably low, and that they carried a higher proportion of amino acid-changing (nonsynonymous) alleles inferred to alter protein structure or function than present-day humans. Thus, Neandertals across Eurasia had a smaller long-term effective population than present-day humans. We also identify amino acid substitutions in Neandertals and present-day humans that may underlie phenotypic differences between the two groups. We find that genes involved in skeletal morphology have changed more in the lineage leading to Neandertals than in the ancestral lineage common to archaic and modern humans, whereas genes involved in behavior and pigmentation have changed more on the modern human lineage.


Subject(s)
Exome , Genetic Variation , Neanderthals/genetics , Amino Acid Substitution , Animals , Croatia , DNA/genetics , Gene Frequency , Humans , Paleontology , Phylogeny , Polymorphism, Single Nucleotide , Siberia , Spain
5.
Bioinformatics ; 29(9): 1208-9, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23471300

ABSTRACT

MOTIVATION: The conversion of the raw intensities obtained from next-generation sequencing platforms into nucleotide sequences with well-calibrated quality scores is a critical step in the generation of good sequence data. While recent model-based approaches can yield highly accurate calls, they require a substantial amount of processing time and/or computational resources. We previously introduced Ibis, a fast and accurate basecaller for the Illumina platform. We have continued active development of Ibis to take into account developments in the Illumina technology, as well as to make Ibis fully open source. RESULTS: We introduce here freeIbis, which offers significant improvements in sequence accuracy owing to the use of a novel multiclass support vector machine (SVM) algorithm. Sequence quality scores are now calibrated based on empirically observed scores, thus providing a high correlation to their respective error rates. These improvements result in downstream advantages including improved genotyping accuracy. AVAILABILITY AND IMPLEMENTATION: FreeIbis is freely available for use under the GPL (http://bioinf.eva.mpg.de/freeibis/). It requires a Python interpreter and a C++ compiler. Tailored versions of LIBOCAS and LIBLINEAR are distributed along with the package.
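The idea of calibrating quality scores against empirically observed error rates can be sketched as follows. This is only an assumed illustration, not freeIbis code: the function name `recalibrate`, the input format (pairs of raw score and whether the call matched a known control sequence), and the add-one smoothing are all choices made for this example.

```python
import math
from collections import defaultdict

def recalibrate(calls):
    # calls: iterable of (raw_score, is_correct) pairs, e.g. from base calls
    # compared against a control sequence (an assumption of this sketch).
    totals = defaultdict(int)
    errors = defaultdict(int)
    for raw, is_correct in calls:
        totals[raw] += 1
        if not is_correct:
            errors[raw] += 1
    table = {}
    for raw, n in totals.items():
        # Add-one smoothing avoids log10(0) when a bin has no errors.
        rate = (errors[raw] + 1) / (n + 2)
        table[raw] = round(-10 * math.log10(rate))
    return table
```

The returned table maps each raw score to the Phred score implied by its observed error rate, which is what makes the reported qualities track actual errors.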


Subject(s)
Sequence Analysis, DNA/standards , Software , Calibration , Genotyping Techniques/standards , Reproducibility of Results , Sequence Analysis, DNA/instrumentation , Support Vector Machine
6.
Proc Natl Acad Sci U S A ; 110(6): 2223-7, 2013 Feb 05.
Article in English | MEDLINE | ID: mdl-23341637

ABSTRACT

Hominins with morphology similar to present-day humans appear in the fossil record across Eurasia between 40,000 and 50,000 y ago. The genetic relationships between these early modern humans and present-day human populations have not been established. We have extracted DNA from a 40,000-y-old anatomically modern human from Tianyuan Cave outside Beijing, China. Using a highly scalable hybridization enrichment strategy, we determined the DNA sequences of the mitochondrial genome, the entire nonrepetitive portion of chromosome 21 (∼30 Mbp), and over 3,000 polymorphic sites across the nuclear genome of this individual. The nuclear DNA sequences determined from this early modern human reveal that the Tianyuan individual derived from a population that was ancestral to many present-day Asians and Native Americans but postdated the divergence of Asians from Europeans. They also show that this individual carried proportions of DNA variants derived from archaic humans similar to present-day people in mainland Asia.


Subject(s)
DNA, Mitochondrial/genetics , Hominidae/genetics , Animals , Asian People/genetics , Asian People/history , Base Sequence , China , Chromosomes, Human, Pair 21/genetics , DNA, Mitochondrial/history , DNA, Mitochondrial/isolation & purification , Fossils , Gene Library , Genetics, Population , History, Ancient , Humans , Molecular Sequence Data , Phylogeny , Phylogeography , Sequence Analysis, DNA , Sequence Homology, Nucleic Acid
7.
Science ; 338(6104): 222-6, 2012 Oct 12.
Article in English | MEDLINE | ID: mdl-22936568

ABSTRACT

We present a DNA library preparation method that has allowed us to reconstruct a high-coverage (30×) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of "missing evolution" in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans.


Subject(s)
Genetic Variation , Genome, Human/genetics , Heterozygote , Neanderthals/genetics , Alleles , Animals , Base Sequence , Fossils , Gene Flow , Gene Library , Humans , Molecular Sequence Data , Sequence Analysis, DNA
8.
Nature ; 468(7327): 1053-60, 2010 Dec 23.
Article in English | MEDLINE | ID: mdl-21179161

ABSTRACT

Using DNA extracted from a finger bone found in Denisova Cave in southern Siberia, we have sequenced the genome of an archaic hominin to about 1.9-fold coverage. This individual is from a group that shares a common origin with Neanderthals. This population was not involved in the putative gene flow from Neanderthals into Eurasians; however, the data suggest that it contributed 4-6% of its genetic material to the genomes of present-day Melanesians. We designate this hominin population 'Denisovans' and suggest that it may have been widespread in Asia during the Late Pleistocene epoch. A tooth found in Denisova Cave carries a mitochondrial genome highly similar to that of the finger bone. This tooth shares no derived morphological features with Neanderthals or modern humans, further indicating that Denisovans have an evolutionary history distinct from Neanderthals and modern humans.


Subject(s)
Fossils , Gene Flow , Genome/genetics , Hominidae/classification , Hominidae/genetics , Animals , Asia , DNA, Mitochondrial/genetics , Europe , Finger Phalanges/chemistry , Humans , Melanesia , Molecular Sequence Data , Phylogeny , Siberia , Tooth/anatomy & histology , Tooth/chemistry
9.
Nucleic Acids Res ; 38(16): e161, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20587499

ABSTRACT

Although the last few years have seen great progress in DNA sequence retrieval from fossil specimens, some of the characteristics of ancient DNA remain poorly understood. This is particularly true for blocking lesions, i.e. chemical alterations that cannot be bypassed by DNA polymerases and thus prevent amplification and subsequent sequencing of affected molecules. Some studies have concluded that the vast majority of ancient DNA molecules carry blocking lesions, suggesting that the removal, repair or bypass of blocking lesions might dramatically increase both the time depth and geographical range of specimens available for ancient DNA analysis. However, previous studies used very indirect detection methods that did not provide conclusive estimates on the frequency of blocking lesions in endogenous ancient DNA. We developed a new method, polymerase extension profiling (PEP), that directly reveals occurrences of polymerase stalling on DNA templates. By sequencing thousands of single primer extension products using PEP methodology, we have for the first time directly identified blocking lesions in ancient DNA on a single molecule level. Although we found clear evidence for blocking lesions in three out of four ancient samples, no more than 40% of the molecules were affected in any of the samples, indicating that such modifications are far less frequent in ancient DNA than previously thought.


Subject(s)
DNA Damage , DNA-Directed DNA Polymerase , Sequence Analysis, DNA/methods , Fossils , Genomics , Polymerase Chain Reaction , Templates, Genetic
10.
Science ; 328(5979): 710-722, 2010 May 07.
Article in English | MEDLINE | ID: mdl-20448178

ABSTRACT

Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other.


Subject(s)
Fossils , Genome, Human , Genome , Hominidae/genetics , Sequence Analysis, DNA , Animals , Asian People/genetics , Base Sequence , Black People/genetics , Bone and Bones , DNA, Mitochondrial/genetics , Evolution, Molecular , Extinction, Biological , Female , Gene Dosage , Gene Flow , Genetic Variation , Haplotypes , Humans , Pan troglodytes/genetics , Polymorphism, Single Nucleotide , Selection, Genetic , Sequence Alignment , Time , White People/genetics
11.
Genome Biol ; 11(5): R47, 2010.
Article in English | MEDLINE | ID: mdl-20441577

ABSTRACT

High-throughput sequencing technologies have opened up a new avenue for studying extinct organisms. Here we identify and quantify biases introduced by particular characteristics of ancient DNA samples. These analyses demonstrate the importance of closely related genomic sequence for correctly identifying and classifying bona fide endogenous DNA fragments. We show that more accurate genome divergence estimates from ancient DNA sequence can be attained using at least two outgroup genomes and appropriate filtering.


Subject(s)
Computational Biology/methods , DNA/analysis , DNA/genetics , Extinction, Biological , Sequence Analysis, DNA/methods , Animals , Base Sequence , Computer Simulation , Databases, Nucleic Acid , Genetic Variation , Humans , Sequence Alignment
12.
Nucleic Acids Res ; 38(6): e87, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20028723

ABSTRACT

DNA sequences determined from ancient organisms have high error rates, primarily due to uracil bases created by cytosine deamination. We use synthetic oligonucleotides, as well as DNA extracted from mammoth and Neandertal remains, to show that treatment with uracil-DNA-glycosylase and endonuclease VIII removes uracil residues from ancient DNA and repairs most of the resulting abasic sites, leaving undamaged parts of the DNA fragments intact. Neandertal DNA sequences determined with this protocol have greatly increased accuracy. In addition, our results demonstrate that Neandertal DNA retains in vivo patterns of CpG methylation, potentially allowing future studies of gene inactivation and imprinting in ancient organisms.


Subject(s)
DNA Methylation , Sequence Analysis, DNA/methods , Animals , CpG Islands , Cytosine/chemistry , DNA Repair , Deamination , Deoxyribonuclease (Pyrimidine Dimer) , Hominidae/genetics , Humans , Mammoths/genetics , Oligonucleotides/chemistry , Uracil-DNA Glycosidase
13.
J Vis Exp ; (31): 1573, 2009 Sep 03.
Article in English | MEDLINE | ID: mdl-19730410

ABSTRACT

We present a method of targeted DNA sequence retrieval from DNA sources which are heavily degraded and contaminated with microbial DNA, as is typical of ancient bones. The method greatly reduces sample destruction and sequencing demands relative to direct PCR or shotgun sequencing approaches. We used this method to reconstruct the complete mitochondrial DNA (mtDNA) genomes of five Neandertals from across their geographic range. The mtDNA genetic diversity of the late Neandertals was approximately three times lower than that of contemporary modern humans. Together with analyses of mtDNA protein evolution, these data suggest that the long-term effective population size of Neandertals was smaller than that of modern humans and extant great apes.


Subject(s)
DNA Primers/genetics , DNA, Mitochondrial/analysis , Hominidae/genetics , Animals , Base Sequence , DNA Primers/chemistry , DNA, Mitochondrial/genetics , DNA, Mitochondrial/isolation & purification , Fossils , Humans , Molecular Sequence Data
14.
Genome Biol ; 10(8): R83, 2009.
Article in English | MEDLINE | ID: mdl-19682367

ABSTRACT

The Illumina Genome Analyzer generates millions of short sequencing reads. We present Ibis (Improved base identification system), an accurate, fast and easy-to-use base caller that significantly reduces the error rate and increases the output of usable reads. Ibis is faster and more robust with respect to chemistry and technology than other publicly available packages. Ibis is freely available under the GPL from http://bioinf.eva.mpg.de/Ibis/.


Subject(s)
Sequence Analysis, DNA/methods , Software , Algorithms , Artificial Intelligence , Genome , Humans
15.
Science ; 325(5938): 318-21, 2009 Jul 17.
Article in English | MEDLINE | ID: mdl-19608918

ABSTRACT

Analysis of Neandertal DNA holds great potential for investigating the population history of this group of hominins, but progress has been limited due to the rarity of samples and damaged state of the DNA. We present a method of targeted ancient DNA sequence retrieval that greatly reduces sample destruction and sequencing demands and use this method to reconstruct the complete mitochondrial DNA (mtDNA) genomes of five Neandertals from across their geographic range. We find that mtDNA genetic diversity in Neandertals that lived 38,000 to 70,000 years ago was approximately one-third of that in contemporary modern humans. Together with analyses of mtDNA protein evolution, these data suggest that the long-term effective population size of Neandertals was smaller than that of modern humans and extant great apes.


Subject(s)
DNA, Mitochondrial/genetics , Fossils , Genome, Mitochondrial , Hominidae/genetics , Sequence Analysis, DNA , Animals , Bayes Theorem , DNA Primers , DNA, Mitochondrial/analysis , DNA, Mitochondrial/isolation & purification , Evolution, Molecular , Female , Gene Library , Genetic Variation , Genome, Human , Geography , Humans , Male , Molecular Sequence Data , Phylogeny , Population Density
16.
Genome Res ; 19(10): 1843-8, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19635845

ABSTRACT

Although the emergence of high-throughput sequencing technologies has enabled whole-genome sequencing from extinct organisms, little progress has been made in accelerating targeted sequencing from highly degraded DNA. Here, we present a novel and highly sensitive method for targeted sequencing of ancient and degraded DNA, which couples multiplex PCR directly with sample barcoding and high-throughput sequencing. Using this approach, we obtained a 96% complete mitochondrial genome data set from 31 cave bear (Ursus spelaeus) samples using only two 454 Life Sciences (Roche) GS FLX runs. In contrast to previous studies relying only on short sequence fragments, the overlapping portion of our data comprises almost 10 kb of replicated mitochondrial genome sequence, allowing for the unambiguous differentiation of three major cave bear clades. Our method opens up the opportunity to simultaneously generate many kilobases of overlapping sequence data from large sets of difficult samples, such as museum specimens, medical collections, or forensic samples. Embedded in our approach, we present a new protocol for the construction of barcoded sequencing libraries, which is compatible with all current high-throughput technologies and can be performed entirely in plate setup.


Subject(s)
DNA Degradation, Necrotic , DNA, Mitochondrial/analysis , Fossils , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Animals , DNA Primers/genetics , Electronic Data Processing/instrumentation , Electronic Data Processing/methods , Genome, Mitochondrial , Mammoths/genetics , Models, Biological , Molecular Sequence Data , Phylogeny , Sequence Alignment/instrumentation , Sequence Alignment/methods , Sequence Analysis, DNA/instrumentation , Ursidae/genetics
17.
Cell ; 134(3): 416-26, 2008 Aug 08.
Article in English | MEDLINE | ID: mdl-18692465

ABSTRACT

A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000 year-old Neandertal individual with 8341 mtDNA sequences identified among 4.8 Gb of DNA generated from approximately 0.3 g of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs, and allows an estimate of the divergence date between the two mtDNA lineages of 660,000 +/- 140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared with other primate lineages, suggesting that the effective population size of Neandertals was small.


Subject(s)
Evolution, Molecular , Fossils , Hominidae/genetics , Sequence Analysis, DNA/methods , Animals , Base Sequence , Bone and Bones/metabolism , Croatia , Cyclooxygenase 2/chemistry , DNA, Mitochondrial/genetics , Genome, Mitochondrial , Humans , Models, Molecular , Molecular Sequence Data
18.
Bioinformatics ; 24(13): 1530-1, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18467344

ABSTRACT

UNLABELLED: We present a tool suited for searching for many short nucleotide sequences in large databases, allowing for a predefined number of gaps and mismatches. The command-line-driven program implements a non-deterministic automata matching algorithm on a keyword tree of the search strings. Both queries with and without ambiguity codes can be searched. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed. AVAILABILITY: The C++ source code for PatMaN is distributed under the GNU General Public License and has been tested on the GNU/Linux operating system. It is available from http://bioinf.eva.mpg.de/patman. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
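A keyword tree searched with a mismatch budget can be sketched in a few lines. This is a simplified illustration, not PatMaN's implementation: it handles mismatches only (no gaps or ambiguity codes), and the names `build_trie` and `search` are hypothetical.

```python
def build_trie(patterns):
    # Keyword tree: nested dicts keyed by base; "$" marks the end of a pattern.
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node["$"] = pattern
    return root

def search(text, patterns, max_mismatches):
    root = build_trie(patterns)
    hits = set()

    def walk(node, start, pos, used):
        if "$" in node:
            hits.add((start, node["$"], used))
        if pos == len(text):
            return
        for ch, child in node.items():
            if ch == "$":
                continue
            cost = 0 if ch == text[pos] else 1
            # Follow every branch that still fits the mismatch budget.
            if used + cost <= max_mismatches:
                walk(child, start, pos + 1, used + cost)

    for start in range(len(text)):
        walk(root, start, 0 + start, 0)
    return sorted(hits)
```

With a budget of zero only one branch survives at each step; every extra allowed mismatch multiplies the branches explored, which is consistent with the abstract's remark that retrieval time rises steeply with the number of edits.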


Subject(s)
Algorithms , Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Base Sequence , Molecular Sequence Data
19.
Nat Protoc ; 3(2): 267-78, 2008.
Article in English | MEDLINE | ID: mdl-18274529

ABSTRACT

Parallel tagged sequencing (PTS) is a molecular barcoding method designed to adapt the recently developed high-throughput 454 parallel sequencing technology for use with multiple samples. Unlike other barcoding methods, PTS can be applied to any type of double-stranded DNA (dsDNA) sample, including shotgun DNA libraries and pools of PCR products, and requires no amplification or gel purification steps. The method relies on attaching sample-specific barcoding adapters, which include sequence tags and a restriction site, to blunt-end repaired DNA samples by ligation and strand-displacement. After pooling multiple barcoded samples, molecules without sequence tags are effectively excluded from sequencing by dephosphorylation and restriction digestion, and using the tag sequences, the source of each DNA sequence can be traced. This protocol allows for sequencing 300 or more complete mitochondrial genomes on a single 454 GS FLX run, or twenty-five 6-kb plasmid sequences on only one 16th plate region. Most of the reactions can be performed in a multichannel setup on 96-well reaction plates, allowing for processing up to several hundreds of samples in a few days.
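Tracing each read back to its sample from the tag sequence can be sketched as a simple prefix lookup. The tag sequences, the function name `split_by_tag`, and the exact-match rule are assumptions made for illustration; they are not taken from the PTS protocol.

```python
def split_by_tag(reads, tags):
    # tags: dict mapping tag sequence -> sample name; all tags are assumed
    # to have the same length and to sit at the start of each read.
    lengths = {len(tag) for tag in tags}
    if len(lengths) != 1:
        raise ValueError("this sketch expects tags of equal length")
    tag_len = lengths.pop()
    by_sample = {}
    for read in reads:
        sample = tags.get(read[:tag_len])
        if sample is None:
            # Reads without a recognisable tag are kept whole for inspection.
            by_sample.setdefault("unassigned", []).append(read)
        else:
            # Strip the tag so only the insert sequence is kept.
            by_sample.setdefault(sample, []).append(read[tag_len:])
    return by_sample
```

For example, with hypothetical tags `ACGT` and `TGCA`, a read beginning `ACGT` is filed under the first sample with the tag removed.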


Subject(s)
Sequence Analysis, DNA/methods , Sequence Tagged Sites , DNA, Mitochondrial/chemistry , Genome, Mitochondrial/genetics , Humans , Plasmids/chemistry
20.
Nucleic Acids Res ; 35(15): e97, 2007.
Article in English | MEDLINE | ID: mdl-17670798

ABSTRACT

High-throughput 454 DNA sequencing technology allows much faster and more cost-effective sequencing than traditional Sanger sequencing. However, the technology imposes inherent limitations on the number of samples that can be processed in parallel. Here we introduce parallel tagged sequencing (PTS), a simple, inexpensive and flexible barcoding technique that can be used for parallel sequencing any number and type of double-stranded nucleic acid samples. We demonstrate that PTS is particularly powerful for sequencing contiguous DNA fragments such as mtDNA genomes: in theory as many as 250 mammalian mtDNA genomes can be sequenced in a single GS FLX run. PTS dramatically increases the sequencing throughput of samples in parallel and thus fully mobilizes the resources of the 454 technology for targeted sequencing.


Subject(s)
Sequence Analysis, DNA/methods , Sequence Tagged Sites , DNA, Mitochondrial/chemistry , Gene Library , Genome, Human , Humans , Reproducibility of Results