Results 1 - 20 of 48
1.
Hum Mutat ; 43(12): 1979-1993, 2022 12.
Article in English | MEDLINE | ID: mdl-36054329

ABSTRACT

Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing unit (GPU)-based workflow. We applied our workflow to whole-genome sequencing data from three parent-child sequenced cohorts including the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G) that were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at ClinVar pathogenic or likely-pathogenic sites and a significant excess of protein-coding DNVs in IGLL5, a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.


Subject(s)
Neoplasms , Humans , Alleles , Mutation , Neoplasms/genetics , Whole Genome Sequencing
2.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35945154

ABSTRACT

As recently demonstrated by the COVID-19 pandemic, large-scale pathogen genomic data are crucial to characterize transmission patterns of human infectious diseases. Yet, current methods to process raw sequence data into analysis-ready variants remain slow to scale, hampering rapid surveillance efforts and epidemiological investigations for disease control. Here, we introduce an accelerated, scalable, reproducible, and cost-effective framework for pathogen genomic variant identification and present an evaluation of its performance and accuracy across benchmark datasets of Plasmodium falciparum malaria genomes. We demonstrate superior performance of the GPU framework relative to standard pipelines with mean execution time and computational costs reduced by 27× and 4.6×, respectively, while delivering 99.9% accuracy at enhanced reproducibility.


Subject(s)
COVID-19 , Communicable Diseases , Malaria , COVID-19/epidemiology , COVID-19/genetics , Genomics/methods , Humans , Pandemics , Reproducibility of Results
3.
Article in English | MEDLINE | ID: mdl-30301868

ABSTRACT

Endometrial cancer is the most common gynecologic malignancy in industrialized countries, and both its incidence and its associated mortality are increasing. The "liquid biopsy" is becoming an important transformative precision oncology tool, but barriers intrinsic to blood sampling have limited its use in early cancer detection. We hypothesized that using a more targeted sample for analysis, namely a uterine lavage, should provide a more sensitive and specific diagnostic test for endometrial cancer. Using a custom 12-gene endometrial cancer panel, molecular analysis of uterine lavage fluid from an asymptomatic 67-yr-old female without histopathologic evidence of premalignant lesions or cancer in her uterine tissue revealed two oncogenic PTEN mutations. Ten months later, the patient returned with postmenopausal bleeding and a single microscopic focus of endometrial cancer. DNA isolated and sequenced from laser-capture microdissected tumor tissue revealed the same two PTEN mutations. These mutations were unlikely to occur by chance alone (P < 3 × 10⁻⁷). This illustrative case provides the first demonstration that future, tumor-specific mutations can be identified in an asymptomatic individual without clinical or pathologic evidence of cancer by using already established sequencing technologies but targeted sampling methods. This finding provides the basis for new opportunities in early cancer screening, detection, and prevention.


Subject(s)
Endometrial Neoplasms/diagnosis , Endometrial Neoplasms/genetics , Genital Neoplasms, Female/diagnosis , Aged , Biopsy , Endometrium/metabolism , Female , Humans , Liquid Biopsy/methods , Mutation , PTEN Phosphohydrolase/genetics , Postmenopause , Precision Medicine , Therapeutic Irrigation/methods , Uterine Hemorrhage , Uterus/cytology
4.
PLoS One ; 13(4): e0195272, 2018.
Article in English | MEDLINE | ID: mdl-29630678

ABSTRACT

The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise, but significant sensitivity and specificity improvements are still needed to detect mutations in a majority of patients before the metastatic stage. To address this, we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework to utilize technical replicates in conjunction with background error modeling, providing a 10 to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents.


Subject(s)
High-Throughput Nucleotide Sequencing/statistics & numerical data , Sequence Analysis, DNA/statistics & numerical data , Cell Line , Computational Biology , DNA Barcoding, Taxonomic/statistics & numerical data , Gene Frequency , Gene Library , Genetic Variation , Humans , INDEL Mutation
5.
Int J Gynecol Cancer ; 28(3): 479-485, 2018 03.
Article in English | MEDLINE | ID: mdl-29324546

ABSTRACT

OBJECTIVES: The objectives of this study were to assess if targeted investigation for tumor-specific mutations by ultradeep DNA sequencing of peritoneal washes of ovarian cancer patients after primary surgical debulking and chemotherapy, and clinically diagnosed as disease free, provides a more sensitive and specific method to assess actual treatment response and tailor future therapy and to compare this "molecular second look" with conventional cytology and histopathology-based findings. METHODS/MATERIALS: We identified 10 patients with advanced-stage, high-grade serous ovarian cancer who had undergone second-look laparoscopy and for whom DNA could be isolated from biobanked paired blood, primary and recurrent tumor, and second-look peritoneal washes. A targeted 56-gene cancer-relevant panel was used for next-generation sequencing (average coverage, >6500×). Mutations were validated using either digital droplet polymerase chain reaction (ddPCR) or Sanger sequencing. RESULTS: A total of 25 tumor-specific mutations were identified (median, 2/patient; range, 1-8). TP53 mutations were identified in at least 1 sample from all patients. All 5 pathology-based second-look positive patients were confirmed positive by molecular second look. Genetic analysis revealed that 3 of the 5 pathology-based negative second looks were actually positive. In 2 of these patients, the second-look mutations were present in either the original primary or recurrent tumors. In the third, 2 high-frequency, novel frameshift mutations in MSH6 and HNF1A were identified. CONCLUSIONS: The molecular second look detects tumor-specific evidence of residual disease and provides genetic insight into tumor evolution and future recurrences beyond standard pathology. In the precision medicine era, detecting and genetically characterizing residual disease after standard treatment will be invaluable for improving patient outcomes.


Subject(s)
Cystadenocarcinoma, Serous/genetics , Ovarian Neoplasms/genetics , Aged , Alleles , Cystadenocarcinoma, Serous/pathology , DNA Mutational Analysis , DNA, Neoplasm/genetics , DNA, Neoplasm/isolation & purification , Female , High-Throughput Nucleotide Sequencing , Humans , Middle Aged , Mutation , Ovarian Neoplasms/pathology , Precision Medicine/methods , Proof of Concept Study
6.
Science ; 357(6351): 600-604, 2017 08 11.
Article in English | MEDLINE | ID: mdl-28798132

ABSTRACT

The mammalian brain contains diverse neuronal types, yet we lack single-cell epigenomic assays that are able to identify and characterize them. DNA methylation is a stable epigenetic mark that distinguishes cell types and marks regulatory elements. We generated >6000 methylomes from single neuronal nuclei and used them to identify 16 mouse and 21 human neuronal subpopulations in the frontal cortex. CG and non-CG methylation exhibited cell type-specific distributions, and we identified regulatory elements with differential methylation across neuron types. Methylation signatures identified a layer 6 excitatory neuron subtype and a unique human parvalbumin-expressing inhibitory neuron subtype. We observed stronger cross-species conservation of regulatory elements in inhibitory neurons than in excitatory neurons. Single-nucleus methylomes expand the atlas of brain cell types and identify regulatory elements that drive conserved brain cell diversity.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Frontal Lobe/metabolism , Neurons/metabolism , Regulatory Sequences, Nucleic Acid , 5-Methylcytosine/chemistry , Adult , Animals , Base Sequence , Cell Nucleus/metabolism , Conserved Sequence , Cytosine/chemistry , Frontal Lobe/cytology , Humans , Male , Mice , Mice, Inbred C57BL , Sequence Analysis, DNA , Single-Cell Analysis
7.
PLoS Med ; 13(12): e1002206, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28027320

ABSTRACT

BACKGROUND: Endometrial cancer is the most common gynecologic malignancy, and its incidence and associated mortality are increasing. Despite the immediate need to detect these cancers at an earlier stage, there is no effective screening methodology or protocol for endometrial cancer. The comprehensive, genomics-based analysis of endometrial cancer by The Cancer Genome Atlas (TCGA) revealed many of the molecular defects that define this cancer. Based on these cancer genome results, and in a prospective study, we hypothesized that the use of ultra-deep, targeted gene sequencing could detect somatic mutations in uterine lavage fluid obtained from women undergoing hysteroscopy as a means of molecular screening and diagnosis. METHODS AND FINDINGS: Uterine lavage and paired blood samples were collected and analyzed from 107 consecutive patients who were undergoing hysteroscopy and curettage for diagnostic evaluation from this single-institution study. The lavage fluid was separated into cellular and acellular fractions by centrifugation. Cellular and cell-free DNA (cfDNA) were isolated from each lavage. Two targeted next-generation sequencing (NGS) gene panels, one composed of 56 genes and the other of 12 genes, were used for ultra-deep sequencing. To rule out potential NGS-based errors, orthogonal mutation validation was performed using digital PCR and Sanger sequencing. Seven patients were diagnosed with endometrial cancer based on classic histopathologic analysis. Six of these patients had stage IA cancer, and one of these cancers was only detectable as a microscopic focus within a polyp. All seven patients were found to have significant cancer-associated gene mutations in both cell pellet and cfDNA fractions. In the four patients in whom adequate tumor sample was available, all tumor mutations above a specific allele fraction were present in the uterine lavage DNA samples. 
Mutations originally only detected in lavage fluid fractions were later confirmed to be present in tumor but at allele fractions significantly less than 1%. Of the remaining 95 patients diagnosed with benign or non-cancer pathology, 44 had no significant cancer mutations detected. Intriguingly, 51 patients without histopathologic evidence of cancer had relatively high allele fraction (1.0%-30.4%), cancer-associated mutations. Participants with detected driver and potential driver mutations were significantly older (mean age mutated = 57.96, 95% confidence interval [CI]: 3.30-∞, mean age no mutations = 50.35; p-value = 0.002; Benjamini-Hochberg [BH] adjusted p-value = 0.015) and more likely to be post-menopausal (p-value = 0.004; BH-adjusted p-value = 0.015) than those without these mutations. No associations were detected between mutation status and race/ethnicity, body mass index, diabetes, parity, and smoking status. Long-term follow-up was not presently available in this prospective study for those women without histopathologic evidence of cancer. CONCLUSIONS: Using ultra-deep NGS, we identified somatic mutations in DNA extracted both from cell pellets and a never previously reported cfDNA fraction from the uterine lavage. Using our targeted sequencing approach, endometrial driver mutations were identified in all seven women who received a cancer diagnosis based on classic histopathology of tissue curettage obtained at the time of hysteroscopy. In addition, relatively high allele fraction driver mutations were identified in the lavage fluid of approximately half of the women without a cancer diagnosis. Increasing age and post-menopausal status were associated with the presence of these cancer-associated mutations, suggesting the prevalent existence of a premalignant landscape in women without clinical evidence of cancer. 
Given that a uterine lavage can be easily and quickly performed even outside of the operating room and in a physician's office-based setting, our findings suggest the future possibility of this approach for screening women for the earliest stages of endometrial cancer. However, our findings suggest that further insight into development of cancer or its interruption is needed before translation to the clinic.
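
The Benjamini-Hochberg (BH) adjustment reported in the abstract above can be illustrated with a minimal sketch. The function name and the input p-values are hypothetical and chosen only for illustration. They are not the study's data, and the abstract does not say which software the authors used.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input.

    Each raw p-value is scaled by m / rank (rank 1 = smallest p-value),
    then a running minimum taken from the largest p-value downward keeps
    the adjusted values monotone. Results are capped at 1.0.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from four independent tests.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

Because of the running minimum, two different raw p-values can share one adjusted value, which is consistent with the abstract reporting the same BH-adjusted p-value for two distinct raw p-values.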


Subject(s)
DNA, Neoplasm , Endometrial Neoplasms/genetics , Genome , Mutation , Uterus/metabolism , Adult , Aged , Aged, 80 and over , Cross-Sectional Studies , Endometrial Neoplasms/pathology , Female , Humans , Middle Aged , Prospective Studies , Therapeutic Irrigation
9.
Genome Biol Evol ; 8(7): 2145-54, 2016 08 16.
Article in English | MEDLINE | ID: mdl-27324916

ABSTRACT

The mangrove rivulus (Kryptolebias marmoratus) is one of two preferentially self-fertilizing hermaphroditic vertebrates. This mode of reproduction makes mangrove rivulus an important model for evolutionary and biomedical studies because long periods of self-fertilization result in naturally homozygous genotypes that can produce isogenic lineages without significant limitations associated with inbreeding depression. Over 400 isogenic lineages currently held in laboratories across the globe show considerable among-lineage variation in physiology, behavior, and life history traits that is maintained under common garden conditions. Temperature mediates the development of primary males and also sex change between hermaphrodites and secondary males, which makes the system ideal for the study of sex determination and sexual plasticity. Mangrove rivulus also exhibit remarkable adaptations to living in extreme environments, and the system has great promise to shed light on the evolution of terrestrial locomotion, aerial respiration, and broad tolerances to hypoxia, salinity, temperature, and environmental pollutants. Genome assembly of the mangrove rivulus allows the study of genes and gene families associated with the traits described above. Here we present a de novo assembled reference genome for the mangrove rivulus, with an approximately 900 Mb genome, including 27,328 annotated, predicted, protein-coding genes. Moreover, we are able to place more than 50% of the assembled genome onto a recently published linkage map. The genome provides an important addition to the linkage map and transcriptomic tools recently developed for this species that together provide critical resources for epigenetic, transcriptomic, and proteomic analyses. Moreover, the genome will serve as the foundation for addressing key questions in behavior, physiology, toxicology, and evolutionary biology.


Subject(s)
Adaptation, Physiological , Cyprinodontiformes/genetics , Extreme Environments , Genetic Variation , Genome , Phenotype , Animals , Female , Male , Molecular Sequence Annotation
10.
PLoS Genet ; 12(3): e1005851, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26943675

ABSTRACT

Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers.


Subject(s)
Genetics, Population , Genomics , Lipid Metabolism/genetics , Selection, Genetic , Animals , Demography , Dogs , Genome , Polymorphism, Single Nucleotide
11.
Scand J Gastroenterol ; 50(9): 1076-87, 2015.
Article in English | MEDLINE | ID: mdl-25865706

ABSTRACT

OBJECTIVE: Breath testing and duodenal culture studies suggest that a significant proportion of irritable bowel syndrome (IBS) patients have small intestinal bacterial overgrowth. In this study, we extended these data through 16S rDNA amplicon sequencing and quantitative PCR (qPCR) analyses of duodenal aspirates from a large cohort of IBS, non-IBS and control subjects. MATERIALS AND METHODS: Consecutive subjects presenting for esophagogastroduodenoscopy only and healthy controls were recruited. Exclusion criteria included recent antibiotic or probiotic use. Following extensive medical work-up, patients were evaluated for symptoms of IBS. DNAs were isolated from duodenal aspirates obtained during endoscopy. Microbial populations in a subset of IBS subjects and controls were compared by 16S profiling. Duodenal microbes were then quantitated in the entire cohort by qPCR and the results compared with quantitative live culture data. RESULTS: A total of 258 subjects were recruited (21 healthy, 163 non-healthy non-IBS, and 74 IBS). 16S profiling in five IBS and five control subjects revealed significantly lower microbial diversity in the duodenum in IBS, with significant alterations in 12 genera (false discovery rate < 0.15), including overrepresentation of Escherichia/Shigella (p = 0.005) and Aeromonas (p = 0.051) and underrepresentation of Acinetobacter (p = 0.024), Citrobacter (p = 0.031) and Microvirgula (p = 0.036). qPCR in all 258 subjects confirmed greater levels of Escherichia coli in IBS and also revealed increases in Klebsiella spp, which correlated strongly with quantitative culture data. CONCLUSIONS: 16S rDNA sequencing confirms microbial overgrowth in the small bowel in IBS, with a concomitant reduction in diversity. qPCR supports alterations in specific microbial populations in IBS.


Subject(s)
DNA, Bacterial/analysis , DNA, Bacterial/isolation & purification , Duodenum/microbiology , Feces/microbiology , Gastrointestinal Microbiome/genetics , Irritable Bowel Syndrome/microbiology , Adult , Aged , Aged, 80 and over , Case-Control Studies , Endoscopy, Gastrointestinal , Female , Humans , Male , Middle Aged , Prospective Studies , Real-Time Polymerase Chain Reaction
12.
Science ; 346(6206): 251-6, 2014 Oct 10.
Article in English | MEDLINE | ID: mdl-25301630

ABSTRACT

Spatial and temporal dissection of the genomic changes occurring during the evolution of human non-small cell lung cancer (NSCLC) may help elucidate the basis for its dismal prognosis. We sequenced 25 spatially distinct regions from seven operable NSCLCs and found evidence of branched evolution, with driver mutations arising before and after subclonal diversification. There was pronounced intratumor heterogeneity in copy number alterations, translocations, and mutations associated with APOBEC cytidine deaminase activity. Despite maintained carcinogen exposure, tumors from smokers showed a relative decrease in smoking-related mutations over time, accompanied by an increase in APOBEC-associated mutations. In tumors from former smokers, genome-doubling occurred within a smoking-signature context before subclonal diversification, which suggested that a long period of tumor latency had preceded clinical detection. The regionally separated driver mutations, coupled with the relentless and heterogeneous nature of the genome instability processes, are likely to confound treatment success in NSCLC.


Subject(s)
Carcinoma, Non-Small-Cell Lung/diagnosis , Carcinoma, Non-Small-Cell Lung/genetics , Genetic Heterogeneity , Genomic Instability , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , APOBEC-1 Deaminase , Carcinogens/toxicity , Carcinoma, Non-Small-Cell Lung/chemically induced , Cytidine Deaminase/genetics , Evolution, Molecular , Gene Dosage , Humans , Lung Neoplasms/chemically induced , Mutation , Neoplasm Recurrence, Local/genetics , Prognosis , Smoking/adverse effects , Translocation, Genetic , Tumor Cells, Cultured
13.
Appl Environ Microbiol ; 80(24): 7583-91, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25261520

ABSTRACT

High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common "benchtop" sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone.


Subject(s)
Bacteria/isolation & purification , Bacterial Infections/microbiology , DNA, Bacterial/genetics , High-Throughput Nucleotide Sequencing/methods , RNA, Ribosomal, 16S/genetics , Bacteria/classification , Bacteria/genetics , High-Throughput Nucleotide Sequencing/instrumentation , Humans
14.
PeerJ ; 2: e520, 2014.
Article in English | MEDLINE | ID: mdl-25177534

ABSTRACT

Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned to the laboratory. During the 2013 Southern Line Islands Research Expedition, we started the first effort to bring next-generation sequencing to some of the most remote locations on our planet. We successfully sequenced twenty-six marine microbial genomes and two marine microbial metagenomes using the Ion Torrent PGM platform on the Merchant Yacht Hanse Explorer. Onboard sequence assembly, annotation, and analysis enabled us to investigate the role of the microbes in the coral reef ecology of these islands and atolls. This analysis identified phosphonate as an important phosphorus source for microbes growing in the Line Islands and reinforced the importance of L-serine in marine microbial ecosystems. Sequencing in the field allowed us to propose hypotheses and conduct experiments and further sampling based on the sequences generated. By eliminating the delay between sampling and sequencing, we enhanced the productivity of the research expedition. By overcoming the hurdles associated with sequencing on a boat in the middle of the Pacific Ocean, we proved the flexibility of the sequencing, annotation, and analysis pipelines.

15.
BMC Genomics ; 15: 654, 2014 Aug 05.
Article in English | MEDLINE | ID: mdl-25096633

ABSTRACT

BACKGROUND: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions: space, time, and habitat. RESULTS: Taking an innovative approach of genome-wide association applicable to microbial genomes (GWAS-M), we classify 274 complete V. cholerae genomes by niche, including 39 newly sequenced for this study with the Ion Torrent DNA-sequencing platform. Niche metadata were collected for each strain and analyzed together with comprehensive annotations of genetic and genomic attributes, including point mutations (single-nucleotide polymorphisms, SNPs), protein families, functions and prophages. CONCLUSIONS: Our analysis revealed that genomic variations, in particular mobile functions including phages, prophages, transposable elements, and plasmids, underlie the metadata structuring in each of the three niche dimensions. This underscores the role of phages and mobile elements as the most rapidly evolving elements in bacterial genomes, creating local endemicity (space), leading to temporal divergence (time), and allowing the invasion of new habitats. Together, we take a data-driven approach for comparative functional genomics that exploits high-volume genome sequencing and annotation, in conjunction with novel statistical and machine learning analyses to identify connections between genotype and phenotype on a genome-wide scale.


Subject(s)
Genome, Bacterial , Vibrio cholerae/genetics , Cholera/epidemiology , Cholera/microbiology , DNA Transposable Elements , Environmental Microbiology , Evolution, Molecular , Genetic Variation , Genotype , Humans , Molecular Sequence Annotation , Phylogeny , Phylogeography , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Vibrio cholerae/isolation & purification
16.
PLoS Genet ; 10(5): e1004353, 2014 May.
Article in English | MEDLINE | ID: mdl-24809476

ABSTRACT

Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.


Subject(s)
Fossils , Genetics, Population , Genome, Human , Europe , Female , Humans , Polymorphism, Single Nucleotide
17.
Genome Res ; 24(5): 733-42, 2014 May.
Article in English | MEDLINE | ID: mdl-24760347

ABSTRACT

The somatic mutation burden in healthy white blood cells (WBCs) is not well known. Based on deep whole-genome sequencing, we estimate that approximately 450 somatic mutations accumulated in the nonrepetitive genome within the healthy blood compartment of a 115-yr-old woman. The detected mutations appear to have been harmless passenger mutations: They were enriched in noncoding, AT-rich regions that are not evolutionarily conserved, and they were depleted for genomic elements where mutations might have favorable or adverse effects on cellular fitness, such as regions with actively transcribed genes. The distribution of variant allele frequencies of these mutations suggests that the majority of the peripheral white blood cells were offspring of two related hematopoietic stem cell (HSC) clones. Moreover, telomere lengths of the WBCs were significantly shorter than telomere lengths from other tissues. Together, this suggests that the finite lifespan of HSCs, rather than somatic mutation effects, may lead to hematopoietic clonal evolution at extreme ages.


Subject(s)
Clonal Evolution , Hematopoiesis , Leukocytes/metabolism , Longevity/genetics , Mutation , AT Rich Sequence , Aged, 80 and over , Cell Lineage , Conserved Sequence , Female , Gene Frequency , Genome , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/physiology , Humans , Leukocytes/cytology , Leukocytes/physiology , Telomere/genetics , Telomere Shortening
18.
Sci Transl Med ; 6(224): 224ra24, 2014 Feb 19.
Article in English | MEDLINE | ID: mdl-24553385

ABSTRACT

The development of noninvasive methods to detect and monitor tumors continues to be a major challenge in oncology. We used digital polymerase chain reaction-based technologies to evaluate the ability of circulating tumor DNA (ctDNA) to detect tumors in 640 patients with various cancer types. We found that ctDNA was detectable in >75% of patients with advanced pancreatic, ovarian, colorectal, bladder, gastroesophageal, breast, melanoma, hepatocellular, and head and neck cancers, but in less than 50% of primary brain, renal, prostate, or thyroid cancers. In patients with localized tumors, ctDNA was detected in 73, 57, 48, and 50% of patients with colorectal cancer, gastroesophageal cancer, pancreatic cancer, and breast adenocarcinoma, respectively. ctDNA was often present in patients without detectable circulating tumor cells, suggesting that these two biomarkers are distinct entities. In a separate panel of 206 patients with metastatic colorectal cancers, we showed that the sensitivity of ctDNA for detection of clinically relevant KRAS gene mutations was 87.2% and its specificity was 99.2%. Finally, we assessed whether ctDNA could provide clues into the mechanisms underlying resistance to epidermal growth factor receptor blockade in 24 patients who objectively responded to therapy but subsequently relapsed. Twenty-three (96%) of these patients developed one or more mutations in genes involved in the mitogen-activated protein kinase pathway. Together, these data suggest that ctDNA is a broadly applicable, sensitive, and specific biomarker that can be used for a variety of clinical and research purposes in patients with multiple different types of cancer.


Subject(s)
DNA, Neoplasm/blood , Neoplasms/blood , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Neoplasm Metastasis , Neoplasms/genetics , Neoplasms/pathology , Young Adult
19.
PLoS Genet ; 10(1): e1004016, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24453982

ABSTRACT

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11-16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary.


Subject(s)
Amylases/genetics , Animals, Domestic/genetics , DNA Copy Number Variations/genetics , Evolution, Molecular , Animals , DNA, Mitochondrial/genetics , Diet , Dogs , Genetic Variation , Phylogeny , Population Density , Wolves/classification , Wolves/genetics
20.
Genome Res ; 24(2): 200-11, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24221193

ABSTRACT

Intra-tumor heterogeneity is a hallmark of many cancers and may lead to therapy resistance or interfere with personalized treatment strategies. Here, we combined topographic mapping of somatic breakpoints and transcriptional profiling to probe intra-tumor heterogeneity of treatment-naïve stage IIIC/IV epithelial ovarian cancer. We observed that most substantial differences in genomic rearrangement landscapes occurred between metastases in the omentum and peritoneum versus tumor sites in the ovaries. Several cancer genes such as NF1, CDKN2A, and FANCD2 were affected by lesion-specific breakpoints. Furthermore, the intra-tumor variability involved different mutational hallmarks including lesion-specific kataegis (local mutation shower coinciding with genomic breakpoints), rearrangement classes, and coding mutations. In one extreme case, we identified two independent TP53 mutations in ovary tumors and omentum/peritoneum metastases, respectively. Examination of gene expression dynamics revealed up-regulation of key cancer pathways including WNT, integrin, chemokine, and Hedgehog signaling in only subsets of tumor samples from the same patient. Finally, we took advantage of the multilevel tumor analysis to understand the effects of genomic breakpoints on qualitative and quantitative gene expression changes. We show that intra-tumor gene expression differences are caused by site-specific genomic alterations, including formation of in-frame fusion genes. These data highlight the plasticity of ovarian cancer genomes, which may contribute to their strong capacity to adapt to changing environmental conditions and give rise to the high rate of recurrent disease following standard treatment regimes.


Subject(s)
Chromosome Aberrations , Gene Expression Regulation, Neoplastic , Genome, Human , Ovarian Neoplasms/genetics , Aged , Cyclin-Dependent Kinase Inhibitor p16/genetics , Fanconi Anemia Complementation Group D2 Protein/genetics , Female , Gene Expression Profiling , Humans , Middle Aged , Neoplasm Metastasis , Neoplasm Staging , Neurofibromatosis 1/genetics , Omentum/metabolism , Omentum/pathology , Oncogene Proteins, Fusion/genetics , Ovarian Neoplasms/pathology , Peritoneum/metabolism , Peritoneum/pathology , Tumor Suppressor Protein p53/genetics