1.
Genet Med ; 21(5): 1121-1130, 2019 05.
Article in English | MEDLINE | ID: mdl-30293986

ABSTRACT

PURPOSE: Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test. METHODS: We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed cases. RESULTS: We found that CNV calls from GS are at least as sensitive as those from microarrays, while only creating a modest increase in the number of variants interpreted (~10 CNVs per case). We identified clinically significant CNVs in 15% of the first 79 cases analyzed, all of which were confirmed by an orthogonal approach. The pipeline also enabled discovery of a uniparental disomy (UPD) and a 50% mosaic trisomy 14. Directed analysis of select CNVs enabled breakpoint level resolution of genomic rearrangements and phasing of de novo CNVs. CONCLUSION: Robust identification of CNVs by GS is possible within a clinical testing environment.


Subject(s)
DNA Copy Number Variations/genetics , Rare Diseases/genetics , Undiagnosed Diseases/genetics , Adolescent , Child , Child, Preschool , Chromosome Mapping/methods , Cohort Studies , Female , Genetic Testing/methods , Genome, Human , Genomics/methods , Humans , Infant , Male , Rare Diseases/diagnosis , Undiagnosed Diseases/diagnosis , Whole Genome Sequencing/methods , Young Adult
2.
Methods Mol Biol ; 1833: 155-168, 2018.
Article in English | MEDLINE | ID: mdl-30039371

ABSTRACT

Versatile and efficient variant calling tools are needed to analyze large-scale sequencing datasets. In particular, identification of copy number changes remains a challenging task due to their complexity, susceptibility to sequencing biases, variation in coverage data and dependence on genome-wide sample properties, such as tumor polyploidy, polyclonality in cancer samples, or frequency of de novo variation in germline genomes of pedigrees. The frequent need of core sequencing facilities to process samples from both normal and tumor sources favors multipurpose variant calling tools with functionality to process these diverse sets within a single software framework. This not only simplifies the overall bioinformatics workflow but also streamlines maintenance by shortening the software update cycle and requiring only limited staff training. Here we introduce Canvas, a tool for identification of copy number changes from diverse sequencing experiments including whole-genome matched tumor-normal, small pedigree, and single-sample normal resequencing, as well as whole-exome matched and unmatched tumor-normal studies. In addition to variant calling, Canvas infers genome-wide parameters such as cancer ploidy, purity, and heterogeneity. It provides fast and easy-to-run workflows that can scale to thousands of samples and can be easily incorporated into variant calling pipelines.


Subject(s)
DNA Copy Number Variations , Exome , Genome-Wide Association Study/methods , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Polyploidy , Animals , Humans
3.
Bioinformatics ; 34(3): 516-518, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29028893

ABSTRACT

MOTIVATION: Whole-genome sequencing is becoming a diagnostic of choice for the identification of rare inherited and de novo copy number variants in families with various pediatric and late-onset genetic diseases. However, joint variant calling in pedigrees is hampered by the complexity of consensus breakpoint alignment across samples within an arbitrary pedigree structure. RESULTS: We have developed a new tool, Canvas SPW, for the identification of inherited and de novo copy number variants from pedigree sequencing data. Canvas SPW supports a number of family structures and provides a wide range of scoring and filtering options to automate and streamline identification of de novo variants. AVAILABILITY AND IMPLEMENTATION: Canvas SPW is available for download from https://github.com/Illumina/canvas. CONTACT: sivakhno@illumina.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Genomics/methods , Pedigree , Sequence Analysis, DNA/methods , Software , Humans
4.
Bioinformatics ; 33(2): 280-282, 2017 01 15.
Article in English | MEDLINE | ID: mdl-27605106

ABSTRACT

MOTIVATION: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks. RESULTS: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to simulation of hundreds of somatic genomes, while re-use of real read data preserves noise and biases present in sequencing platforms. We further demonstrate the utility of tHapMix by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools. AVAILABILITY AND IMPLEMENTATION: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/tHapMix. CONTACT: sivakhno@illumina.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Genomics/methods , Haplotypes , Neoplasms/genetics , Ploidies , Software , Computer Simulation , DNA, Neoplasm , Genome , Humans
5.
Bioinformatics ; 32(15): 2375-7, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27153601

ABSTRACT

MOTIVATION: Versatile and efficient variant calling tools are needed to analyze large-scale sequencing datasets. In particular, identification of copy number changes remains a challenging task due to their complexity, susceptibility to sequencing biases, variation in coverage data and dependence on genome-wide sample properties, such as tumor polyploidy or polyclonality in cancer samples. RESULTS: We have developed a new tool, Canvas, for identification of copy number changes from diverse sequencing experiments including whole-genome matched tumor-normal and single-sample normal re-sequencing, as well as whole-exome matched and unmatched tumor-normal studies. In addition to variant calling, Canvas infers genome-wide parameters such as cancer ploidy, purity and heterogeneity. It provides fast and easy-to-run workflows that can scale to thousands of samples and can be easily incorporated into variant calling pipelines. AVAILABILITY AND IMPLEMENTATION: Canvas is distributed under an open source license and can be downloaded from https://github.com/Illumina/canvas. CONTACT: eroller@illumina.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Neoplasms , Software , Algorithms , Exome , Humans
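
The abstract above notes that copy number calling in tumours depends on sample-wide properties such as purity. As a rough illustration only (this is a textbook mixing model, not the model implemented in Canvas), the sketch below shows how normal-cell contamination dampens the observed depth ratio and how a known purity lets the tumour copy number be recovered. The function names and the assumption of a diploid normal baseline are hypothetical choices made for this example.

```python
def expected_depth_ratio(tumor_cn, purity, normal_cn=2):
    """Depth ratio (relative to a diploid baseline) for a tumour/normal cell mixture.

    Assumption for illustration: coverage is proportional to the average
    copy number across tumour cells (fraction `purity`) and normal cells.
    """
    mixed_cn = purity * tumor_cn + (1 - purity) * normal_cn
    return mixed_cn / normal_cn


def tumor_copy_number(depth_ratio, purity, normal_cn=2):
    """Invert the mixing model to recover the tumour copy number."""
    return (depth_ratio * normal_cn - (1 - purity) * normal_cn) / purity


# A single-copy gain (CN 3) in a 50% pure sample raises depth by 25%, not 50%.
ratio = expected_depth_ratio(tumor_cn=3, purity=0.5)
print(ratio, tumor_copy_number(ratio, purity=0.5))
```

In practice purity and ploidy are unknown and must be inferred jointly with the copy number states, which is the harder problem the abstract refers to.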
6.
Nat Genet ; 47(9): 1038-1046, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26192915

ABSTRACT

The molecular genetic relationship between esophageal adenocarcinoma (EAC) and its precursor lesion, Barrett's esophagus, is poorly understood. Using whole-genome sequencing on 23 paired Barrett's esophagus and EAC samples, together with one in-depth Barrett's esophagus case study sampled over time and space, we have provided the following new insights: (i) Barrett's esophagus is polyclonal and highly mutated even in the absence of dysplasia; (ii) when cancer develops, copy number increases and heterogeneity persists such that the spectrum of mutations often shows surprisingly little overlap between EAC and adjacent Barrett's esophagus; and (iii) despite differences in specific coding mutations, the mutational context suggests a common causative insult underlying these two conditions. From a clinical perspective, the histopathological assessment of dysplasia appears to be a poor reflection of the molecular disarray within the Barrett's epithelium, and a molecular Cytosponge technique overcomes sampling bias and has the capacity to reflect the entire clonal architecture.


Subject(s)
Adenocarcinoma/genetics , Barrett Esophagus/genetics , Esophageal Neoplasms/genetics , Aged , DNA Copy Number Variations , DNA Mutational Analysis , Disease Progression , Female , Genome, Human , Genome-Wide Association Study , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide
7.
Cell ; 148(4): 780-91, 2012 Feb 17.
Article in English | MEDLINE | ID: mdl-22341448

ABSTRACT

The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.


Subject(s)
Facial Neoplasms/veterinary , Genomic Instability , Marsupialia/genetics , Mutation , Animals , Clonal Evolution , Endangered Species , Facial Neoplasms/epidemiology , Facial Neoplasms/genetics , Facial Neoplasms/pathology , Female , Genome-Wide Association Study , Male , Molecular Sequence Data , Tasmania/epidemiology
8.
Bioinformatics ; 26(24): 3051-8, 2010 Dec 15.
Article in English | MEDLINE | ID: mdl-20966003

ABSTRACT

MOTIVATION: Copy number abnormalities (CNAs) represent an important type of genetic mutation that can lead to abnormal cell growth and proliferation. New high-throughput sequencing technologies promise comprehensive characterization of CNAs. In contrast to microarrays, where probe design follows a carefully developed protocol, reads represent a random sample from a library and may be prone to representation biases due to GC content and other factors. The discrimination between true and false positive CNAs becomes an important issue. RESULTS: We present a novel approach, called CNAseg, to identify CNAs from second-generation sequencing data. It uses depth of coverage to estimate copy number states and flowcell-to-flowcell variability in cancer and normal samples to control the false positive rate. We tested the method using the COLO-829 melanoma cell line sequenced to 40-fold coverage. An extensive simulation scheme was developed to recreate different scenarios of copy number changes and depth of coverage by altering a real dataset with spiked-in CNAs. Comparison to alternative approaches using both real and simulated datasets showed that CNAseg achieves superior precision and improved sensitivity estimates. AVAILABILITY: The CNAseg package and test data are available at http://www.compbio.group.cam.ac.uk/software.html.


Subject(s)
Algorithms , DNA Copy Number Variations , Neoplasms/genetics , Base Composition , Cell Line, Tumor , Genome, Human , Humans , Mutation , Sequence Analysis, DNA
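
The abstract above describes using depth of coverage to estimate copy number states. The following is a minimal sketch of that general idea, assuming per-window read counts for a tumour and a matched normal sample are already available. It is not the CNAseg algorithm: it omits segmentation, GC correction, and the flowcell-to-flowcell variability model, and the function names and the pseudocount are assumptions made for this example.

```python
import math


def log2_ratios(tumor_counts, normal_counts, pseudocount=0.5):
    """Per-window log2 ratio of library-size-normalized tumour vs. normal counts."""
    tumor_total = sum(tumor_counts)
    normal_total = sum(normal_counts)
    ratios = []
    for t, n in zip(tumor_counts, normal_counts):
        # The pseudocount avoids taking log of zero in empty windows.
        t_norm = (t + pseudocount) / tumor_total
        n_norm = (n + pseudocount) / normal_total
        ratios.append(math.log2(t_norm / n_norm))
    return ratios


def copy_number(log2_ratio, normal_ploidy=2):
    """Nearest integer copy number implied by a log2 ratio, assuming a pure sample."""
    return round(normal_ploidy * 2 ** log2_ratio)


tumor = [100, 100, 150, 50]
normal = [100, 100, 100, 100]
print([copy_number(r) for r in log2_ratios(tumor, normal)])
```

Normalizing by total library size is only reasonable when gains and losses roughly balance, as in this toy input; real callers use more robust baselines.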
9.
Bioinformatics ; 26(11): 1395-402, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20403815

ABSTRACT

MOTIVATION: The current generation of single nucleotide polymorphism (SNP) arrays allows measurement of copy number aberrations (CNAs) in cancer at more than one million locations in the genome in hundreds of tumour samples. Most research has focused on single-sample CNA discovery, the so-called segmentation problem. The availability of high-density, large sample-size SNP array datasets makes the identification of recurrent copy number changes in cancer an important issue that can be addressed using the cross-sample information. RESULTS: We present a novel approach for finding regions of recurrent copy number aberrations, called CNAnova, from Affymetrix SNP 6.0 array data. The method derives its statistical properties from a control dataset composed of normal samples and, in contrast to previous methods, does not require segmentation and permutation steps. For rigorous testing of the algorithm and comparison to existing methods, we developed a simulation scheme that uses the noise distribution present in Affymetrix arrays. Application of the method to 128 acute lymphoblastic leukaemia samples shows that CNAnova achieves a lower error rate than a popular alternative approach. We also describe an extension of the CNAnova framework to identify recurrent CNA regions with intra-tumour heterogeneity, present in either primary or relapsed samples from the same patients. AVAILABILITY: The CNAnova package and synthetic datasets are available at http://www.compbio.group.cam.ac.uk/software.html.


Subject(s)
Gene Dosage/genetics , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Software , Algorithms , Databases, Genetic , Gene Expression Profiling , Humans
10.
BMC Syst Biol ; 1: 27, 2007 Jun 08.
Article in English | MEDLINE | ID: mdl-17559646

ABSTRACT

BACKGROUND: Systems wide modeling and analysis of signaling networks is essential for understanding complex cellular behaviors, such as the biphasic responses to different combinations of cytokines and growth factors. For example, tumor necrosis factor (TNF) can act as a proapoptotic or prosurvival factor depending on its concentration, the current state of signaling network and the presence of other cytokines. To understand combinatorial regulation in such systems, new computational approaches are required that can take into account non-linear interactions in signaling networks and provide tools for clustering, visualization and predictive modeling. RESULTS: Here we extended and applied an unsupervised non-linear dimensionality reduction approach, Isomap, to find clusters of similar treatment conditions in two cell signaling networks: (I) apoptosis signaling network in human epithelial cancer cells treated with different combinations of TNF, epidermal growth factor (EGF) and insulin and (II) combination of signal transduction pathways stimulated by 21 different ligands based on AfCS double ligand screen data. For the analysis of the apoptosis signaling network we used the Cytokine compendium dataset where activity and concentration of 19 intracellular signaling molecules were measured to characterise apoptotic response to TNF, EGF and insulin. By projecting the original 19-dimensional space of intracellular signals into a low-dimensional space, Isomap was able to reconstruct clusters corresponding to different cytokine treatments that were identified with graph-based clustering. In comparison, Principal Component Analysis (PCA) and Partial Least Squares - Discriminant analysis (PLS-DA) were unable to find biologically meaningful clusters. 
We also showed that by using Isomap components for supervised classification with k-nearest neighbor (k-NN) and quadratic discriminant analysis (QDA), apoptosis intensity can be predicted for different combinations of TNF, EGF and insulin. Prediction accuracy was highest when early activation time points in the apoptosis signaling network were used to predict apoptosis rates at later time points. Extended Isomap also outperformed PCA on the AfCS double ligand screen data. Isomap identified more functionally coherent clusters than PCA and captured more information in the first two components. The Isomap projection performs slightly worse when more signaling networks are analyzed, suggesting that the mapping function between cues and responses becomes increasingly non-linear when large signaling pathways are considered. CONCLUSION: We developed and applied an extended Isomap approach for the analysis of cell signaling networks. Potential biological applications of this method include characterization, visualization and clustering of different treatment conditions (e.g. low and high doses of TNF) in terms of changes in the intracellular signaling they induce.


Subject(s)
Apoptosis , Metabolic Networks and Pathways , Models, Biological , Neural Networks, Computer , Signal Transduction , Algorithms , Cluster Analysis , Cytokines/metabolism , Discriminant Analysis , Epidermal Growth Factor/pharmacology , Humans , Insulin/pharmacology , Least-Squares Analysis , Metabolic Networks and Pathways/drug effects , Neoplasms/metabolism , Principal Component Analysis , Signal Transduction/drug effects , Tumor Necrosis Factor-alpha/pharmacology
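
The abstract above contrasts Isomap with linear methods such as PCA. The key difference is that Isomap replaces straight-line distances with geodesic distances measured along a k-nearest-neighbor graph. The sketch below illustrates only that first step of standard Isomap on toy data; it is not the extended Isomap described in the paper, it omits the final embedding step (classical multidimensional scaling), and the function name and data are assumptions made for this example.

```python
import math


def geodesic_distances(points, k):
    """Shortest-path distances over a symmetric k-nearest-neighbor graph."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    inf = float("inf")
    graph = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Connect each point to its k closest neighbors (edges are undirected).
        for j in sorted(others, key=lambda j: euclid[i][j])[:k]:
            graph[i][j] = graph[j][i] = euclid[i][j]
    # Floyd-Warshall all-pairs shortest paths; fine for small n.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


# Eleven points along a unit half circle: a curved one-dimensional manifold.
points = [(math.cos(math.pi * t / 10), math.sin(math.pi * t / 10)) for t in range(11)]
geo = geodesic_distances(points, k=2)
print(math.dist(points[0], points[-1]), geo[0][-1])
```

The two ends of the arc are 2.0 apart in a straight line, but the graph distance follows the curve and approaches the arc length, which is the non-linear structure a linear projection cannot represent.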
11.
FEBS J ; 274(10): 2439-48, 2007 May.
Article in English | MEDLINE | ID: mdl-17451438

ABSTRACT

This review discusses the talks presented at the third EMBL Biennial Symposium, From functional genomics to systems biology, held in Heidelberg, Germany, 14-17 October 2006. Current issues and trends in various subfields of functional genomics and systems biology are considered, including analysis of regulatory elements, signalling networks, transcription networks, protein-protein interaction networks, genetic interaction networks, medical applications of DNA microarrays, and metagenomics. Several technological advances in the fields of DNA microarrays, identification of regulatory elements in the genomes of higher eukaryotes, and MS for detection of protein interactions are introduced. Major directions of future systems biology research are also discussed.


Subject(s)
Genomics , Systems Biology , Animals , Diagnosis , Humans , Neoplasms/diagnosis , Oligonucleotide Array Sequence Analysis , Protein Interaction Mapping , RNA Interference , Signal Transduction/physiology , Transcription Factors/physiology , Two-Hybrid System Techniques