1.
J Clin Microbiol ; 56(6)2018 06.
Article in English | MEDLINE | ID: mdl-29618499

ABSTRACT

The ability of next-generation sequencing (NGS) technologies to detect low frequency HIV-1 drug resistance mutations (DRMs) not detected by dideoxynucleotide Sanger sequencing has potential advantages for improved patient outcomes. We compared the performance of an in vitro diagnostic (IVD) NGS assay, the Sentosa SQ HIV genotyping assay for HIV-1 genotypic resistance testing, with Sanger sequencing on 138 protease/reverse transcriptase (RT) and 39 integrase sequences. The NGS assay used a 5% threshold for reporting low-frequency variants. The level of complete plus partial nucleotide sequence concordance between Sanger sequencing and NGS was 99.9%. Among the 138 protease/RT sequences, a mean of 6.4 DRMs was identified by both Sanger and NGS, a mean of 0.5 DRM was detected by NGS alone, and a mean of 0.1 DRM was detected by Sanger sequencing alone. Among the 39 integrase sequences, a mean of 1.6 DRMs was detected by both Sanger sequencing and NGS and a mean of 0.15 DRM was detected by NGS alone. Compared with Sanger sequencing, NGS estimated higher levels of resistance to one or more antiretroviral drugs for 18.2% of protease/RT sequences and 5.1% of integrase sequences. There was little evidence for technical artifacts in the NGS sequences, but the G-to-A hypermutation was detected in three samples. In conclusion, the IVD NGS assay evaluated in this study was highly concordant with Sanger sequencing. At the 5% threshold for reporting minority variants, NGS appeared to attain a modestly increased sensitivity for detecting low-frequency DRMs without compromising sequence accuracy.


Subject(s)
Drug Resistance, Viral/genetics , HIV-1/genetics , High-Throughput Nucleotide Sequencing/methods , Mutation/genetics , Anti-HIV Agents/therapeutic use , Genotype , HIV Infections/drug therapy , HIV Infections/virology , HIV Integrase/genetics , HIV Reverse Transcriptase/genetics , Humans , Microbial Sensitivity Tests , RNA, Viral/genetics , Reagent Kits, Diagnostic , Viral Load
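The 5% reporting threshold described in the abstract above can be illustrated with a minimal sketch. The function name, the input format (read counts per allele at a single position), and the example counts are hypothetical; none of them are taken from the Sentosa SQ assay or the study data.

```python
def report_variants(allele_counts, threshold=0.05):
    """Return {allele: frequency} for alleles at or above the threshold.

    allele_counts: hypothetical mapping of allele -> supporting read count
    at one position. threshold: minimum read fraction for reporting
    (0.05 mirrors the 5% cutoff mentioned in the abstract).
    """
    total = sum(allele_counts.values())
    if total == 0:
        return {}
    freqs = {allele: count / total for allele, count in allele_counts.items()}
    return {allele: f for allele, f in freqs.items() if f >= threshold}


# Illustrative counts only: a 7% minority variant is reported, a 3% one is not.
print(report_variants({"K": 930, "N": 70}))
print(report_variants({"K": 970, "N": 30}))
```

Lowering the threshold increases sensitivity for minority variants but also admits more sequencing noise, which is why the abstract pairs the 5% cutoff with a check for technical artifacts.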
2.
Cell Syst ; 4(5): 530-542.e6, 2017 05 24.
Article in English | MEDLINE | ID: mdl-28544881

ABSTRACT

Effective development of host cells for therapeutic protein production is hampered by the poor characterization of cellular transfection. Here, we employed a multi-omics-based systems biotechnology approach to elucidate the genotypic and phenotypic differences between a wild-type and recombinant antibody-producing Chinese hamster ovary (CHO) cell line. At the genomic level, we observed extensive rearrangements in specific targeted loci linked to transgene integration sites. Transcriptional re-wiring of DNA damage repair and cellular metabolism in the antibody producer, via changes in gene copy numbers, was also detected. Subsequent integration of transcriptomic data with a genome-scale metabolic model showed a substantial increase in energy metabolism in the antibody producer. Metabolomics, lipidomics, and glycomics analyses revealed an elevation in long-chain lipid species, potentially associated with protein transport and secretion requirements, and a surprising stability of N-glycosylation profiles between both cell lines. Overall, the proposed knowledge-based systems biotechnology framework can further accelerate mammalian cell-line engineering in a targeted manner.


Subject(s)
CHO Cells/metabolism , Recombinant Proteins/biosynthesis , Systems Biology/methods , Animals , Biotechnology/methods , Cricetulus , Gene Dosage/genetics , Genome , Glycomics , Glycosylation , Mammals/genetics , Metabolomics , Recombinant Proteins/metabolism , Transcriptome , Transfection/methods , Transgenes/genetics
3.
PLoS One ; 10(9): e0137526, 2015.
Article in English | MEDLINE | ID: mdl-26348928

ABSTRACT

Genome-wide functional analyses require high-resolution genome assembly and annotation. We applied ChIA-PET to analyze gene regulatory networks, including 3D chromosome interactions, underlying thyroid hormone (TH) signaling in the frog Xenopus tropicalis. As the available versions of the Xenopus tropicalis assembly and annotation lacked the resolution required for ChIA-PET, we improved the genome assembly version 4.1 and annotations using data derived from the paired end tag (PET) sequencing technologies and approaches (e.g., DNA-PET [gPET], RNA-PET etc.). The large insert (~10 Kb, ~17 Kb) paired end DNA-PET with high-throughput NGS not only significantly improved genome assembly quality, but also strongly reduced genome "fragmentation", reducing total scaffold numbers by ~60%. Next, RNA-PET technology, designed and developed for the detection of full-length transcripts and fusion mRNA in whole transcriptome studies (ENCODE consortia), was applied to capture the 5' and 3' ends of transcripts. These amendments in assembly and annotation were essential prerequisites for the ChIA-PET analysis of TH transcription regulation. Their application revealed complex regulatory configurations of target genes and the structures of the regulatory networks underlying physiological responses. Our work allowed us to improve the quality of Xenopus tropicalis genomic resources, reaching the standard required for ChIA-PET analysis of transcriptional networks. We consider that the workflow proposed offers useful conceptual and methodological guidance and can readily be applied to other non-conventional models that have low-resolution genome data.


Subject(s)
Genome , Thyroid Hormones/genetics , Transcriptome/genetics , Xenopus/genetics , Animals , Chromatin/genetics , Gene Expression Regulation , Gene Regulatory Networks/genetics , Humans , Molecular Sequence Annotation , RNA/genetics , RNA, Messenger/genetics , Sequence Analysis, DNA
4.
Cell Rep ; 12(2): 272-85, 2015 Jul 14.
Article in English | MEDLINE | ID: mdl-26146084

ABSTRACT

Genome rearrangements, a hallmark of cancer, can result in gene fusions with oncogenic properties. Using DNA paired-end-tag (DNA-PET) whole-genome sequencing, we analyzed 15 gastric cancers (GCs) from Southeast Asians. Rearrangements were enriched in open chromatin and shaped by chromatin structure. We identified seven rearrangement hot spots and 136 gene fusions. In three out of 100 GC cases, we found recurrent fusions between CLDN18, a tight junction gene, and ARHGAP26, a gene encoding a RHOA inhibitor. Epithelial cell lines expressing CLDN18-ARHGAP26 displayed a dramatic loss of epithelial phenotype and long protrusions indicative of epithelial-mesenchymal transition (EMT). Fusion-positive cell lines showed impaired barrier properties, reduced cell-cell and cell-extracellular matrix adhesion, retarded wound healing, and inhibition of RHOA. Gain of invasion was seen in cancer cell lines expressing the fusion. Thus, CLDN18-ARHGAP26 mediates epithelial disintegration, possibly leading to stomach H(+) leakage, and the fusion might contribute to invasiveness once a cell is transformed.


Subject(s)
Claudins/genetics , GTPase-Activating Proteins/genetics , Oncogene Proteins, Fusion/metabolism , Stomach Neoplasms/pathology , Amino Acid Sequence , Animals , Cell Adhesion , Cell Line, Tumor , Cell Movement , Cell Proliferation , Clathrin/pharmacology , Claudins/metabolism , Dogs , Endocytosis/drug effects , Epithelial Cells/cytology , Epithelial Cells/metabolism , Epithelial-Mesenchymal Transition , GTPase-Activating Proteins/metabolism , HeLa Cells , Humans , MCF-7 Cells , Madin Darby Canine Kidney Cells , Molecular Sequence Data , Oncogene Proteins, Fusion/genetics , Phenotype , Stomach Neoplasms/metabolism , rhoA GTP-Binding Protein/antagonists & inhibitors , rhoA GTP-Binding Protein/metabolism
5.
PLoS One ; 9(6): e90852, 2014.
Article in English | MEDLINE | ID: mdl-24603971

ABSTRACT

Delineating candidate genes at the chromosomal breakpoint regions in the apparently balanced chromosome rearrangements (ABCR) has been shown to be more effective with the emergence of next-generation sequencing (NGS) technologies. We employed a large-insert (7-11 kb) paired-end tag sequencing technology (DNA-PET) to systematically analyze the genomes of four patients harbouring cytogenetically defined ABCR with neurodevelopmental symptoms, including developmental delay (DD) and speech disorders. We characterized structural variants (SVs) specific to each individual, including those matching the chromosomal breakpoints. Refinement of these regions by Sanger sequencing resulted in the identification of five disrupted genes in three individuals: guanine nucleotide binding protein, q polypeptide (GNAQ), RNA-binding protein, fox-1 homolog (RBFOX3), unc-5 homolog D (C. elegans) (UNC5D), transmembrane protein 47 (TMEM47), and X-linked inhibitor of apoptosis (XIAP). Among them, XIAP is the causative gene for the immunodeficiency phenotype seen in the patient. The remaining genes displayed specific expression in the fetal brain and have known biologically relevant functions in brain development, suggesting putative candidate genes for neurodevelopmental phenotypes. This study demonstrates the application of NGS technologies in mapping individual gene disruptions in ABCR as a resource for deciphering candidate genes in human neurodevelopmental disorders (NDDs).


Subject(s)
Chromosome Breakpoints , Developmental Disabilities/genetics , Language Development Disorders/genetics , Base Sequence , Chromosome Inversion , DNA Copy Number Variations , Female , Genetic Association Studies , High-Throughput Nucleotide Sequencing , Humans , Male , Molecular Sequence Data , Pedigree , Sequence Analysis, DNA , Translocation, Genetic
6.
Genome Biol ; 13(12): R115, 2012 Dec 13.
Article in English | MEDLINE | ID: mdl-23237666

ABSTRACT

BACKGROUND: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. RESULTS: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer: against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. CONCLUSIONS: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.


Subject(s)
DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Stomach Neoplasms/genetics , Adenocarcinoma/genetics , Chromosomal Instability , Deamination , Exome , Genomics , Microsatellite Instability , Mutation , Reactive Oxygen Species/metabolism
7.
PLoS One ; 7(9): e46152, 2012.
Article in English | MEDLINE | ID: mdl-23029419

ABSTRACT

Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.


Subject(s)
Genome, Human , Genomic Structural Variation , Mutation , Neoplasms/genetics , Open Reading Frames , Sequence Analysis, DNA/methods , Algorithms , Cell Line, Tumor , Chromosome Mapping , DNA Copy Number Variations , Genomic Library , Humans , Mutagenesis, Insertional
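The claim in the abstract above that large inserts give higher physical genome coverage for the same sequencing effort follows from the usual definition of physical coverage (number of read pairs times insert size, divided by genome size). The sketch below works through that arithmetic; the genome size and read-pair count are illustrative assumptions, not figures from the study.

```python
def physical_coverage(n_pairs, insert_size_bp, genome_size_bp):
    """Average number of inserts spanning a given genomic position."""
    return n_pairs * insert_size_bp / genome_size_bp


# Assumed values for illustration only.
GENOME_SIZE_BP = 3_000_000_000   # roughly human-sized genome
READ_PAIRS = 10_000_000          # same sequencing effort for both libraries

short = physical_coverage(READ_PAIRS, 1_000, GENOME_SIZE_BP)    # 1 kb inserts
large = physical_coverage(READ_PAIRS, 10_000, GENOME_SIZE_BP)   # 10 kb inserts

print(f"1 kb inserts:  {short:.1f}x physical coverage")
print(f"10 kb inserts: {large:.1f}x physical coverage")
```

With the number of read pairs held fixed, physical coverage scales linearly with insert size, so a tenfold larger insert yields tenfold more inserts spanning any given breakpoint.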
8.
Nat Genet ; 44(7): 765-9, 2012 May 27.
Article in English | MEDLINE | ID: mdl-22634754

ABSTRACT

To survey hepatitis B virus (HBV) integration in liver cancer genomes, we conducted massively parallel sequencing of 81 HBV-positive and 7 HBV-negative hepatocellular carcinomas (HCCs) and adjacent normal tissues. We found that HBV integration is observed more frequently in the tumors (86.4%) than in adjacent liver tissues (30.7%). Copy-number variations (CNVs) were significantly increased at HBV breakpoint locations where chromosomal instability was likely induced. Approximately 40% of HBV breakpoints within the HBV genome were located within a 1,800-bp region where the viral enhancer, X gene and core gene are located. We also identified recurrent HBV integration events (in ≥ 4 HCCs) that were validated by RNA sequencing (RNA-seq) and Sanger sequencing at the known and putative cancer-related TERT, MLL4 and CCNE1 genes, which showed upregulated gene expression in tumor versus normal tissue. We also report evidence that suggests that the number of HBV integrations is associated with patient survival.


Subject(s)
Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/virology , Hepatitis B virus/genetics , Liver Neoplasms/genetics , Liver Neoplasms/virology , Virus Integration/genetics , Base Sequence , Chromosomal Instability/genetics , Cyclin E/genetics , DNA Copy Number Variations/genetics , DNA, Viral/genetics , DNA-Binding Proteins/genetics , Female , Histone-Lysine N-Methyltransferase , Humans , Male , Middle Aged , Molecular Sequence Data , Oncogene Proteins/genetics , RNA, Viral/genetics , Survival Rate , Telomerase/genetics
9.
Nat Med ; 18(4): 521-8, 2012 Mar 18.
Article in English | MEDLINE | ID: mdl-22426421

ABSTRACT

Tyrosine kinase inhibitors (TKIs) elicit high response rates among individuals with kinase-driven malignancies, including chronic myeloid leukemia (CML) and epidermal growth factor receptor-mutated non-small-cell lung cancer (EGFR NSCLC). However, the extent and duration of these responses are heterogeneous, suggesting the existence of genetic modifiers affecting an individual's response to TKIs. Using paired-end DNA sequencing, we discovered a common intronic deletion polymorphism in the gene encoding BCL2-like 11 (BIM). BIM is a pro-apoptotic member of the B-cell CLL/lymphoma 2 (BCL2) family of proteins, and its upregulation is required for TKIs to induce apoptosis in kinase-driven cancers. The polymorphism switched BIM splicing from exon 4 to exon 3, which resulted in expression of BIM isoforms lacking the pro-apoptotic BCL2-homology domain 3 (BH3). The polymorphism was sufficient to confer intrinsic TKI resistance in CML and EGFR NSCLC cell lines, but this resistance could be overcome with BH3-mimetic drugs. Notably, individuals with CML and EGFR NSCLC harboring the polymorphism experienced significantly inferior responses to TKIs than did individuals without the polymorphism (P = 0.02 for CML and P = 0.027 for EGFR NSCLC). Our results offer an explanation for the heterogeneity of TKI responses across individuals and suggest the possibility of personalizing therapy with BH3 mimetics to overcome BIM-polymorphism-associated TKI resistance.


Subject(s)
Apoptosis Regulatory Proteins/genetics , Apoptosis/drug effects , Carcinoma, Non-Small-Cell Lung/genetics , Drug Resistance, Neoplasm/drug effects , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Lung Neoplasms/genetics , Membrane Proteins/genetics , Polymorphism, Genetic/genetics , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/genetics , Sequence Deletion/genetics , Adult , Aged , Aged, 80 and over , Annexins/metabolism , BH3 Interacting Domain Death Agonist Protein/genetics , Bcl-2-Like Protein 11 , Carcinoma, Non-Small-Cell Lung/drug therapy , Cell Line, Tumor , Cohort Studies , Dose-Response Relationship, Drug , Drug Resistance, Neoplasm/genetics , Enzyme-Linked Immunosorbent Assay/methods , ErbB Receptors/genetics , Exons/genetics , Female , Follow-Up Studies , Gene Expression Regulation, Neoplastic/drug effects , Gene Frequency , Genotype , Humans , International Cooperation , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Lung Neoplasms/drug therapy , Male , Middle Aged , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA, Small Interfering/metabolism , Statistics, Nonparametric , Transfection
10.
Genome Res ; 21(12): 2224-41, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21926179

ABSTRACT

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype-aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) it is possible to assemble the genome to a high level of coverage and accuracy, and (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies, is now public and freely available from http://www.assemblathon.org/.


Subject(s)
Genome/physiology , Genomics/methods , Sequence Analysis, DNA/methods
11.
Genome Res ; 21(5): 676-87, 2011 May.
Article in English | MEDLINE | ID: mdl-21467264

ABSTRACT

Using a long-span, paired-end deep sequencing strategy, we have comprehensively identified cancer genome rearrangements in eight breast cancer genomes. Herein, we show that 40%-54% of these structural genomic rearrangements result in different forms of fusion transcripts and that 44% are potentially translated. We find that single segmental tandem duplication spanning several genes is a major source of the fusion gene transcripts in both cell lines and primary tumors involving adjacent genes placed in the reverse-order position by the duplication event. Certain other structural mutations, however, tend to attenuate gene expression. From these candidate gene fusions, we have found a fusion transcript (RPS6KB1-VMP1) recurrently expressed in ∼30% of breast cancers associated with potential clinical consequences. This gene fusion is caused by tandem duplication on 17q23 and appears to be an indicator of local genomic instability altering the expression of oncogenic components such as MIR21 and RPS6KB1.


Subject(s)
Breast Neoplasms/metabolism , Gene Rearrangement , Genome, Human/genetics , Membrane Proteins/genetics , Membrane Proteins/metabolism , Recombinant Fusion Proteins/metabolism , Ribosomal Protein S6 Kinases/metabolism , Transcription, Genetic , Breast Neoplasms/genetics , Cell Line, Tumor , Chromosome Mapping , Chromosomes, Human, Pair 17/genetics , Female , Gene Dosage , Gene Expression Profiling , Genomic Instability , High-Throughput Nucleotide Sequencing , Humans , Recombinant Fusion Proteins/genetics , Ribosomal Protein S6 Kinases/genetics , Sequence Analysis, DNA
12.
Genome Res ; 21(5): 665-75, 2011 May.
Article in English | MEDLINE | ID: mdl-21467267

ABSTRACT

Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.


Subject(s)
Base Pairing/genetics , Breast Neoplasms/genetics , Chromosome Mapping/methods , Genome, Human/genetics , Genomic Structural Variation/genetics , Stomach Neoplasms/genetics , Cell Line, Tumor , Computational Biology , DNA/genetics , Female , Gene Rearrangement , Humans , Sequence Analysis, DNA
13.
Bioinformatics ; 27(2): 167-74, 2011 Jan 15.
Article in English | MEDLINE | ID: mdl-21149345

ABSTRACT

MOTIVATION: Many de novo genome assemblers have been proposed recently. Most existing methods rely on the de Bruijn graph: a complex graph structure that attempts to encompass the entire genome. Such graphs can be prohibitively large, may fail to capture subtle information and are difficult to parallelize. RESULT: We present a method that eschews the traditional graph-based approach in favor of a simple 3' extension approach that has potential to be massively parallelized. Our results show that it is able to obtain assemblies that are more contiguous, complete and less error prone compared with existing methods. AVAILABILITY: The software package can be found at http://www.comp.nus.edu.sg/~bioinfo/peasm/. Alternatively it is available from the authors upon request.


Subject(s)
Genomics/methods , Sequence Analysis, DNA/methods , Computer Simulation , Escherichia coli/genetics , Genome , Schizosaccharomyces/genetics , Software
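The general idea of 3' extension mentioned in the abstract above can be sketched as a toy greedy assembler. This is not the authors' software or algorithm: the function name, the majority-vote rule, the stop-on-tie rule, and the parameters are all assumptions chosen to keep the example short, and it ignores sequencing errors, reverse complements, and paired-end information.

```python
from collections import Counter


def extend_3prime(seed, reads, min_overlap=4, max_len=10_000):
    """Greedily extend `seed` one base at a time toward its 3' end.

    A read votes for the next base when a prefix of the read (at least
    `min_overlap` long) matches the current 3' end of the contig.
    Extension stops when no read overlaps or the vote is tied.
    """
    contig = seed
    while len(contig) < max_len:
        votes = Counter()
        for read in reads:
            # Try the longest usable overlap first; keep one vote per read.
            longest = min(len(read) - 1, len(contig))
            for k in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    votes[read[k]] += 1
                    break
        if not votes:
            break  # no read overlaps the 3' end
        (best, n_best), *rest = votes.most_common(2)
        if rest and rest[0][1] == n_best:
            break  # ambiguous next base (e.g. a repeat boundary)
        contig += best
    return contig


# Toy data: every 6-mer of a short made-up sequence serves as a "read".
genome = "ACGTTGCAAGGCTA"
reads = [genome[i:i + 6] for i in range(len(genome) - 5)]
print(extend_3prime("ACGTTG", reads))
```

Because each seed is extended independently of the others, many seeds could in principle be processed in parallel, which is the property the abstract highlights over a single genome-wide graph.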
14.
Genome Biol ; 11(8): R89, 2010.
Article in English | MEDLINE | ID: mdl-20799932

ABSTRACT

BACKGROUND: Burkholderia thailandensis is a non-pathogenic environmental saprophyte closely related to Burkholderia pseudomallei, the causative agent of the often fatal animal and human disease melioidosis. To study B. thailandensis genomic variation, we profiled 50 isolates using a pan-genome microarray comprising genomic elements from 28 Burkholderia strains and species. RESULTS: Of 39 genomic regions variably present across the B. thailandensis strains, 13 regions corresponded to known genomic islands, while 26 regions were novel. Variant B. thailandensis isolates exhibited isolated acquisition of a capsular polysaccharide biosynthesis gene cluster (B. pseudomallei-like capsular polysaccharide) closely resembling a similar cluster in B. pseudomallei that is essential for virulence in mammals; presence of this cluster was confirmed by whole genome sequencing of a representative variant strain (B. thailandensis E555). Both whole-genome microarray and multi-locus sequence typing analysis revealed that the variant strains formed part of a phylogenetic subgroup distinct from the ancestral B. thailandensis population and were associated with atypical isolation sources when compared to the majority of previously described B. thailandensis strains. In functional assays, B. thailandensis E555 exhibited several B. pseudomallei-like phenotypes, including colony wrinkling, resistance to human complement binding, and intracellular macrophage survival. However, in murine infection assays, B. thailandensis E555 did not exhibit enhanced virulence relative to other B. thailandensis strains, suggesting that additional factors are required to successfully colonize and infect mammals. CONCLUSIONS: The discovery of such novel variant strains demonstrates how unbiased genomic surveys of non-pathogenic isolates can reveal insights into the development and emergence of new pathogenic species.


Subject(s)
Burkholderia/genetics , Burkholderia/pathogenicity , Genome, Bacterial , Multigene Family , Animals , Burkholderia/isolation & purification , Burkholderia Infections/immunology , Genetic Speciation , Genetic Variation , Humans , Metabolic Networks and Pathways/genetics , Mice , Polysaccharides, Bacterial/biosynthesis , Virulence/genetics
15.
PLoS Pathog ; 6(4): e1000845, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20368977

ABSTRACT

Certain environmental microorganisms can cause severe human infections, even in the absence of an obvious requirement for transition through an animal host for replication ("accidental virulence"). To understand this process, we compared eleven isolate genomes of Burkholderia pseudomallei (Bp), a tropical soil microbe and causative agent of the human and animal disease melioidosis. We found evidence for the existence of several new genes in the Bp reference genome, identifying 282 novel genes supported by at least two independent lines of supporting evidence (mRNA transcripts, database homologs, and presence of ribosomal binding sites) and 81 novel genes supported by all three lines. Within the Bp core genome, 211 genes exhibited significant levels of positive selection (4.5%), distributed across many cellular pathways including carbohydrate and secondary metabolism. Functional experiments revealed that certain positively selected genes might enhance mammalian virulence by interacting with host cellular pathways or utilizing host nutrients. Evolutionary modifications improving Bp environmental fitness may thus have indirectly facilitated the ability of Bp to colonize and survive in mammalian hosts. These findings improve our understanding of the pathogenesis of melioidosis, and establish Bp as a model system for studying the genetics of accidental virulence.


Subject(s)
Biological Evolution , Burkholderia pseudomallei/genetics , Burkholderia pseudomallei/pathogenicity , Genes, Bacterial , Animals , Base Sequence , Female , Fluorescent Antibody Technique , Gene Expression Profiling , Genome, Bacterial , Melioidosis/genetics , Mice , Mice, Inbred BALB C , Molecular Sequence Data , Virulence/genetics
16.
Genome Biol ; 11(2): R22, 2010.
Article in English | MEDLINE | ID: mdl-20181287

ABSTRACT

Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg).


Subject(s)
Chromatin Immunoprecipitation , Chromatin/metabolism , Sequence Analysis, DNA/methods , Software , Binding Sites/genetics , Chromatin/chemistry , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Genome, Human , Humans , Protein Binding
17.
Nature ; 462(7269): 58-64, 2009 Nov 05.
Article in English | MEDLINE | ID: mdl-19890323

ABSTRACT

Genomes are organized into high-level three-dimensional structures, and DNA elements separated by long genomic distances can in principle interact functionally. Many transcription factors bind to regulatory DNA elements distant from gene promoters. Although distal binding sites have been shown to regulate transcription by long-range chromatin interactions at a few loci, chromatin interactions and their impact on transcription regulation have not been investigated in a genome-wide manner. Here we describe the development of a new strategy, chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) for the de novo detection of global chromatin interactions, with which we have comprehensively mapped the chromatin interaction network bound by oestrogen receptor alpha (ER-alpha) in the human genome. We found that most high-confidence remote ER-alpha-binding sites are anchored at gene promoters through long-range chromatin interactions, suggesting that ER-alpha functions by extensive chromatin looping to bring genes together for coordinated transcriptional regulation. We propose that chromatin interactions constitute a primary mechanism for regulating transcription in mammalian genomes.


Subject(s)
Chromatin/genetics , Chromatin/metabolism , Estrogen Receptor alpha/metabolism , Genome, Human/genetics , Binding Sites , Cell Line , Chromatin Immunoprecipitation , Cross-Linking Reagents , Formaldehyde , Humans , Promoter Regions, Genetic/genetics , Protein Binding , Reproducibility of Results , Sequence Analysis, DNA , Transcription, Genetic , Transcriptional Activation
18.
Genome Res ; 17(6): 828-38, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17568001

ABSTRACT

Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.


Subject(s)
Chromosomes, Human, Pair 17/genetics , Chromosomes, Human, Pair 20/genetics , Genome, Human , Neoplasms/genetics , Transcription, Genetic , Translocation, Genetic , Cell Line, Tumor , Humans , Neoplasm Proteins/genetics , Quantitative Trait Loci , Retroelements , Sequence Analysis, DNA
19.
BMC Cancer ; 7: 109, 2007 Jun 26.
Article in English | MEDLINE | ID: mdl-17594473

ABSTRACT

BACKGROUND: Melanoma is the major cause of skin cancer deaths and melanoma incidence doubles every 10 to 20 years. However, little is known about melanoma pathway aberrations. Here we applied the robust Gene Identification Signature Paired End diTag (GIS-PET) approach to investigate the melanoma transcriptome and characterize the global pathway aberrations. METHODS: GIS-PET technology directly links 5' mRNA signatures with their corresponding 3' signatures to generate, and then concatenate, PETs for efficient sequencing. We annotated PETs to pathways of the KEGG database and compared the murine B16F1 melanoma transcriptome with three non-melanoma murine transcriptomes (Melan-a2 melanocytes, E14 embryonic stem cells, and E17.5 embryo). Gene expression levels as represented by PET counts were compared across melanoma and melanocyte libraries to identify the most significantly altered pathways and investigate the expression levels of crucial cancer genes. RESULTS: Melanin biosynthesis genes were solely expressed in the cells of melanocytic origin, indicating the feasibility of using the PET approach for transcriptome comparison. The most significantly altered pathways were metabolic pathways, including upregulated pathways: purine metabolism, aminophosphonate metabolism, tyrosine metabolism, selenoamino acid metabolism, galactose utilization, nitrobenzene degradation, and bisphenol A degradation; and downregulated pathways: oxidative phosphorylation, ATPase synthesis, TCA cycle, pyruvate metabolism, and glutathione metabolism. The downregulated pathways concurrently indicated a slowdown of mitochondrial activities. Mitochondrial permeability was also significantly altered, as indicated by transcriptional activation of ATP/ADP, citrate/malate, Mg++, fatty acid and amino acid transporters, and transcriptional repression of zinc and metal ion transporters. Upregulation of cell cycle progression, MAPK, and PI3K/Akt pathways was more limited to certain region(s) of the pathway. Expression levels of c-Myc and Trp53 were also higher in melanoma. Moreover, transcriptional variants resulting from alternative transcription start sites or alternative polyadenylation sites were found in Ras and genes encoding adhesion or cytoskeleton proteins such as integrin, beta-catenin, alpha-catenin, and actin. CONCLUSION: The highly correlated results unmistakably point to a systematic downregulation of mitochondrial activities, which we hypothesize aims to downgrade the mitochondria-mediated apoptosis and the dependency of cancer cells on angiogenesis. Our results also demonstrate the advantage of using the PET approach in conjunction with the KEGG database for systematic pathway analysis.


Subject(s)
Gene Expression Profiling/methods , Melanins/biosynthesis , Melanoma/genetics , Oligonucleotide Array Sequence Analysis/methods , Skin Neoplasms/genetics , Animals , Apoptosis/genetics , Biosynthetic Pathways/genetics , Disease Models, Animal , Gene Expression Regulation, Neoplastic/genetics , Melanins/genetics , Melanoma/pathology , Mice , RNA, Messenger/genetics , Regulatory Elements, Transcriptional/genetics , Sensitivity and Specificity , Skin Neoplasms/pathology , Tumor Cells, Cultured
20.
BMC Bioinformatics ; 7: 390, 2006 Aug 25.
Article in English | MEDLINE | ID: mdl-16934139

ABSTRACT

BACKGROUND: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads, and correctly yet efficiently map PETs to reference genome sequences. To accommodate and streamline data analysis of the large volume of PET sequences generated from each PET experiment, an automated PET data processing pipeline is desirable. RESULTS: We designed an integrated computation program package, PET-Tool, to automatically process PET sequences and map them to the genome sequences. The Tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the Project Manager module for data organization. The performance of PET-Tool was evaluated through the analyses of 2.7 million PET sequences. It was demonstrated that PET-Tool is accurate and efficient in extracting PET sequences and removing artifacts from large-volume datasets. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. With a 2.4 GHz LINUX machine, it takes approximately six hours to process one million PETs from extraction to mapping. CONCLUSION: The speed, accuracy, and comprehensiveness have proved that PET-Tool is an important and useful component in PET experiments, and can be extended to accommodate other related analyses of paired-end sequences. The Tool also provides user-friendly functions for data quality checks and a system for multi-layer data management.


Subject(s)
Genome/genetics , Sequence Analysis, DNA/methods , Software , Transcription, Genetic/genetics , Animals , Base Sequence , Computational Biology/methods , Databases, Nucleic Acid , Humans , Molecular Sequence Data