Results 1 - 20 of 23
1.
Diagn Microbiol Infect Dis ; 101(3): 115508, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34391075

ABSTRACT

We introduce a target capture next-generation sequencing methodology, the ONETest Coronaviruses Plus, to sequence the SARS-CoV-2 genome and select loci of other respiratory viruses. We applied the ONETest on 70 respiratory samples (collected in Florida, USA between May and July, 2020), in which SARS-CoV-2 had been detected by a PCR assay. For 48 of the samples, we also applied the ARTIC protocol. Of the 70 ONETest libraries, 45 (64%) had a (near-)complete sequence (>29,000 bases and >90% covered by >9 reads). Of the 48 ARTIC libraries, 25 (52%) had a (near-)complete sequence. In 19 out of 25 (76%) samples in which both the ONETest and ARTIC yielded (near-)complete sequences, the lineages assigned were identical. As a target capture approach, the ONETest is less prone to loss of sequence coverage than amplicon approaches, and thus can provide complete genomic information more often to track and monitor SARS-CoV-2 variants.
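The completeness criterion quoted in the abstract (>29,000 bases and >90% covered by >9 reads) can be sketched as a small check. This is only an illustration under stated assumptions: the function names, the per-base depth list as input, and the reading of ">90% covered by >9 reads" as "more than 90% of positions have a depth of at least 10" are hypothetical and not taken from the study.

```python
def fraction_covered(depths, min_depth=10):
    """Fraction of positions whose read depth is at least min_depth (i.e. >9 reads)."""
    return sum(d >= min_depth for d in depths) / len(depths)


def is_near_complete(depths, min_length=29_000, min_fraction=0.90, min_depth=10):
    """Assumed reading of the criterion: sequence longer than 29,000 bases
    and more than 90% of its positions covered by more than 9 reads."""
    return len(depths) > min_length and fraction_covered(depths, min_depth) > min_fraction


def percent(part, whole):
    """Whole-number percentage, as reported in the abstract."""
    return round(100 * part / whole)


# Toy depth profiles (hypothetical data, not from the study).
well_covered = [20] * 28_000 + [0] * 1_500
patchy = [20] * 25_000 + [5] * 4_500

print(is_near_complete(well_covered))  # True
print(is_near_complete(patchy))        # False

# The proportions reported in the abstract.
print(percent(45, 70), percent(25, 48), percent(19, 25))  # 64 52 76
```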


Subject(s)
COVID-19/diagnosis , COVID-19/virology , Genome, Viral , Genomics/methods , SARS-CoV-2/genetics , Adult , Female , Humans , Male , Middle Aged , Polymerase Chain Reaction/methods , Retrospective Studies
2.
J Pathol ; 249(2): 173-181, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31187483

ABSTRACT

The advent of next generation sequencing has vastly improved the resolution of mutation detection, thereby both increasing the resolution of the analysis of cancer tissues and shining light on the existence of somatic driver mutations in normal tissues, even in the absence of cancer. Studies have described somatic driver mutations in normal skin, blood, peritoneal washings, and esophageal epithelium. Such findings prompt speculation on whether such mutations exist in other tissues, such as the eutopic endometrium in particular, due to the highly regenerative nature of the endometrium and the recent observation of recurrent somatic driver mutations in deep infiltrating and iatrogenic endometriosis (tissues believed to be derived from the eutopic endometrium) by our group and others. In the current study we investigated the presence of somatic driver mutations in histologically normal endometrium from women lacking evidence of gynecologic malignancy or endometrial hyperplasia. Twenty-five women who underwent hysterectomies and 85 women who underwent endometrial biopsies were included in this study. Formalin-fixed, paraffin-embedded tissue specimens were analyzed by means of targeted sequencing followed by orthogonal validation with droplet digital PCR. PTEN and ARID1A immunohistochemistry (IHC) was also performed as surrogates for inactivating mutations in the respective genes. Overall, we observed somatic driver-like events in over 50% of normal endometrial samples analyzed, including hotspot mutations in KRAS, PIK3CA, and FGFR2 as well as PTEN-loss by IHC. Analysis of anterior and posterior samplings collected from women who underwent hysterectomies was consistent with the presence of somatic driver mutations within clonal pockets spread throughout the uterus. The prevalence of such oncogenic mutations also increased with age (OR: 1.05 [95% CI: 1.00-1.10], p = 0.035). 
These findings have implications on our understanding of aging and so-called 'normal tissues', thereby necessitating caution in the utilization of mutation-based early detection tools for endometrial or other cancers. © 2019 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.


Subject(s)
Aging/genetics , DNA Mutational Analysis , Early Detection of Cancer/methods , Endometrial Neoplasms/genetics , Endometrium/metabolism , Mutation , Oncogenes , Adult , Age Factors , Aging/metabolism , Endometrial Neoplasms/metabolism , Endometrial Neoplasms/pathology , Female , Healthy Volunteers , Humans , Middle Aged , Mutation Rate , Predictive Value of Tests , Reproducibility of Results , Young Adult
3.
Nat Genet ; 48(7): 758-67, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27182968

ABSTRACT

We performed phylogenetic analysis of high-grade serous ovarian cancers (68 samples from seven patients), identifying constituent clones and quantifying their relative abundances at multiple intraperitoneal sites. Through whole-genome and single-nucleus sequencing, we identified evolutionary features including mutation loss, convergence of the structural genome and temporal activation of mutational processes that patterned clonal progression. We then determined the precise clonal mixtures comprising each tumor sample. The majority of sites were clonally pure or composed of clones from a single phylogenetic clade. However, each patient contained at least one site composed of polyphyletic clones. Five patients exhibited monoclonal and unidirectional seeding from the ovary to intraperitoneal sites, and two patients demonstrated polyclonal spread and reseeding. Our findings indicate that at least two distinct modes of intraperitoneal spread operate in clonal dissemination and highlight the distribution of migratory potential over clonal populations comprising high-grade serous ovarian cancers.


Subject(s)
Biomarkers, Tumor/genetics , Clone Cells/pathology , Cystadenocarcinoma, Serous/pathology , Genetic Variation/genetics , Ovarian Neoplasms/pathology , Peritoneal Neoplasms/pathology , Tumor Microenvironment/genetics , Aged , Clone Cells/metabolism , Cystadenocarcinoma, Serous/genetics , Disease Progression , Fallopian Tube Neoplasms/genetics , Fallopian Tube Neoplasms/pathology , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , Middle Aged , Mutation/genetics , Neoplasm Grading , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/pathology , Ovarian Neoplasms/genetics , Peritoneal Neoplasms/genetics , Phylogeny , Single-Cell Analysis/methods , Survival Rate
4.
Nature ; 518(7539): 422-6, 2015 Feb 19.
Article in English | MEDLINE | ID: mdl-25470049

ABSTRACT

Human cancers, including breast cancers, comprise clones differing in mutation content. Clones evolve dynamically in space and time following principles of Darwinian evolution, underpinning important emergent features such as drug resistance and metastasis. Human breast cancer xenoengraftment is used as a means of capturing and studying tumour biology, and breast tumour xenografts are generally assumed to be reasonable models of the originating tumours. However, the consequences and reproducibility of engraftment and propagation on the genomic clonal architecture of tumours have not been systematically examined at single-cell resolution. Here we show, using deep-genome and single-cell sequencing methods, the clonal dynamics of initial engraftment and subsequent serial propagation of primary and metastatic human breast cancers in immunodeficient mice. In all 15 cases examined, clonal selection on engraftment was observed in both primary and metastatic breast tumours, varying in degree from extreme selective engraftment of minor (<5% of starting population) clones to moderate, polyclonal engraftment. Furthermore, ongoing clonal dynamics during serial passaging is a feature of tumours experiencing modest initial selection. Through single-cell sequencing, we show that major mutation clusters estimated from tumour population sequencing relate predictably to the most abundant clonal genotypes, even in clonally complex and rapidly evolving cases. Finally, we show that similar clonal expansion patterns can emerge in independent grafts of the same starting tumour population, indicating that genomic aberrations can be reproducible determinants of evolutionary trajectories. Our results show that measurement of genomically defined clonal population dynamics will be highly informative for functional studies using patient-derived breast cancer xenoengraftment.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Clone Cells/metabolism , Clone Cells/pathology , Genome, Human/genetics , Single-Cell Analysis , Xenograft Model Antitumor Assays , Animals , Breast Neoplasms/secondary , DNA Mutational Analysis , Genomics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Mice , Neoplasm Transplantation , Time Factors , Transplantation, Heterologous , Xenograft Model Antitumor Assays/methods
5.
Genome Res ; 24(11): 1881-93, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25060187

ABSTRACT

The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN.


Subject(s)
Algorithms , Computational Biology/methods , DNA Copy Number Variations , Models, Genetic , Neoplasms/genetics , Clone Cells/metabolism , Clone Cells/pathology , Female , Genomics/methods , Genotype , Humans , In Situ Hybridization, Fluorescence/methods , Loss of Heterozygosity , Ovarian Neoplasms/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, DNA/methods , Triple Negative Breast Neoplasms/genetics
6.
Nat Methods ; 11(4): 396-8, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24633410

ABSTRACT

We introduce PyClone, a statistical model for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy-number changes and normal-cell contamination. Single-cell sequencing validation demonstrates PyClone's accuracy.
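To see why copy-number changes and normal-cell contamination have to be accounted for when estimating cellular prevalence, consider a deliberately simplified relationship between cellular prevalence and the expected variant allele fraction. This is not PyClone's model. It is a sketch that assumes every tumour cell carries the same total copy number at the locus, that normal cells are diploid, and that each mutated cell carries a fixed number of mutant copies. The function and parameter names are hypothetical.

```python
def expected_vaf(cellular_prevalence, tumour_content,
                 mutant_copies=1, tumour_copy_number=2, normal_copy_number=2):
    """Expected variant allele fraction under a simplified mixture.

    cellular_prevalence: fraction of tumour cells carrying the mutation
    tumour_content:      fraction of all sampled cells that are tumour cells
    """
    # Alleles carrying the variant, per sampled cell on average.
    variant_alleles = tumour_content * cellular_prevalence * mutant_copies
    # All alleles at the locus, per sampled cell on average.
    total_alleles = (tumour_content * tumour_copy_number
                     + (1 - tumour_content) * normal_copy_number)
    return variant_alleles / total_alleles


# A clonal heterozygous mutation in a pure diploid tumour.
print(expected_vaf(1.0, 1.0))
# The same mutation in half of the tumour cells, with 20% normal contamination.
print(expected_vaf(0.5, 0.8))
# A clonal single-copy mutation in a region amplified to four copies.
print(expected_vaf(1.0, 0.8, mutant_copies=1, tumour_copy_number=4))
```

Under these assumptions the same cellular prevalence can produce quite different allele fractions, which is why raw allele fractions alone are a poor proxy for clonal structure.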


Subject(s)
Bayes Theorem , Cluster Analysis , Models, Biological , Models, Statistical , Neoplasms/metabolism , Algorithms , Alleles , Animals , DNA Mutational Analysis/methods , Gene Expression Regulation, Neoplastic , Humans , Mutation , Reproducibility of Results , Software
7.
Nature ; 486(7403): 395-9, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22495314

ABSTRACT

Primary triple-negative breast cancers (TNBCs), a tumour type defined by lack of oestrogen receptor, progesterone receptor and ERBB2 gene amplification, represent approximately 16% of all breast cancers. Here we show in 104 TNBC cases that at the time of diagnosis these cancers exhibit a wide and continuous spectrum of genomic evolution, with some having only a handful of coding somatic aberrations in a few pathways, whereas others contain hundreds of coding somatic mutations. High-throughput RNA sequencing (RNA-seq) revealed that only approximately 36% of mutations are expressed. Using deep re-sequencing measurements of allelic abundance for 2,414 somatic mutations, we determine for the first time, to our knowledge, in an epithelial tumour subtype, the relative abundance of clonal frequencies among cases representative of the population. We show that TNBCs vary widely in their clonal frequencies at the time of diagnosis, with the basal subtype of TNBC showing more variation than non-basal TNBC. Although p53 (also known as TP53), PIK3CA and PTEN somatic mutations seem to be clonally dominant compared to other genes, in some tumours their clonal frequencies are incompatible with founder status. Mutations in cytoskeletal, cell shape and motility proteins occurred at lower clonal frequencies, suggesting that they occurred later during tumour progression. Taken together, our results show that understanding the biology and therapeutic responses of patients with TNBC will require the determination of individual tumour clonal genotypes.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Evolution, Molecular , Mutation/genetics , Alleles , Breast Neoplasms/diagnosis , Clone Cells/metabolism , Clone Cells/pathology , DNA Copy Number Variations/genetics , DNA Mutational Analysis , Disease Progression , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic/genetics , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation/genetics , Point Mutation/genetics , Precision Medicine , Reproducibility of Results , Sequence Analysis, RNA
8.
Nature ; 461(7265): 809-13, 2009 Oct 08.
Article in English | MEDLINE | ID: mdl-19812674

ABSTRACT

Recent advances in next generation sequencing have made it possible to precisely characterize all somatic coding mutations that occur during the development and progression of individual cancers. Here we used these approaches to sequence the genomes (>43-fold coverage) and transcriptomes of an oestrogen-receptor-alpha-positive metastatic lobular breast cancer at depth. We found 32 somatic non-synonymous coding mutations present in the metastasis, and measured the frequency of these somatic mutations in DNA from the primary tumour of the same patient, which arose 9 years earlier. Five of the 32 mutations (in ABCB11, HAUS3, SLC24A4, SNX4 and PALB2) were prevalent in the DNA of the primary tumour removed at diagnosis 9 years earlier, six (in KIF1C, USP28, MYH8, MORC1, KIAA1468 and RNASEH2A) were present at lower frequencies (1-13%), 19 were not detected in the primary tumour, and two were undetermined. The combined analysis of genome and transcriptome data revealed two new RNA-editing events that recode the amino acid sequence of SRP9 and COG3. Taken together, our data show that single nucleotide mutational heterogeneity can be a property of low or intermediate grade primary breast cancers and that significant evolution can occur with disease progression.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Genes, Neoplasm/genetics , Mutagenesis/genetics , Mutation/genetics , Nucleotides/genetics , Adaptor Proteins, Vesicular Transport/genetics , Breast Neoplasms/metabolism , DNA Mutational Analysis , Disease Progression , Estrogen Receptor alpha/metabolism , Evolution, Molecular , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genome, Human/genetics , Germ-Line Mutation/genetics , Humans , Neoplasm Metastasis , RNA Editing/genetics , Signal Recognition Particle/genetics , Time Factors
9.
PLoS Genet ; 5(6): e1000537, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19557190

ABSTRACT

A crucial step in the development of muscle cells in all metazoan animals is the assembly and anchorage of the sarcomere, the essential repeat unit responsible for muscle contraction. In Caenorhabditis elegans, many of the critical proteins involved in this process have been uncovered through mutational screens focusing on uncoordinated movement and embryonic arrest phenotypes. We propose that additional sarcomeric proteins exist for which there is a less severe, or entirely different, mutant phenotype produced in their absence. We have used Serial Analysis of Gene Expression (SAGE) to generate a comprehensive profile of late embryonic muscle gene expression. We generated two replicate long SAGE libraries for sorted embryonic muscle cells, identifying 7,974 protein-coding genes. A refined list of 3,577 genes expressed in muscle cells was compiled from the overlap between our SAGE data and available microarray data. Using the genes in our refined list, we have performed two separate RNA interference (RNAi) screens to identify novel genes that play a role in sarcomere assembly and/or maintenance in either embryonic or adult muscle. To identify muscle defects in embryos, we screened specifically for the Pat embryonic arrest phenotype. To visualize muscle defects in adult animals, we fed dsRNA to worms producing a GFP-tagged myosin protein, thus allowing us to analyze their myofilament organization under gene knockdown conditions using fluorescence microscopy. By eliminating or severely reducing the expression of 3,300 genes using RNAi, we identified 122 genes necessary for proper myofilament organization, 108 of which are genes without a previously characterized role in muscle. Many of the genes affecting sarcomere integrity have human homologs for which little or nothing is known.


Subject(s)
Actin Cytoskeleton/chemistry , Caenorhabditis elegans/genetics , Gene Expression Profiling/methods , Muscle Development , Actin Cytoskeleton/genetics , Actin Cytoskeleton/metabolism , Animals , Caenorhabditis elegans/chemistry , Caenorhabditis elegans/embryology , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Gene Expression Regulation, Developmental , Muscles/chemistry , Muscles/embryology , Muscles/metabolism , Sarcomeres/genetics , Sarcomeres/metabolism
10.
N Engl J Med ; 360(26): 2719-29, 2009 Jun 25.
Article in English | MEDLINE | ID: mdl-19516027

ABSTRACT

BACKGROUND: Granulosa-cell tumors (GCTs) are the most common type of malignant ovarian sex cord-stromal tumor (SCST). The pathogenesis of these tumors is unknown. Moreover, their histopathological diagnosis can be challenging, and there is no curative treatment beyond surgery. METHODS: We analyzed four adult-type GCTs using whole-transcriptome paired-end RNA sequencing. We identified putative GCT-specific mutations that were present in at least three of these samples but were absent from the transcriptomes of 11 epithelial ovarian tumors, published human genomes, and databases of single-nucleotide polymorphisms. We confirmed these variants by direct sequencing of complementary DNA and genomic DNA. We then analyzed additional tumors and matched normal genomic DNA, using a combination of direct sequencing, analyses of restriction-fragment-length polymorphisms, and TaqMan assays. RESULTS: All four index GCTs had a missense point mutation, 402C→G (C134W), in FOXL2, a gene encoding a transcription factor known to be critical for granulosa-cell development. The FOXL2 mutation was present in 86 of 89 additional adult-type GCTs (97%), in 3 of 14 thecomas (21%), and in 1 of 10 juvenile-type GCTs (10%). The mutation was absent in 49 SCSTs of other types and in 329 unrelated ovarian or breast tumors. CONCLUSIONS: Whole-transcriptome sequencing of four GCTs identified a single, recurrent somatic mutation (402C→G) in FOXL2 that was present in almost all morphologically identified adult-type GCTs. Mutant FOXL2 is a potential driver in the pathogenesis of adult-type GCTs.
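The link between the C-to-G change at coding position 402 and the amino-acid change C134W can be checked with a little codon arithmetic. The sketch below assumes standard coding-sequence numbering (position 1 is the first base of the start codon) and infers the reference codon as TGC, since cysteine is encoded only by TGT or TGC and the reference base at position 402 is C. The helper names are hypothetical.

```python
def codon_position(nucleotide_position):
    """Return (codon number, position within codon), both 1-based."""
    codon_number = (nucleotide_position - 1) // 3 + 1
    offset = (nucleotide_position - 1) % 3 + 1
    return codon_number, offset


def substitute(codon, offset, new_base):
    """Replace the base at a 1-based offset within a codon."""
    return codon[:offset - 1] + new_base + codon[offset:]


# Only the codons needed for this example (standard genetic code).
CODON_TO_AMINO_ACID = {"TGT": "C", "TGC": "C", "TGG": "W"}

codon_number, offset = codon_position(402)
reference_codon = "TGC"  # inferred, see the assumptions above
mutant_codon = substitute(reference_codon, offset, "G")

print(codon_number, offset)  # 134 3
print(CODON_TO_AMINO_ACID[reference_codon], "->", CODON_TO_AMINO_ACID[mutant_codon])  # C -> W
```

Position 402 is the third base of codon 134, and changing that base from C to G turns a cysteine codon into the tryptophan codon TGG, consistent with the C134W annotation.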


Subject(s)
Forkhead Transcription Factors/genetics , Granulosa Cell Tumor/genetics , Mutation, Missense , Ovarian Neoplasms/genetics , Base Sequence , Female , Forkhead Box Protein L2 , Gene Expression Profiling , Genetic Markers , Genotype , Granulosa Cell Tumor/diagnosis , Granulosa Cell Tumor/pathology , Humans , Immunohistochemistry , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/pathology , Point Mutation , Sequence Analysis, RNA , Taq Polymerase
11.
Dev Biol ; 327(2): 551-65, 2009 Mar 15.
Article in English | MEDLINE | ID: mdl-19111532

ABSTRACT

Starting with SAGE-libraries prepared from C. elegans FAC-sorted embryonic intestine cells (8E-16E cell stage), from total embryos and from purified oocytes, and taking advantage of the NextDB in situ hybridization data base, we define sets of genes highly expressed from the zygotic genome, and expressed either exclusively or preferentially in the embryonic intestine or in the intestine of newly hatched larvae; we had previously defined a similarly expressed set of genes from the adult intestine. We show that an extended TGATAA-like sequence is essentially the only candidate for a cis-acting regulatory motif common to intestine genes expressed at all stages. This sequence is a strong ELT-2 binding site and matches the sequence of GATA-like sites found to be important for the expression of every intestinal gene so far analyzed experimentally. We show that the majority of these three sets of highly expressed intestinal-specific/intestinal-enriched genes respond strongly to ectopic expression of ELT-2 within the embryo. By flow-sorting elt-2(null) larvae from elt-2(+) larvae and then preparing Solexa/Illumina-SAGE libraries, we show that the majority of these genes also respond strongly to loss-of-function of ELT-2. To test the consequences of loss of other transcription factors identified in the embryonic intestine, we develop a strain of worms that is RNAi-sensitive only in the intestine; however, we are unable (with one possible exception) to identify any other transcription factor whose intestinal loss-of-function causes a phenotype of comparable severity to the phenotype caused by loss of ELT-2. Overall, our results support a model in which ELT-2 is the predominant transcription factor in the post-specification C. elegans intestine and participates directly in the transcriptional regulation of the majority (>80%) of intestinal genes. We present evidence that ELT-2 plays a central role in most aspects of C. elegans intestinal physiology: establishing the structure of the enterocyte, regulating enzymes and transporters involved in digestion and nutrition, responding to environmental toxins and pathogenic infections, and regulating the downstream intestinal components of the daf-2/daf-16 pathway influencing aging and longevity.


Subject(s)
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans , GATA Transcription Factors/metabolism , Gene Expression Regulation, Developmental , Intestines/physiology , Animals , Base Sequence , Caenorhabditis elegans/anatomy & histology , Caenorhabditis elegans/embryology , Caenorhabditis elegans/growth & development , Caenorhabditis elegans Proteins/genetics , Computational Biology , GATA Transcription Factors/genetics , Intestines/anatomy & histology , Molecular Sequence Data , Phenotype , Promoter Regions, Genetic , RNA Interference , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Signal Transduction/physiology
12.
Genome Biol ; 8(6): R113, 2007.
Article in English | MEDLINE | ID: mdl-17570852

ABSTRACT

To facilitate discovery of novel human embryonic stem cell (ESC) transcripts, we generated 2.5 million LongSAGE tags from 9 human ESC lines. Analysis of this data revealed that ESCs express proportionately more RNA binding proteins compared with terminally differentiated cells, and identified novel ESC transcripts, at least one of which may represent a marker of the pluripotent state.


Subject(s)
Embryonic Stem Cells/metabolism , Gene Expression Profiling , Pluripotent Stem Cells/metabolism , Base Sequence , Cell Line , Humans , RNA-Binding Proteins/genetics , Sequence Alignment
13.
Stem Cells ; 25(7): 1681-9, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17412892

ABSTRACT

Transcriptome profiling offers a powerful approach to investigating developmental processes. Long serial analysis of gene expression (LongSAGE) is particularly attractive for this purpose because of its inherent quantitative features and independence of both hybridization variables and prior knowledge of transcript identity. Here, we describe the validation and initial application of a modified protocol for amplifying cDNA preparations from <10 ng of RNA (<10^3 cells) to allow representative LongSAGE libraries to be constructed from rare stem cell-enriched populations. Quantitative real-time polymerase chain reaction (Q-RT-PCR) analyses and comparison of tag frequencies in replicate LongSAGE libraries produced from amplified and nonamplified cDNA preparations demonstrated preservation of the relative levels of different transcripts originally present at widely varying levels. This PCR-LongSAGE protocol was then used to obtain a 200,000-tag library from the CD34+ subset of normal adult human bone marrow cells. Analysis of this library revealed many anticipated transcripts, as well as transcripts not previously known to be present in CD34+ hematopoietic cells. The latter included numerous novel tags that mapped to unique and conserved sites in the human genome but not previously identified as transcribed elements in human cells. Q-RT-PCR was used to demonstrate that 10 of these novel tags were expressed in cDNA pools and present in extracts of other sources of normal human CD34+ hematopoietic cells. These findings illustrate the power of LongSAGE to identify new transcripts in stem cell-enriched populations and indicate the potential of this approach to be extended to other sources of rare cells. Disclosure of potential conflicts of interest is found at the end of this article.


Subject(s)
Antigens, CD34/metabolism , Bone Marrow Cells/metabolism , Gene Expression Profiling/methods , Polymerase Chain Reaction/methods , Adult , Cell Separation , DNA, Complementary/genetics , Gene Library , Humans , RNA, Messenger/genetics , Reproducibility of Results
14.
Genome Res ; 17(1): 108-16, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17135571

ABSTRACT

We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of approximately 5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling.


Subject(s)
Gene Expression Profiling/methods , Gene Library , Animals , Caenorhabditis elegans/genetics , Cell Line , Cell Separation , Databases, Nucleic Acid , Embryonic Stem Cells/chemistry , Flow Cytometry , Genome , Humans , Mice , Microdissection , Sequence Analysis, DNA , Software , Zebrafish/genetics
15.
Dev Biol ; 302(2): 627-45, 2007 Feb 15.
Article in English | MEDLINE | ID: mdl-17113066

ABSTRACT

A SAGE library was prepared from hand-dissected intestines from adult Caenorhabditis elegans, allowing the identification of >4000 intestinally-expressed genes; this gene inventory provides fundamental information for understanding intestine function, structure and development. Intestinally-expressed genes fall into two broad classes: widely-expressed "housekeeping" genes and genes that are either intestine-specific or significantly intestine-enriched. Within this latter class of genes, we identified a subset of highly-expressed highly-validated genes that are expressed either exclusively or primarily in the intestine. Over half of the encoded proteins are candidates for secretion into the intestinal lumen to hydrolyze the bacterial food (e.g. lysozymes, amoebapores, lipases and especially proteases). The promoters of this subset of intestine-specific/intestine-enriched genes were analyzed computationally, using both a word-counting method (RSAT oligo-analysis) and a method based on Gibbs sampling (MotifSampler). Both methods returned the same over-represented site, namely an extended GATA-related sequence of the general form AHTGATAARR, which agrees with experimentally determined cis-acting control sequences found in intestine genes over the past 20 years. All promoters in the subset contain such a site, compared to <5% for control promoters; moreover, our analysis suggests that the majority (perhaps all) of genes expressed exclusively or primarily in the worm intestine are likely to contain such a site in their promoters. There are three zinc-finger GATA-type factors that are candidates to bind this extended GATA site in the differentiating C. elegans intestine: ELT-2, ELT-4 and ELT-7. All evidence points to ELT-2 being the most important of the three. We show that worms in which both the elt-4 and the elt-7 genes have been deleted from the genome are essentially wildtype, demonstrating that ELT-2 provides all essential GATA-factor functions in the intestine. 
The SAGE analysis also identifies more than a hundred other transcription factors in the adult intestine but few show an RNAi-induced loss-of-function phenotype and none (other than ELT-2) show a phenotype primarily in the intestine. We thus propose a simple model in which the ELT-2 GATA factor directly participates in the transcription of all intestine-specific/intestine-enriched genes, from the early embryo through to the dying adult. Other intestinal transcription factors would thus modulate the action of ELT-2, depending on the worm's nutritional and physiological needs.
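The consensus AHTGATAARR is written in IUPAC nucleotide codes (H = A, C or T; R = A or G). As an illustration of what scanning a promoter for such a site involves, the sketch below converts an IUPAC motif to a regular expression and reports forward-strand matches. It is not the RSAT or MotifSampler method used in the study; the function names and the toy sequences are hypothetical, and a real scan would normally also search the reverse complement.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif into an equivalent regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(sequence, motif="AHTGATAARR"):
    """Return (0-based start, matched text) for every forward-strand match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


print(iupac_to_regex("AHTGATAARR"))   # A[ACT]TGATAA[AG][AG]
print(find_sites("GGACTGATAAGACC"))   # [(2, 'ACTGATAAGA')]
print(find_sites("GGAGTGATAAGACC"))   # [] because G is not allowed at the H position
```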


Subject(s)
Caenorhabditis elegans Proteins/physiology , Caenorhabditis elegans/genetics , GATA Transcription Factors/physiology , Models, Genetic , Transcription, Genetic , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , GATA Transcription Factors/genetics , Gene Expression Profiling , Intestinal Mucosa/metabolism , Promoter Regions, Genetic
16.
Genome Biol ; 7(12): R126, 2006.
Article in English | MEDLINE | ID: mdl-17187676

ABSTRACT

BACKGROUND: The recent availability of genome sequences of multiple related Caenorhabditis species has made it possible to identify, using comparative genomics, similarly transcribed genes in Caenorhabditis elegans and its sister species. Taking this approach, we have identified numerous novel ciliary genes in C. elegans, some of which may be orthologs of unidentified human ciliopathy genes. RESULTS: By screening for genes possessing canonical X-box sequences in promoters of three Caenorhabditis species, namely C. elegans, C. briggsae and C. remanei, we identified 93 genes (including known X-box regulated genes) that encode putative components of ciliated neurons in C. elegans and are subject to the same regulatory control. For many of these genes, restricted anatomical expression in ciliated cells was confirmed, and control of transcription by the ciliogenic DAF-19 RFX transcription factor was demonstrated by comparative transcriptional profiling of different tissue types and of daf-19(+) and daf-19(-) animals. Finally, we demonstrate that the dye-filling defect of dyf-5(mn400) animals, which is indicative of compromised exposure of cilia to the environment, is caused by a nonsense mutation in the serine/threonine protein kinase gene M04C9.5. CONCLUSION: Our comparative genomics-based predictions may be useful for identifying genes involved in human ciliopathies, including Bardet-Biedl Syndrome (BBS), since the C. elegans orthologs of known human BBS genes contain X-box motifs and are required for normal dye filling in C. elegans ciliated neurons.


Subject(s)
Caenorhabditis elegans/genetics , Cilia/metabolism , Genomics , Animals , Animals, Genetically Modified , Base Sequence , DNA Primers , Gene Expression Profiling , Genetic Complementation Test , Green Fluorescent Proteins/genetics , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic
17.
Transgenic Res ; 15(6): 711-27, 2006 Dec.
Article in English | MEDLINE | ID: mdl-16952013

ABSTRACT

Currently, little information is available regarding the molecular organization of integrated transgenes in genetically-engineered fish. We performed a detailed structural analysis of an inserted transgene in one strain (M77) of transgenic coho salmon (Oncorhynchus kisutch) containing a salmon growth hormone gene construct (OnMTGH1). Microinjected DNA was found to have inserted into a single site in the coho salmon genome, and was organized with four complete internal copies and two partial terminal copies of the OnMTGH1 construct. All construct copies were organized in a direct-tandem (head-to-tail) repeat fashion in strain M77 and five additional strains (one also possessed a second recombinant junction fragment). For strain M77, the junctions between the transgene insert and the insertion point within the wild-type genome were cloned from strain-specific cosmid libraries and sequenced, revealing that the transgene insertion was accompanied by a deletion of 587 bp of wild-type DNA as well as a small insertion (19 bp) of unknown DNA upstream and a 14 bp direct-tandem duplication of sequence downstream. Upstream and downstream wild-type DNA sequence contained several repetitive sequence elements based on Southern blot analysis and homology to repetitive sequences in GenBank. In the downstream flank, a pseudogene sequence was also identified which has high homology to the CA membrane protein gene from Schistosoma japonicum, a parasite closely related to Sanguinicola sp. parasites which infect salmonids. Whether the presence of an inserted transgene and the presence of potentially horizontally-transmitted DNA are indicative of a genomic region with a predisposition for insertion of foreign DNA requires further study. The information derived from this transgene structure provides information useful for comparison to other transgenic organisms and for determination of the mechanism of transgene integration in lower vertebrates.


Subject(s)
DNA, Helminth/genetics , Gene Targeting/methods , Gene Transfer, Horizontal , Tandem Repeat Sequences , Transgenes/genetics , Animals , Genetic Engineering , Growth Hormone/genetics , Microinjections , Pseudogenes , Salmon , Schistosoma japonicum/genetics , Sequence Analysis, DNA , Sequence Deletion
18.
Proc Natl Acad Sci U S A ; 102(51): 18485-90, 2005 Dec 20.
Article in English | MEDLINE | ID: mdl-16352711

ABSTRACT

We analyzed 8.55 million LongSAGE tags generated from 72 libraries. Each LongSAGE library was prepared from a different mouse tissue. Analysis of the data revealed extensive overlap with existing gene data sets and evidence for the existence of approximately 24,000 previously undescribed genomic loci. The visual cortex, pancreas, mammary gland, preimplantation embryo, and placenta contain the largest number of differentially expressed transcripts, 25% of which are previously undescribed loci.


Subject(s)
Gene Expression Profiling , Gene Expression Regulation, Developmental/genetics , Mice, Inbred C57BL/genetics , Mice/genetics , Alternative Splicing/genetics , Animals , Multigene Family/genetics , RNA, Untranslated/genetics , Reproducibility of Results , Transcription, Genetic/genetics
19.
Curr Biol ; 15(10): 935-41, 2005 May 24.
Article in English | MEDLINE | ID: mdl-15916950

ABSTRACT

Cilia and flagella play important roles in many physiological processes, including cell and fluid movement, sensory perception, and development. The biogenesis and maintenance of cilia depend on intraflagellar transport (IFT), a motility process that operates bidirectionally along the ciliary axoneme. Disruption in IFT and cilia function causes several human disorders, including polycystic kidneys, retinal dystrophy, neurosensory impairment, and Bardet-Biedl syndrome (BBS). To uncover new ciliary components, including IFT proteins, we compared C. elegans ciliated neuronal and nonciliated cells through serial analysis of gene expression (SAGE) and screened for genes potentially regulated by the ciliogenic transcription factor, DAF-19. Using these complementary approaches, we identified numerous candidate ciliary genes and confirmed the ciliated-cell-specific expression of 14 novel genes. One of these, C27H5.7a, encodes a ciliary protein that undergoes IFT. As with other IFT proteins, its ciliary localization and transport is disrupted by mutations in IFT and bbs genes. Furthermore, we demonstrate that the ciliary structural defect of C. elegans dyf-13(mn396) mutants is caused by a mutation in C27H5.7a. Together, our findings help define a ciliary transcriptome and suggest that DYF-13, an evolutionarily conserved protein, is a novel core IFT component required for cilia function.


Subject(s)
Caenorhabditis elegans/genetics , Cilia/genetics , Gene Expression Profiling , Neurons/metabolism , Animals , Base Sequence , Caenorhabditis elegans Proteins/metabolism , Cilia/metabolism , Computational Biology , Genomics/methods , Green Fluorescent Proteins , Mutation/genetics , Protein Transport/physiology , Sequence Analysis, DNA , Transcription Factors/metabolism
20.
Genome Res ; 15(5): 603-15, 2005 May.
Article in English | MEDLINE | ID: mdl-15837805

ABSTRACT

We have identified longevity-associated genes in a long-lived Caenorhabditis elegans daf-2 (insulin/IGF receptor) mutant using serial analysis of gene expression (SAGE), a method that efficiently quantifies large numbers of mRNA transcripts by sequencing short tags. Reduction of daf-2 signaling in these mutant worms leads to a doubling in mean lifespan. We prepared C. elegans SAGE libraries from 1, 6, and 10-d-old adult daf-2 and from 1 and 6-d-old control adults. Differences in gene expression between daf-2 libraries representing different ages and between daf-2 versus control libraries identified not only single genes, but whole gene families that were differentially regulated. These gene families are part of major metabolic pathways including lipid, protein, and energy metabolism, stress response, and cell structure. Similar expression patterns of closely related family members emphasize the importance of these genes in aging-related processes. Global analysis of metabolism-associated genes showed hypometabolic features in mid-life daf-2 mutants that diminish with advanced age. Comparison of our results to recent microarray studies highlights sets of overlapping genes that are highly conserved throughout evolution and thus represent strong candidate genes that control aging and longevity.


Subject(s)
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Gene Expression Profiling , Gene Expression Regulation , Longevity/genetics , Mutation/genetics , Receptor, Insulin/genetics , Age Factors , Animals , Caenorhabditis elegans/physiology , Energy Metabolism/genetics , Oligonucleotide Array Sequence Analysis , Signal Transduction/genetics