Results 1 - 12 of 12
1.
Nat Methods ; 7(11): 909-12, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20935650

ABSTRACT

We describe Trans-ABySS, a de novo short-read transcriptome assembly and analysis pipeline that addresses variation in local read densities by assembling read substrings with varying stringencies and then merging the resulting contigs before analysis. Analyzing 7.4 gigabases of 50-base-pair paired-end Illumina reads from an adult mouse liver poly(A) RNA library, we identified known, new and alternative structures in expressed transcripts, and achieved high sensitivity and specificity relative to reference-based assembly methods.


Subject(s)
Computational Biology/methods , Gene Expression Profiling , Sequence Analysis, DNA/methods , Animals , Mice
2.
Methods Mol Biol ; 650: 173-99, 2010.
Article in English | MEDLINE | ID: mdl-20686952

ABSTRACT

MicroRNAs are key regulators of gene expression in diverse biological processes and their importance in embryonic stem cells is indisputable. New 'next-generation' technologies such as Illumina massively parallel sequencing offer vast improvements, in both scale and sensitivity, to microRNA profiling studies. We describe a detailed procedure for the preparation of small RNA libraries for Illumina sequencing. We further comment on approaches for analyzing the resultant sequence data for measuring microRNA abundance.


Subject(s)
MicroRNAs/genetics , Sequence Analysis, RNA/methods , Embryonic Stem Cells/metabolism , Humans
3.
Curr Protoc Hum Genet ; Chapter 11: Unit 11.11.1-36, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20373513

ABSTRACT

This unit provides a protocol for performing digital gene expression profiling on the Illumina Genome Analyzer sequencing platform. Tag sequencing (Tag-seq) is an implementation of the LongSAGE protocol on the Illumina sequencing platform that increases utility while reducing both the cost and time required to generate gene expression profiles. The ultra-high-throughput sequencing capability of the Illumina platform allows the cost-effective generation of libraries containing an average of 20 million tags, a 200-fold improvement over classical LongSAGE. Tag-seq has less sequence composition bias, leading to a better representation of AT-rich tag sequences, and allows a more accurate profiling of a subset of the transcriptome characterized by AT-rich genes expressed at levels below the threshold of detection of LongSAGE (Morrissy et al., 2009).


Subject(s)
Expressed Sequence Tags , Gene Expression Profiling/methods , Gene Library , Genomics/methods , RNA, Messenger/genetics , Sequence Analysis, DNA/methods , Polymerase Chain Reaction/methods
4.
Genome Res ; 18(11): 1787-97, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18849523

ABSTRACT

MicroRNAs (miRNAs) have been shown to play important roles in physiological as well as multiple malignant processes, including acute myeloid leukemia (AML). In an effort to gain further insight into the role of miRNAs in AML, we have applied the Illumina massively parallel sequencing platform to carry out an in-depth analysis of the miRNA transcriptome in a murine leukemia progression model. This model simulates the stepwise conversion of a myeloid progenitor cell by an engineered overexpression of the nucleoporin 98 (NUP98)-homeobox HOXD13 fusion gene (ND13), to aggressive AML-inducing cells upon transduction with the oncogenic collaborator Meis1. From this data set, we identified 307 miRNA/miRNA* species in the ND13 cells and 306 miRNA/miRNA* species in ND13+Meis1 cells, corresponding to 223 and 219 miRNA genes. Sequence counts varied between two and 136,558, indicating a remarkable expression range between the detected miRNA species. The large number of miRNAs expressed and the nature of differential expression suggest that leukemic progression as modeled here is dictated by the repertoire of shared, but differentially expressed miRNAs. Our finding of extensive sequence variations (isomiRs) for almost all miRNA and miRNA* species adds additional complexity to the miRNA transcriptome. A stringent target prediction analysis coupled with in vitro target validation revealed the potential for miRNA-mediated release of oncogenes that facilitates leukemic progression from the preleukemic to leukemia-inducing state. Finally, 55 novel miRNA species were identified in our data set, adding further complexity to the emerging world of small RNAs.


Subject(s)
Gene Expression Profiling , Leukemia, Experimental/genetics , MicroRNAs/genetics , RNA, Neoplasm/genetics , Animals , Base Sequence , Cell Line, Tumor , Genetic Engineering , Genetic Variation , Homeodomain Proteins/genetics , Leukemia, Experimental/etiology , Leukemia, Myeloid, Acute/etiology , Leukemia, Myeloid, Acute/genetics , Mice , Models, Genetic , Myeloid Ecotropic Viral Integration Site 1 Protein , Neoplasm Proteins/genetics , Nuclear Pore Complex Proteins/genetics , Oncogene Proteins, Fusion/genetics , Transcription Factors/genetics
5.
Genome Res ; 18(4): 610-21, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18285502

ABSTRACT

MicroRNAs (miRNAs) are emerging as important, albeit poorly characterized, regulators of biological processes. Key to further elucidation of their roles is the generation of more complete lists of their numbers and expression changes in different cell states. Here, we report a new method for surveying the expression of small RNAs, including microRNAs, using Illumina sequencing technology. We also present a set of methods for annotating sequences deriving from known miRNAs, identifying variability in mature miRNA sequences, and identifying sequences belonging to previously unidentified miRNA genes. Application of this approach to RNA from human embryonic stem cells obtained before and after their differentiation into embryoid bodies revealed the sequences and expression levels of 334 known plus 104 novel miRNA genes. One hundred seventy-one known and 23 novel microRNA sequences exhibited significant expression differences between these two developmental states. Owing to the increased number of sequence reads, these libraries represent the deepest miRNA sampling to date, spanning nearly six orders of magnitude of expression. The predicted targets of those miRNAs enriched in either sample shared common features. Included among the high-ranked predicted gene targets are those implicated in differentiation, cell cycle control, programmed cell death, and transcriptional regulation.


Subject(s)
Embryonic Stem Cells/metabolism , Gene Expression Profiling , MicroRNAs/chemistry , MicroRNAs/metabolism , Sequence Analysis, RNA/methods , Base Sequence , Gene Expression Regulation , Humans , MicroRNAs/genetics , Molecular Sequence Data , RNA Processing, Post-Transcriptional
6.
Genome Biol ; 8(6): R113, 2007.
Article in English | MEDLINE | ID: mdl-17570852

ABSTRACT

To facilitate discovery of novel human embryonic stem cell (ESC) transcripts, we generated 2.5 million LongSAGE tags from 9 human ESC lines. Analysis of these data revealed that ESCs express proportionately more RNA binding proteins compared with terminally differentiated cells, and identified novel ESC transcripts, at least one of which may represent a marker of the pluripotent state.


Subject(s)
Embryonic Stem Cells/metabolism , Gene Expression Profiling , Pluripotent Stem Cells/metabolism , Base Sequence , Cell Line , Humans , RNA-Binding Proteins/genetics , Sequence Alignment
7.
Plant J ; 50(6): 1063-78, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17488239

ABSTRACT

As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 +/- 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.


Subject(s)
Genome, Plant , Physical Chromosome Mapping , Populus/genetics , Chromosomes, Artificial, Bacterial , Haplotypes , Minisatellite Repeats , Polymorphism, Genetic , Sequence Alignment , Sequence Analysis, DNA
8.
Genome Res ; 17(1): 108-16, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17135571

ABSTRACT

We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of approximately 5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling.


Subject(s)
Gene Expression Profiling/methods , Gene Library , Animals , Caenorhabditis elegans/genetics , Cell Line , Cell Separation , Databases, Nucleic Acid , Embryonic Stem Cells/chemistry , Flow Cytometry , Genome , Humans , Mice , Microdissection , Sequence Analysis, DNA , Software , Zebrafish/genetics
9.
Proc Natl Acad Sci U S A ; 102(51): 18485-90, 2005 Dec 20.
Article in English | MEDLINE | ID: mdl-16352711

ABSTRACT

We analyzed 8.55 million LongSAGE tags generated from 72 libraries. Each LongSAGE library was prepared from a different mouse tissue. Analysis of the data revealed extensive overlap with existing gene data sets and evidence for the existence of approximately 24,000 previously undescribed genomic loci. The visual cortex, pancreas, mammary gland, preimplantation embryo, and placenta contain the largest number of differentially expressed transcripts, 25% of which are previously undescribed loci.


Subject(s)
Gene Expression Profiling , Gene Expression Regulation, Developmental/genetics , Mice, Inbred C57BL/genetics , Mice/genetics , Alternative Splicing/genetics , Animals , Multigene Family/genetics , RNA, Untranslated/genetics , Reproducibility of Results , Transcription, Genetic/genetics
10.
Nucleic Acids Res ; 32(12): 3651-60, 2004.
Article in English | MEDLINE | ID: mdl-15247347

ABSTRACT

Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32 855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map, 99% of the current assembled sequence and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.


Subject(s)
Chromosomes, Artificial, Bacterial , Genome, Human , Base Sequence , Cloning, Molecular , Humans , Physical Chromosome Mapping
11.
Genome Res ; 14(4): 766-79, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15060021

ABSTRACT

As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome.


Subject(s)
Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Artificial, Yeast/genetics , Genome , Physical Chromosome Mapping/methods , Animals , Automation , Chromosomes/genetics , Cloning, Molecular/methods , Computational Biology/methods , Computational Biology/standards , Contig Mapping/methods , Contig Mapping/standards , DNA Fingerprinting/methods , DNA Fingerprinting/standards , Genetic Markers/genetics , Physical Chromosome Mapping/standards , Polymerase Chain Reaction/methods , Rats , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards
12.
Nucleic Acids Res ; 30(11): 2460-8, 2002 Jun 01.
Article in English | MEDLINE | ID: mdl-12034834

ABSTRACT

We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. The method is amenable to large-scale projects; we have used it to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of the Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites.


Subject(s)
Bacteriophage mu/genetics , DNA Transposable Elements/genetics , DNA, Complementary/genetics , Mutagenesis, Insertional/genetics , Recombination, Genetic/genetics , Sequence Analysis, DNA/methods , Base Composition , Cloning, Molecular , DNA Primers/genetics , Gene Library , Genetic Vectors/genetics , Monte Carlo Method , Physical Chromosome Mapping/methods , Sensitivity and Specificity , Sequence Analysis, DNA/economics , Substrate Specificity , Time Factors