Results 1 - 20 of 23
1.
BMC Microbiol ; 23(1): 299, 2023 10 20.
Article in English | MEDLINE | ID: mdl-37864136

ABSTRACT

The microbiota that colonize the human gut and other tissues are dynamic, varying both in composition and functional state between individuals and over time. Gene expression measurements can provide insights into microbiome composition and function. However, efficient and unbiased removal of microbial ribosomal RNA (rRNA) presents a barrier to acquiring metatranscriptomic data. Here we describe a probe set that achieves efficient enzymatic rRNA removal of complex human-associated microbial communities. We demonstrate that the custom probe set can be further refined through an iterative design process to efficiently deplete rRNA from a range of human microbiome samples. Using synthetic nucleic acid spike-ins, we show that the rRNA depletion process does not introduce substantial quantitative error in gene expression profiles. Successful rRNA depletion allows for efficient characterization of taxonomic and functional profiles, including during the development of the human gut microbiome. The pan-human microbiome enzymatic rRNA depletion probes described here provide a powerful tool for studying the transcriptional dynamics and function of the human microbiome.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , RNA, Ribosomal/genetics , Bacteria/genetics , RNA, Ribosomal, 16S/genetics , Microbiota/genetics , Gastrointestinal Microbiome/genetics
2.
Stem Cell Reports ; 8(4): 907-918, 2017 04 11.
Article in English | MEDLINE | ID: mdl-28343999

ABSTRACT

A defined protocol for efficiently deriving endothelial cells from human pluripotent stem cells was established and vascular morphogenesis was used as a model system to understand how synthetic hydrogels influence global biological function compared with common 2D and 3D culture platforms. RNA sequencing demonstrated that gene expression profiles were similar for endothelial cells and pericytes cocultured in polyethylene glycol (PEG) hydrogels or Matrigel, while monoculture comparisons identified distinct vascular signatures for each cell type. Endothelial cells cultured on tissue-culture polystyrene adopted a proliferative phenotype compared with cells cultured on or encapsulated in PEG hydrogels. The proliferative phenotype correlated to increased FAK-ERK activity, and knockdown or inhibition of ERK signaling reduced proliferation and expression for cell-cycle genes while increasing expression for "3D-like" vasculature development genes. Our results provide insight into the influence of 2D and 3D culture formats on global biological processes that regulate cell function.


Subject(s)
Endothelial Cells/cytology , Pericytes/cytology , Pluripotent Stem Cells/cytology , Tissue Engineering/methods , Transcriptome , Cell Culture Techniques/methods , Cell Cycle , Cell Differentiation , Cell Line , Cell Proliferation , Cells, Cultured , Collagen/chemistry , Drug Combinations , Endothelial Cells/metabolism , Humans , Hydrogels/chemistry , Laminin/chemistry , MAP Kinase Signaling System , Neovascularization, Physiologic , Pericytes/metabolism , Pluripotent Stem Cells/metabolism , Polyethylene Glycols/chemistry , Polystyrenes/chemistry , Proteoglycans/chemistry , Tissue Scaffolds/chemistry
3.
Nat Commun ; 7: 11306, 2016 06 27.
Article in English | MEDLINE | ID: mdl-27346250

ABSTRACT

The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future.


Subject(s)
Computational Biology/methods , DNA Methylation , Genome, Human/genetics , Whole Genome Sequencing/methods , Algorithms , CpG Islands/genetics , Genotype , Haplotypes , Humans , Reproducibility of Results , Sulfites/chemistry
5.
Bioinformatics ; 29(8): 1035-43, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23428641

ABSTRACT

MOTIVATION: Messenger RNA expression is important in normal development and differentiation, as well as in manifestation of disease. RNA-seq experiments allow for the identification of differentially expressed (DE) genes and their corresponding isoforms on a genome-wide scale. However, statistical methods are required to ensure that accurate identifications are made. A number of methods exist for identifying DE genes, but far fewer are available for identifying DE isoforms. When isoform DE is of interest, investigators often apply gene-level (count-based) methods directly to estimates of isoform counts. Doing so is not recommended. In short, estimating isoform expression is relatively straightforward for some groups of isoforms, but more challenging for others. This results in estimation uncertainty that varies across isoform groups. Count-based methods were not designed to accommodate this varying uncertainty, and consequently, application of them for isoform inference results in reduced power for some classes of isoforms and increased false discoveries for others. RESULTS: Taking advantage of the merits of empirical Bayesian methods, we have developed EBSeq for identifying DE isoforms in an RNA-seq experiment comparing two or more biological conditions. Results demonstrate substantially improved power and performance of EBSeq for identifying DE isoforms. EBSeq also proves to be a robust approach for identifying DE genes. AVAILABILITY AND IMPLEMENTATION: An R package containing examples and sample datasets is available at http://www.biostat.wisc.edu/kendzior/EBSEQ/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling/methods , RNA Isoforms/metabolism , Sequence Analysis, RNA/methods , Bayes Theorem , Cell Line , Embryonic Stem Cells/metabolism , Genome , Models, Statistical , RNA, Messenger/metabolism , Software
6.
J Vis Exp ; (56): e3340, 2011 Oct 27.
Article in English | MEDLINE | ID: mdl-22064688

ABSTRACT

Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity, a large dynamic range and allows measurement of transcript copy numbers in a sample. Illumina's genome analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo transcriptome assembly. One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described(1, 2) but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published(3, 4) but require microgram amounts of starting total RNA. Here we describe a protocol for mRNA-Seq library preparation for the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing(5), where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome(6) and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples(7).


Subject(s)
Gene Expression Profiling/methods , RNA, Messenger/chemistry , Sequence Analysis, DNA/methods , Humans , Polymerase Chain Reaction/methods , RNA/chemistry , RNA/genetics , RNA, Messenger/genetics
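The abstract above reports expression in transcripts per million (TPM). As a minimal sketch of the standard TPM definition (not the authors' pipeline), each count is divided by its transcript length and the resulting rates are rescaled to sum to one million. The function name `tpm` and the example counts and lengths are hypothetical.

```python
def tpm(counts, lengths):
    """Convert per-transcript read counts to transcripts per million (TPM).

    counts  -- reads assigned to each transcript
    lengths -- transcript lengths in bases, same order as counts
    """
    # Length-normalised rate for each transcript (reads per base).
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        # No reads at all: return zeros rather than dividing by zero.
        return [0.0 for _ in rates]
    # Rescale so the values sum to one million within the sample.
    return [r / total * 1e6 for r in rates]


# Hypothetical counts and transcript lengths, for illustration only.
values = tpm([100, 200, 300], [1000, 2000, 1000])
```

Because TPM values sum to the same total in every sample, a given value represents the same share of the transcript pool in each sample, a property RPKM does not have.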
7.
Nat Methods ; 8(10): 821-7, 2011 Sep 11.
Article in English | MEDLINE | ID: mdl-21983960

ABSTRACT

Combining high-mass-accuracy mass spectrometry, isobaric tagging and software for multiplexed, large-scale protein quantification, we report deep proteomic coverage of four human embryonic stem cell and four induced pluripotent stem cell lines in biological triplicate. This 24-sample comparison resulted in a very large set of identified proteins and phosphorylation sites in pluripotent cells. The statistical analysis afforded by our approach revealed subtle but reproducible differences in protein expression and protein phosphorylation between embryonic stem cells and induced pluripotent cells. Merging these results with RNA-seq analysis data, we found functionally related differences across each tier of regulation. We also introduce the Stem Cell-Omics Repository (SCOR), a resource to collate and display quantitative information across multiple planes of measurement, including mRNA, protein and post-translational modifications.


Subject(s)
Embryonic Stem Cells/metabolism , Induced Pluripotent Stem Cells/metabolism , Proteome/analysis , Proteomics , Humans , Proteome/metabolism
8.
Nat Methods ; 8(5): 424-9, 2011 May.
Article in English | MEDLINE | ID: mdl-21478862

ABSTRACT

We re-examine the individual components for human embryonic stem cell (ESC) and induced pluripotent stem cell (iPSC) culture and formulate a cell culture system in which all protein reagents for liquid media, attachment surfaces and splitting are chemically defined. A major improvement is the lack of a serum albumin component, as variations in either animal- or human-sourced albumin batches have previously plagued human ESC and iPSC culture with inconsistencies. Using this new medium (E8) and vitronectin-coated surfaces, we demonstrate improved derivation efficiencies of vector-free human iPSCs with an episomal approach. This simplified E8 medium should facilitate both the research use and clinical applications of human ESCs and iPSCs and their derivatives, and should be applicable to other reprogramming methods.


Subject(s)
Cell Culture Techniques/methods , Culture Media/chemistry , Induced Pluripotent Stem Cells/cytology , Animals , Biopsy , Cattle , Cell Proliferation , Cell Survival , Coated Materials, Biocompatible , Culture Media, Serum-Free/chemistry , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Fibroblasts/cytology , Gene Expression , Growth Substances , Humans , Induced Pluripotent Stem Cells/metabolism , Karyotyping , Serum Albumin, Bovine , Skin/cytology , Vitronectin
9.
Biotechniques ; 49(6): 898-904, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21143212

ABSTRACT

Preparation of an Illumina sequencing library for gene expression analysis (mRNA-Seq) requires microgram amounts of starting total RNA or PCR-based amplification. Here we describe a protocol based on T7 linear RNA amplification that does not introduce significant bias, requires only 10 ng total RNA, and generates a directional, fully representative, whole-transcript mRNA-Seq Illumina library that is highly consistent across over three orders of magnitude of input RNA.


Subject(s)
Gene Library , Nucleic Acid Amplification Techniques/methods , RNA, Messenger/chemistry , RNA, Messenger/genetics , Cell Line , Embryonic Stem Cells/physiology , Gene Expression Profiling , Humans , RNA/chemistry , RNA/genetics , RNA, Messenger/isolation & purification , Sequence Analysis, RNA
10.
Cell Stem Cell ; 6(5): 479-91, 2010 May 07.
Article in English | MEDLINE | ID: mdl-20452322

ABSTRACT

Human embryonic stem cells (hESCs) share an identical genome with lineage-committed cells, yet possess the remarkable properties of self-renewal and pluripotency. The diverse cellular properties in different cells have been attributed to their distinct epigenomes, but how much epigenomes differ remains unclear. Here, we report that epigenomic landscapes in hESCs and lineage-committed cells are drastically different. By comparing the chromatin-modification profiles and DNA methylomes in hESCs and primary fibroblasts, we find that nearly one-third of the genome differs in chromatin structure. Most changes arise from dramatic redistributions of repressive H3K9me3 and H3K27me3 marks, which form blocks that significantly expand in fibroblasts. A large number of potential regulatory sequences also exhibit a high degree of dynamics in chromatin modifications and DNA methylation. Additionally, we observe novel, context-dependent relationships between DNA methylation and chromatin modifications. Our results provide new insights into epigenetic mechanisms underlying properties of pluripotency and cell fate commitment.


Subject(s)
Cell Lineage/genetics , Epigenesis, Genetic , Fibroblasts/cytology , Fibroblasts/metabolism , Genome, Human/genetics , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Cell Line , Chromatin/genetics , CpG Islands/genetics , DNA Methylation/genetics , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Genes, Developmental , Histones/metabolism , Humans , Lysine/metabolism , Protein Processing, Post-Translational , Regulatory Sequences, Nucleic Acid/genetics
11.
Bioinformatics ; 26(4): 493-500, 2010 Feb 15.
Article in English | MEDLINE | ID: mdl-20022975

ABSTRACT

MOTIVATION: RNA-Seq is a promising new technology for accurately measuring gene expression levels. Expression estimation with RNA-Seq requires the mapping of relatively short sequencing reads to a reference genome or transcript set. Because reads are generally shorter than transcripts from which they are derived, a single read may map to multiple genes and isoforms, complicating expression analyses. Previous computational methods either discard reads that map to multiple locations or allocate them to genes heuristically. RESULTS: We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNA-Seq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. Unlike previous methods, our method is capable of modeling non-uniform read distributions. Simulations with our method indicate that a read length of 20-25 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed.


Subject(s)
Gene Expression , Sequence Analysis, RNA/methods , Software , Algorithms , Animals , Base Sequence , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling , Genome , Mice , Zea mays/genetics
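The abstract above describes a statistical model that handles reads mapping to several transcripts instead of discarding them. One common way to fit a model of this kind is expectation-maximization (EM); the sketch below assumes that approach and is not a description of the authors' software. It ignores transcript length and sequencing error, and the names `em_abundance` and `read_hits` are hypothetical.

```python
def em_abundance(read_hits, n_transcripts, n_iter=100):
    """Estimate the fraction of reads originating from each transcript.

    read_hits     -- one list per read, holding the indices of the
                     transcripts that read maps to
    n_transcripts -- total number of transcripts
    """
    # Start from a uniform guess.
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        expected = [0.0] * n_transcripts
        for hits in read_hits:
            # E-step: split each read among its candidate transcripts
            # in proportion to the current abundance estimates.
            weight = sum(theta[t] for t in hits)
            for t in hits:
                expected[t] += theta[t] / weight
        total = sum(expected)
        if total == 0:
            return theta  # no mapped reads, nothing to update
        # M-step: renormalise expected read counts into fractions.
        theta = [e / total for e in expected]
    return theta


# Hypothetical data: 3 reads unique to transcript 0, 1 read unique to
# transcript 1, and 2 reads that map to both.
reads = [[0], [0], [0], [1], [0, 1], [0, 1]]
theta = em_abundance(reads, 2)
```

In this toy case the two ambiguous reads end up split 3:1, following the evidence from the uniquely mapped reads, rather than being discarded or split evenly.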
12.
Nature ; 462(7271): 315-22, 2009 Nov 19.
Article in English | MEDLINE | ID: mdl-19829295

ABSTRACT

DNA cytosine methylation is a central epigenetic modification that has essential roles in cellular processes including genome regulation, development and disease. Here we present the first genome-wide, single-base-resolution maps of methylated cytosines in a mammalian genome, from both human embryonic stem cells and fetal fibroblasts, along with comparative analysis of messenger RNA and small RNA components of the transcriptome, several histone modifications, and sites of DNA-protein interaction for several key regulatory factors. Widespread differences were identified in the composition and patterning of cytosine methylation between the two genomes. Nearly one-quarter of all methylation identified in embryonic stem cells was in a non-CG context, suggesting that embryonic stem cells may use different methylation mechanisms to affect gene regulation. Methylation in non-CG contexts showed enrichment in gene bodies and depletion in protein binding sites and enhancers. Non-CG methylation disappeared upon induced differentiation of the embryonic stem cells, and was restored in induced pluripotent stem cells. We identified hundreds of differentially methylated regions proximal to genes involved in pluripotency and differentiation, and widespread reduced methylation levels in fibroblasts associated with lower transcriptional activity. These reference epigenomes provide a foundation for future studies exploring this key epigenetic modification in human disease and development.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Genome/genetics , Cell Line , Cluster Analysis , DNA/metabolism , DNA-Binding Proteins/metabolism , Embryonic Stem Cells/metabolism , Humans
13.
Bioinformatics ; 25(11): 1424-5, 2009 Jun 01.
Article in English | MEDLINE | ID: mdl-19351619

ABSTRACT

SUMMARY: We have developed a tool, called ProbeMatch, for matching a large set of oligonucleotide sequences against a genome database using gapped alignments. Unlike most of the existing tools such as ELAND which only perform ungapped alignments allowing at most two mismatches, ProbeMatch generates both ungapped and gapped alignments allowing up to three errors including insertion, deletion and mismatch. To speed up sequence alignment, ProbeMatch uses gapped q-grams and q-grams of various patterns to identify target hits to a query sequence. This approach results in fewer initial sequences to examine with no loss in sensitivity. ProbeMatch has been used to align 169,095 Illumina GAII reads against the human genome, which could not be mapped by ELAND, and found alignments for 28,625 reads of the 169,095 reads in less than 3 h. AVAILABILITY: Source code is freely available at (http://www.cs.wisc.edu/~jignesh/probematch/).


Subject(s)
Genome/genetics , Genomics/methods , Oligonucleotides/chemistry , Sequence Alignment/methods , Software , Databases, Genetic , Sequence Analysis, DNA/methods
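The abstract above mentions q-gram filtering as a way to cut down the candidates passed to full alignment. The sketch below illustrates the general idea with contiguous q-grams and the classic q-gram lemma: a read of length n within k edit errors of a target shares at least n - q + 1 - k*q q-grams with it. ProbeMatch itself is described as using gapped q-grams of various patterns, so this is a simplified stand-in, and the function names are hypothetical.

```python
from collections import Counter


def qgrams(seq, q):
    """Return the multiset of all contiguous substrings of length q."""
    return Counter(seq[i:i + q] for i in range(len(seq) - q + 1))


def passes_filter(read, window, q, max_errors):
    """Return True if the window might align to the read within max_errors.

    A window that fails this test cannot be within max_errors edits of
    the read, so it can be skipped without losing sensitivity.
    """
    # Lower bound on shared q-grams implied by the q-gram lemma.
    threshold = len(read) - q + 1 - q * max_errors
    # Multiset intersection counts q-grams common to both sequences.
    shared = sum((qgrams(read, q) & qgrams(window, q)).values())
    return shared >= threshold


# Hypothetical sequences: the first window differs by one mismatch.
near = passes_filter("ACGTACGTAC", "ACGTTCGTAC", q=3, max_errors=1)
far = passes_filter("ACGTACGTAC", "TTTTTTTTTT", q=3, max_errors=1)
```

Only windows that pass the filter need the more expensive gapped alignment step.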
14.
Exp Hematol ; 36(10): 1377-89, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18922365

ABSTRACT

OBJECTIVE: Cellular and molecular changes that occur during the genesis of the hematopoietic system and hematopoietic stem cells in the human embryo are mostly inaccessible to study and remain poorly understood. To address this gap we have exploited the human embryonic stem cell (hESC) system to molecularly characterize the global transcriptomes of the two functionally discrete and phenotypically separable populations of multipotent hematopoietic cells that first appear when hESCs are induced to differentiate on OP9 cells. MATERIALS AND METHODS: We prepared long serial analysis of gene expression libraries from lin-CD34+CD43+CD45- and lin-CD34+CD43+CD45+ subsets of primitive hematopoietic cells derived in vitro from hESCs, sequenced them to a depth of 200,000 tags and compared their content with similar libraries prepared from highly purified populations of very primitive human fetal liver and cord blood hematopoietic cells. RESULTS: Comparison of libraries obtained from hESC-derived lin-CD34+CD43+CD45- and lin-CD34+CD43+CD45+ revealed differences in their expression of genes associated with myeloid development, cellular biosynthetic processes, and cell-cycle regulation. Further comparisons with analogous data for primitive hematopoietic cells isolated from first-trimester human fetal liver and newborn cord blood showed an apparent similarity between the transcriptomes of the most primitive hESC- and in vivo-derived populations, with the main differences involving genes that regulate HSC self-renewal and homing, chromatin remodeling, AP1 transcription complex genes, and noncoding RNAs. CONCLUSION: These data suggest that primitive hematopoietic cells are generated from hESCs in vitro by processes similar to those operative during human embryogenesis in vivo, although some differences were also detected.


Subject(s)
Embryonic Stem Cells/cytology , Embryonic Stem Cells/physiology , Gene Expression Profiling , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/physiology , Antigens, CD/analysis , Antigens, CD/genetics , Cell Differentiation/physiology , Cell Division , Coculture Techniques , Computational Biology , Embryonic Development , Hematopoiesis/physiology , Humans , RNA/genetics , RNA/isolation & purification
15.
Nucleic Acids Res ; 36(9): 2926-38, 2008 May.
Article in English | MEDLINE | ID: mdl-18385155

ABSTRACT

Well-defined relationships between oligonucleotide properties and hybridization signal intensities (HSI) can aid chip design, data normalization and true biological knowledge discovery. We clarify these relationships using the data from two microarray experiments containing over three million probes from 48 high-density chips. We find that melting temperature (T(m)) has the most significant effect on HSI while length for the long oligonucleotides studied has very little effect. Analysis of positional effect using a linear model provides evidence that the protruding ends of probes contribute more than tethered ends to HSI, which is further validated by specifically designed match fragment sliding and extension experiments. The impact of sequence similarity (SeqS) on HSI is not significant in comparison with other oligonucleotide properties. Using regression and regression tree analysis, we prioritize these oligonucleotide properties based on their effects on HSI. The implications of our discoveries for the design of unbiased oligonucleotides are discussed. We propose that designing isothermal probes by varying their length is a viable strategy to reduce sequence bias, though imposing selection constraints on other oligonucleotide properties is also essential.


Subject(s)
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Oligonucleotide Probes/chemistry , Humans , Nucleic Acid Conformation , Nucleic Acid Denaturation , Regression Analysis , Sequence Homology, Nucleic Acid , Temperature
16.
Science ; 318(5858): 1917-20, 2007 Dec 21.
Article in English | MEDLINE | ID: mdl-18029452

ABSTRACT

Somatic cell nuclear transfer allows trans-acting factors present in the mammalian oocyte to reprogram somatic cell nuclei to an undifferentiated state. We show that four factors (OCT4, SOX2, NANOG, and LIN28) are sufficient to reprogram human somatic cells to pluripotent stem cells that exhibit the essential characteristics of embryonic stem (ES) cells. These induced pluripotent human stem cells have normal karyotypes, express telomerase activity, express cell surface markers and genes that characterize human ES cells, and maintain the developmental potential to differentiate into advanced derivatives of all three primary germ layers. Such induced pluripotent human cell lines should be useful in the production of new disease models and in drug development, as well as for applications in transplantation medicine, once technical limitations (for example, mutation through viral integration) are eliminated.


Subject(s)
Cell Line , Cellular Reprogramming , Fibroblasts/cytology , Pluripotent Stem Cells/cytology , Animals , Cell Differentiation , Cell Proliferation , Cell Shape , DNA-Binding Proteins/genetics , DNA-Binding Proteins/physiology , Embryonic Stem Cells/cytology , Fetus , HMGB Proteins/genetics , HMGB Proteins/physiology , Homeodomain Proteins/genetics , Homeodomain Proteins/physiology , Humans , Infant, Newborn , Karyotyping , Mice , Mice, SCID , Nanog Homeobox Protein , Octamer Transcription Factor-3/genetics , Octamer Transcription Factor-3/physiology , Oligonucleotide Array Sequence Analysis , Pluripotent Stem Cells/physiology , RNA-Binding Proteins/genetics , RNA-Binding Proteins/physiology , SOXB1 Transcription Factors , Stem Cell Transplantation , Teratoma/pathology , Transcription Factors/genetics , Transcription Factors/physiology , Transduction, Genetic , Transgenes
17.
Cell Stem Cell ; 1(3): 299-312, 2007 Sep 13.
Article in English | MEDLINE | ID: mdl-18371364

ABSTRACT

We mapped Polycomb-associated H3K27 trimethylation (H3K27me3) and Trithorax-associated H3K4 trimethylation (H3K4me3) across the whole genome in human embryonic stem (ES) cells. The vast majority of H3K27me3 colocalized on genes modified with H3K4me3. These comodified genes displayed low expression levels and were enriched in developmental function. Another significant set of genes lacked both modifications and was also expressed at low levels in ES cells but was enriched for gene function in physiological responses rather than development. Comodified genes could change expression levels rapidly during differentiation, but so could a substantial number of genes in other modification categories. SOX2, POU5F1, and NANOG, pluripotency-associated genes, shifted from modification by H3K4me3 alone to colocalization of both modifications as they were repressed during differentiation. Our results demonstrate that H3K27me3 modifications change during early differentiation, both relieving existing repressive domains and imparting new ones, and that colocalization with H3K4me3 is not restricted to pluripotent cells.


Subject(s)
Embryonic Stem Cells/metabolism , Genome, Human/genetics , Histones/metabolism , Lysine/metabolism , Cell Differentiation , Cell Lineage , Embryonic Stem Cells/cytology , Gene Expression Regulation , Humans , Methylation , Promoter Regions, Genetic/genetics , Protein Transport
18.
Physiol Genomics ; 23(2): 246-56, 2005 Oct 17.
Article in English | MEDLINE | ID: mdl-16106031

ABSTRACT

The broad goal of physiological genomics research is to link genes to their functions using appropriate experimental and computational techniques. Modern genomics experiments enable the generation of vast quantities of data, and interpretation of this data requires the integration of information derived from many diverse sources. Computational biology and bioinformatics offer the ability to manage and channel this information torrent. The Rat Genome Database (RGD; http://rgd.mcw.edu) has developed computational tools and strategies specifically supporting the goal of linking genes to their functional roles in rat and, using comparative genomics, to human and mouse. We present an overview of the database with a focus on these unique computational tools and describe strategies for the use of these resources in the area of physiological genomics.


Subject(s)
Databases, Genetic , Genome/genetics , Genomics/methods , Rats/genetics , Rats/physiology , Animals , Cloning, Molecular , Gene Expression Profiling
19.
Nucleic Acids Res ; 33(Web Server issue): W376-81, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980493

ABSTRACT

One of the core activities of high-throughput proteomics is the identification of peptides from mass spectra. Some peptides can be identified using spectral matching programs like Sequest or Mascot, but many spectra do not produce high quality database matches. De novo peptide sequencing is an approach to determine partial peptide sequences for some of the unidentified spectra. A drawback of de novo peptide sequencing is that it produces a series of ordered and disordered sequence tags and mass tags rather than a complete, non-degenerate peptide amino acid sequence. This incomplete data is difficult to use in conventional search programs such as BLAST or FASTA. DeNovoID is a program that has been specifically designed to use degenerate amino acid sequence and mass data derived from MS experiments to search a peptide database. Since the algorithm employed depends on the amino acid composition of the peptide and not its sequence, DeNovoID does not have to consider all possible sequences, but rather a smaller number of compositions consistent with a spectrum. DeNovoID also uses a geometric indexing scheme that reduces the number of calculations required to determine the best peptide match in the database. DeNovoID is available at http://proteomics.mcw.edu/denovoid.


Subject(s)
Mass Spectrometry , Peptides/analysis , Proteomics/methods , Sequence Analysis, Protein/methods , Software , Algorithms , Amino Acids/analysis , Databases, Protein , Internet , Peptides/chemistry , User-Computer Interface
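The abstract above notes that DeNovoID matches on amino acid composition rather than residue order. The sketch below illustrates that idea only: it ranks database peptides by how far their residue counts are from those of a query. The distance measure, the linear scan, and the function names are assumptions made for illustration; the geometric indexing scheme and the use of mass data mentioned in the abstract are not reproduced.

```python
from collections import Counter


def composition_distance(a, b):
    """Sum of absolute differences in residue counts between two peptides."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[x] - cb[x]) for x in set(ca) | set(cb))


def best_match(query_residues, database):
    """Return the database peptide whose composition is closest to the query.

    query_residues -- residues recovered from a spectrum, in any order
    database       -- iterable of candidate peptide sequences
    """
    return min(database, key=lambda pep: composition_distance(query_residues, pep))


# Hypothetical query: the residues of PEPTIDE in scrambled order.
hit = best_match("EDITPEP", ["PROTEIN", "PEPTIDE", "SAMPLER"])
```

Because only counts are compared, a query with the right residues in the wrong order still retrieves the correct peptide.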