Results 1 - 18 of 18
1.
Circulation ; 148(14): 1099-1112, 2023 10 03.
Article in English | MEDLINE | ID: mdl-37602409

ABSTRACT

BACKGROUND: Cardiac reprogramming is a technique to directly convert nonmyocytes into myocardial cells using genes or small molecules. This intervention provides functional benefit to the rodent heart when delivered at the time of myocardial infarction or activated transgenically up to 4 weeks after myocardial infarction. Yet, several hurdles have prevented the advancement of cardiac reprogramming for clinical use. METHODS: Through a combination of screening and rational design, we identified a cardiac reprogramming cocktail that can be encoded in a single adeno-associated virus. We also created a novel adeno-associated virus capsid that can transduce cardiac fibroblasts more efficiently than available parental serotypes by mutating posttranslationally modified capsid residues. Because a constitutive promoter was needed to drive high expression of these cell fate-altering reprogramming factors, we included binding sites to a cardiomyocyte-restricted microRNA within the 3' untranslated region of the expression cassette that limits expression to nonmyocytes. After optimizing this expression cassette to reprogram human cardiac fibroblasts into induced cardiomyocyte-like cells in vitro, we also tested the ability of this capsid/cassette combination to confer functional benefit in acute mouse myocardial infarction and chronic rat myocardial infarction models. RESULTS: We demonstrated sustained, dose-dependent improvement in cardiac function when treating a rat model 2 weeks after myocardial infarction, showing that cardiac reprogramming, when delivered in a single, clinically relevant adeno-associated virus vector, can support functional improvement in the postremodeled heart. This benefit was not observed with GFP (green fluorescent protein) or a hepatocyte reprogramming cocktail and was achieved even in the presence of immunosuppression, supporting myocyte formation as the underlying mechanism. 
CONCLUSIONS: Collectively, these results advance the application of cardiac reprogramming gene therapy as a viable therapeutic approach to treat chronic heart failure resulting from ischemic injury.


Subject(s)
MicroRNAs , Myocardial Infarction , Rats , Mice , Humans , Animals , Dependovirus/genetics , Myocytes, Cardiac/metabolism , Myocardial Infarction/therapy , Myocardial Infarction/drug therapy , MicroRNAs/genetics , MicroRNAs/metabolism , Genetic Therapy/methods , Green Fluorescent Proteins/genetics , Cellular Reprogramming , Fibroblasts/metabolism
2.
BMC Bioinformatics ; 19(1): 57, 2018 02 20.
Article in English | MEDLINE | ID: mdl-29463208

ABSTRACT

BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. RESULTS: We show that scoring the entire contents of dbSNP (> 155 million variants) requires only 95 min using a machine with 4 CPUs and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. CONCLUSIONS: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences.


Subject(s)
Computational Biology/methods , Genetic Variation , Genome, Human , Software , Databases, Genetic , Humans , Polymorphism, Single Nucleotide/genetics , ROC Curve , Time Factors , Whole Genome Sequencing , Zygote/metabolism
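As a rough check on the throughput reported in the VVP abstract above (more than 155 million dbSNP variants scored in 95 min on 4 CPUs), the implied scoring rate can be worked out directly. This is illustrative arithmetic only and not part of VVP. The variable names are ours, and the even split across CPUs is an assumption the abstract does not state.

```python
# Figures taken from the abstract; 155 million is a lower bound ("> 155 million").
variants_scored = 155_000_000
wall_time_min = 95
cpus = 4

# Overall rate implied by the reported wall-clock time.
variants_per_second = variants_scored / (wall_time_min * 60)

# Assumption: all 4 CPUs were kept equally busy for the whole run.
variants_per_cpu_second = variants_per_second / cpus

print(f"{variants_per_second:,.0f} variants/s overall")
print(f"{variants_per_cpu_second:,.0f} variants/s per CPU")
```

Under these assumptions the reported run corresponds to roughly 27,000 variants per second overall, which should be read as a lower bound.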
3.
Front Genet ; 5: 325, 2014.
Article in English | MEDLINE | ID: mdl-25278960

ABSTRACT

One of the challenges in the analysis of large data sets, particularly in a population-based setting, is the ability to perform comparisons across projects. This has to be done in such a way that the integrity of each individual project is maintained, while ensuring that the data are comparable across projects. These issues are beginning to be observed in human DNA methylation studies, as the Illumina 450k platform and next generation sequencing-based assays grow in popularity and decrease in price. This increase in productivity is enabling new insights into epigenetics, but also requires the development of pipelines and software capable of handling the large volumes of data. The specific problems inherent in creating a platform for the storage, comparison, integration, and visualization of DNA methylation data include data storage, algorithm efficiency and ability to interpret the results to derive biological meaning from them. Databases provide a ready-made solution to these issues, but as yet no tools exist that leverage these advantages while providing an intuitive user interface for interpreting results in a genomic context. We have addressed this void by integrating a database to store DNA methylation data with a web interface to query and visualize the database and a set of libraries for more complex analysis. The resulting platform is called DaVIE: Database for the Visualization and Integration of Epigenetics data. DaVIE can use data culled from a variety of sources, and the web interface includes the ability to group samples by sub-type, compare multiple projects and visualize genomic features in relation to sites of interest. We have used DaVIE to identify patterns of DNA methylation in specific projects and across different projects, identify outlier samples, and cross-check differentially methylated CpG sites identified in specific projects across large numbers of samples. 
A demonstration server has been set up using GEO data at http://echelon.cmmt.ubc.ca/dbaccess/, with login "guest" and password "guest." Groups may download and install their own version of the server following the instructions on the project's wiki.

4.
Genome Biol ; 14(7): 126, 2013 Jul 29.
Article in English | MEDLINE | ID: mdl-23899167

ABSTRACT

A new study integrates genome-wide SNP genotyping, RNA-Seq and DNA methylation in human cells, revealing their relationships and posing new questions about causality.


Subject(s)
DNA Methylation , Gene Expression Regulation , Genetic Variation , Humans
5.
BMC Bioinformatics ; 14: 167, 2013 May 28.
Article in English | MEDLINE | ID: mdl-23714400

ABSTRACT

BACKGROUND: In the past decade, bioinformatics tools have matured enough to reliably perform sophisticated primary data analysis on Next Generation Sequencing (NGS) data, such as mapping, assemblies and variant calling. However, there is still a dire need for improvements in higher-level analysis such as NGS data organization, analysis of mutation patterns and Genome Wide Association Studies (GWAS). RESULTS: We present a high throughput pipeline for identifying cancer mutation targets, capable of processing billions of variations across thousands of samples. This pipeline is coupled with our Human Variation Database to provide more complex downstream analysis on the variations hosted in the database. Most notably, these analyses include finding significantly mutated regions across multiple genomes and regions with mutational preferences within certain types of cancers. The results of the analysis are presented in HTML summary reports that incorporate gene annotations from various resources for the reported regions. CONCLUSION: MuteProc is available for download through the Vancouver Short Read Analysis Package on Sourceforge: http://vancouvershortr.sourceforge.net. Instructions for use and a tutorial are provided on the accompanying wiki pages at https://sourceforge.net/apps/mediawiki/vancouvershortr/index.php?title=Pipeline_introduction.


Subject(s)
DNA Mutational Analysis/methods , Genes, Neoplasm , Mutation , Neoplasms/genetics , Software , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation
6.
Oncotarget ; 3(11): 1308-19, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23131835

ABSTRACT

Somatic hypermutation (SHM) in the variable region of immunoglobulin genes (IGV) naturally occurs in a narrow window of B cell development to provide high-affinity antibodies. However, SHM can also aberrantly target proto-oncogenes and cause genome instability. The role of aberrant SHM (aSHM) has been widely studied in various non-Hodgkin's lymphomas, particularly in diffuse large B-cell lymphoma (DLBCL). Although it has been speculated that aSHM targets a wide range of genome loci, so far only twelve genes have been identified as targets of aSHM through the targeted sequencing of selected genes. A genome-wide study aiming at identifying a comprehensive set of aSHM targets recurrently occurring in DLBCL has not been previously undertaken. Here, we present a comprehensive assessment of the somatic hypermutated genes in DLBCL identified through an analysis of genomic and transcriptome data derived from 40 DLBCL patients. Our analysis verifies that there are indeed many genes that are recurrently affected by aSHM. In particular, we have identified 32 novel targets that show the same or a higher level of aSHM activity than genes previously reported. Amongst these novel targets, 22 genes showed a significant correlation between mRNA abundance and aSHM.


Subject(s)
Lymphoma, Large B-Cell, Diffuse/genetics , Somatic Hypermutation, Immunoglobulin/genetics , Genome-Wide Association Study , Genomic Instability , Humans , Lymphoma, Large B-Cell, Diffuse/immunology , Mutation
7.
Am J Hum Genet ; 90(6): 1088-93, 2012 Jun 08.
Article in English | MEDLINE | ID: mdl-22578326

ABSTRACT

Autosomal-recessive inheritance, severe to profound sensorineural hearing loss, and partial agenesis of the corpus callosum are hallmarks of the clinically well-established Chudley-McCullough syndrome (CMS). Although not always reported in the literature, frontal polymicrogyria and gray matter heterotopia are uniformly present, whereas cerebellar dysplasia, ventriculomegaly, and arachnoid cysts are nearly invariant. Despite these striking brain malformations, individuals with CMS generally do not present with significant neurodevelopmental abnormalities, except for hearing loss. Homozygosity mapping and whole-exome sequencing of DNA from affected individuals in eight families (including the family in the first report of CMS) revealed four molecular variations (two single-base deletions, a nonsense mutation, and a canonical splice-site mutation) in the G protein-signaling modulator 2 gene, GPSM2, that underlie CMS. Mutations in GPSM2 have been previously identified in people with profound congenital nonsyndromic hearing loss (NSHL). Subsequent brain imaging of these individuals revealed frontal polymicrogyria, abnormal corpus callosum, and gray matter heterotopia, consistent with a CMS diagnosis, but no ventriculomegaly. The gene product, GPSM2, is required for orienting the mitotic spindle during cell division in multiple tissues, suggesting that the sensorineural hearing loss and characteristic brain malformations of CMS are due to defects in asymmetric cell divisions during development.


Subject(s)
Agenesis of Corpus Callosum/genetics , Arachnoid Cysts/genetics , Brain Diseases/genetics , Brain/abnormalities , Hearing Loss, Sensorineural/genetics , Intracellular Signaling Peptides and Proteins/genetics , Mutation , Adolescent , Adult , Agenesis of Corpus Callosum/pathology , Arachnoid Cysts/pathology , Brain/pathology , Child , Child, Preschool , Family Health , Female , Gene Deletion , Hearing Loss, Sensorineural/pathology , Homozygote , Humans , Infant , Male , Sequence Analysis, DNA
8.
Am J Hum Genet ; 90(1): 110-8, 2012 Jan 13.
Article in English | MEDLINE | ID: mdl-22177091

ABSTRACT

We used trio-based whole-exome sequencing to analyze two families affected by Weaver syndrome, including one of the original families reported in 1974. Filtering of rare variants in the affected probands against the parental variants identified two different de novo mutations in the enhancer of zeste homolog 2 (EZH2). Sanger sequencing of EZH2 in a third classically-affected proband identified a third de novo mutation in this gene. These data show that mutations in EZH2 cause Weaver syndrome.


Subject(s)
Abnormalities, Multiple/genetics , Congenital Hypothyroidism/genetics , Craniofacial Abnormalities/genetics , DNA-Binding Proteins/genetics , Hand Deformities, Congenital/genetics , Mutation , Transcription Factors/genetics , Adolescent , Adult , Base Sequence , Child , Child, Preschool , DNA Mutational Analysis , Enhancer of Zeste Homolog 2 Protein , Exome , Female , Humans , Infant , Male , Molecular Sequence Data , Pedigree , Polycomb Repressive Complex 2 , Young Adult
9.
N Engl J Med ; 366(3): 234-42, 2012 Jan 19.
Article in English | MEDLINE | ID: mdl-22187960

ABSTRACT

BACKGROUND: Germline truncating mutations in DICER1, an endoribonuclease in the RNase III family that is essential for processing microRNAs, have been observed in families with the pleuropulmonary blastoma-family tumor and dysplasia syndrome. Mutation carriers are at risk for nonepithelial ovarian tumors, notably sex cord-stromal tumors. METHODS: We sequenced the whole transcriptomes or exomes of 14 nonepithelial ovarian tumors and noted closely clustered mutations in the region of DICER1 encoding the RNase IIIb domain of DICER1 in four samples. We then sequenced this region of DICER1 in additional ovarian tumors and in certain other tumors and queried the effect of the mutations on the enzymatic activity of DICER1 using in vitro RNA cleavage assays. RESULTS: DICER1 mutations in the RNase IIIb domain were found in 30 of 102 nonepithelial ovarian tumors (29%), predominantly in Sertoli-Leydig cell tumors (26 of 43, or 60%), including 4 tumors with additional germline DICER1 mutations. These mutations were restricted to codons encoding metal-binding sites within the RNase IIIb catalytic centers, which are critical for microRNA interaction and cleavage, and were somatic in all 16 samples in which germline DNA was available for testing. We also detected mutations in 1 of 14 nonseminomatous testicular germ-cell tumors, in 2 of 5 embryonal rhabdomyosarcomas, and in 1 of 266 epithelial ovarian and endometrial carcinomas. The mutant DICER1 proteins had reduced RNase IIIb activity but retained RNase IIIa activity. CONCLUSIONS: Somatic missense mutations affecting the RNase IIIb domain of DICER1 are common in nonepithelial ovarian tumors. These mutations do not obliterate DICER1 function but alter it in specific cell types, a novel mechanism through which perturbation of microRNA processing may be oncogenic. (Funded by the Terry Fox Research Institute and others.).


Subject(s)
DEAD-box RNA Helicases/genetics , Mutation, Missense , Ovarian Neoplasms/genetics , Ribonuclease III/genetics , Sertoli-Leydig Cell Tumor/genetics , Carcinosarcoma/genetics , Female , Gene Expression , Gene Expression Profiling , Germ-Line Mutation , Humans , MicroRNAs/metabolism , Neoplasms, Germ Cell and Embryonal/genetics , Rhabdomyosarcoma/genetics , Sequence Analysis, DNA
10.
Bioinformatics ; 27(8): 1155-6, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21367872

ABSTRACT

MOTIVATION: Current public variation databases are based upon collaboratively pooling data into a single database with a single interface available to the public. This gives little control to the collaborator to mine the database and requires that they freely share their data with the owners of the repository. We aim to provide an alternative mechanism: providing the source code and application programming interface (API) of a database, enabling researchers to set up local versions without investing heavily in the development of the resource and allowing for confidential information to remain secure. RESULTS: We describe an open-source database that can be installed easily at any research facility for the storage and analysis of thousands of next-generation sequencing variations. This database is built using PostgreSQL 8.4 (The PostgreSQL Global Development Group. postgres 8.4: http://www.postgresql.org) and provides a novel method for collating and searching across the reported results from thousands of next-generation sequence samples, as well as rapidly accessing vital information on the origin of the samples. The schema of the database makes rapid and insightful queries simple and enables easy annotation of novel or known genetic variations. A modular and cross-platform Java API is provided to perform common functions, such as generation of standard experimental reports and graphical summaries of modifications to genes. Included libraries allow adopters of the database to quickly develop their own queries. AVAILABILITY: The software is available for download through the Vancouver Short Read Analysis Package on Sourceforge, http://vancouvershortr.sourceforge.net. Instructions for use and deployment are provided on the accompanying wiki pages. CONTACT: afejes@bcgsc.ca.


Subject(s)
Databases, Nucleic Acid , Genetic Variation , Genome, Human , Genomics , Humans , Software
11.
BMC Res Notes ; 4: 34, 2011 Feb 08.
Article in English | MEDLINE | ID: mdl-21303547

ABSTRACT

BACKGROUND: A strong association between stress resistance and longevity in multicellular organisms has been established as many mutations that extend lifespan also show increased resistance to stress. AAK-2, the C. elegans homolog of an alpha subunit of AMP-activated protein kinase (AMPK) is an intracellular fuel sensor that regulates cellular energy homeostasis and functions in stress resistance and lifespan extension. FINDINGS: Here, we investigated global transcriptional responses of aak-2 mutants to oxidative stress and in turn identified potential downstream targets of AAK-2 involved in stress resistance in C. elegans. We employed massively parallel Illumina sequencing technology and performed comprehensive comparative transcriptome analysis. Specifically, we compared the transcriptomes of aak-2 and wild type animals under normal conditions and conditions of induced oxidative stress. This research has presented a snapshot of genome-wide transcriptional activities that take place in C. elegans in response to oxidative stress both in the presence and absence of AAK-2. CONCLUSIONS: The analysis presented in this study has enabled us to identify potential genes involved in stress resistance that may be either directly or indirectly under the control of AAK-2. Furthermore, we have extended our current knowledge of general defense responses of C. elegans against oxidative stress supporting the function for AAK-2 in inhibition of biosynthetic processes, especially lipid synthesis, under oxidative stress and transcriptional regulation of genes involved in reproductive processes.

12.
Genome Biol ; 11(8): R82, 2010.
Article in English | MEDLINE | ID: mdl-20696054

ABSTRACT

BACKGROUND: Adenocarcinomas of the tongue are rare and represent the minority (20 to 25%) of salivary gland tumors affecting the tongue. We investigated the utility of massively parallel sequencing to characterize an adenocarcinoma of the tongue, before and after treatment. RESULTS: In the pre-treatment tumor we identified 7,629 genes within regions of copy number gain. There were 1,078 genes that exhibited increased expression relative to the blood and unrelated tumors and four genes contained somatic protein-coding mutations. Our analysis suggested the tumor cells were driven by the RET oncogene. Genes whose protein products are targeted by the RET inhibitors sunitinib and sorafenib correlated with being amplified and/or highly expressed. Consistent with our observations, administration of sunitinib was associated with stable disease lasting 4 months, after which the lung lesions began to grow. Administration of sorafenib and sulindac provided disease stabilization for an additional 3 months after which the cancer progressed and new lesions appeared. A recurring metastasis possessed 7,288 genes within copy number amplicons, 385 genes exhibiting increased expression relative to other tumors and 9 new somatic protein coding mutations. The observed mutations and amplifications were consistent with therapeutic resistance arising through activation of the MAPK and AKT pathways. CONCLUSIONS: We conclude that complete genomic characterization of a rare tumor has the potential to aid in clinical decision making and identifying therapeutic approaches where no established treatment protocols exist. These results also provide direct in vivo genomic evidence for mutational evolution within a tumor under drug selection and potential mechanisms of drug resistance accrual.


Subject(s)
Adenocarcinoma/genetics , Adenocarcinoma/pathology , Gene Expression Regulation, Neoplastic/drug effects , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins c-ret/genetics , Adenocarcinoma/drug therapy , Benzenesulfonates/pharmacology , Benzenesulfonates/therapeutic use , Gene Dosage/drug effects , Genes, Neoplasm/drug effects , High-Throughput Nucleotide Sequencing , Humans , Indoles/pharmacology , Indoles/therapeutic use , Lung Neoplasms/secondary , Mitogen-Activated Protein Kinases/metabolism , Mutation , Neoplasm Proteins/genetics , Niacinamide/analogs & derivatives , Phenylurea Compounds , Protein Kinase Inhibitors/therapeutic use , Proto-Oncogene Proteins c-akt/metabolism , Proto-Oncogene Proteins c-ret/antagonists & inhibitors , Pyridines/pharmacology , Pyridines/therapeutic use , Pyrroles/pharmacology , Pyrroles/therapeutic use , Selection, Genetic , Sorafenib , Sunitinib , Tongue Neoplasms/drug therapy , Tongue Neoplasms/genetics , Tongue Neoplasms/pathology
13.
Nucleic Acids Res ; 38(11): e126, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20375099

ABSTRACT

Dramatic progress in the development of next-generation sequencing technologies has enabled accurate genome-wide characterization of the binding sites of DNA-associated proteins. This technique, baptized as ChIP-Seq, uses a combination of chromatin immunoprecipitation and massively parallel DNA sequencing. Other published tools that predict binding sites from ChIP-Seq data use only positional information of mapped reads. In contrast, our algorithm MICSA (Motif Identification for ChIP-Seq Analysis) combines this source of positional information with information on motif occurrences to better predict binding sites of transcription factors (TFs). We proved the greater accuracy of MICSA with respect to several other tools by running them on datasets for the TFs NRSF, GABP, STAT1 and CTCF. We also applied MICSA on a dataset for the oncogenic TF EWS-FLI1. We discovered >2000 binding sites and two functionally different binding motifs. We observed that EWS-FLI1 can activate gene transcription when (i) its binding site is located in close proximity to the gene transcription start site (up to approximately 150 kb), and (ii) it contains a microsatellite sequence. Furthermore, we observed that sites without microsatellites can also induce regulation of gene expression--positively as often as negatively--and at much larger distances (up to approximately 1 Mb).


Subject(s)
Algorithms , Chromatin Immunoprecipitation/methods , Regulatory Elements, Transcriptional , Sequence Analysis, DNA , Transcription Factors/metabolism , Base Sequence , Binding Sites , Cell Line, Tumor , Consensus Sequence , Humans , Oncogene Proteins, Fusion/metabolism , Proto-Oncogene Protein c-fli-1/metabolism , RNA-Binding Protein EWS
14.
Genome Res ; 18(12): 1906-17, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18787082

ABSTRACT

We characterized the relationship of H3K4me1 and H3K4me3 at distal and proximal regulatory elements by comparing ChIP-seq profiles for these histone modifications and for two functionally different transcription factors: STAT1 in the immortalized HeLa S3 cell line, with and without interferon-gamma (IFNG) stimulation; and FOXA2 in mouse adult liver tissue. In unstimulated and stimulated HeLa cells, respectively, we determined approximately 270,000 and approximately 301,000 H3K4me1-enriched regions, and approximately 54,500 and approximately 76,100 H3K4me3-enriched regions. In mouse adult liver, we determined approximately 227,000 and approximately 34,800 H3K4me1 and H3K4me3 regions. Seventy-five percent of the approximately 70,300 STAT1 binding sites in stimulated HeLa cells and 87% of the approximately 11,000 FOXA2 sites in mouse liver were distal to known gene TSS; in both cell types, approximately 83% of these distal sites were associated with at least one of the two histone modifications, and H3K4me1 was associated with over 96% of marked distal sites. After filtering against predicted transcription start sites, 50% of approximately 26,800 marked distal IFNG-stimulated STAT1 binding sites, but 95% of approximately 5800 marked distal FOXA2 sites, were associated with H3K4me1 only. Results for HeLa cells generated additional insights into transcriptional regulation involving STAT1. STAT1 binding was associated with 25% of all H3K4me1 regions in stimulated HeLa cells, suggesting that a single transcription factor can interact with an unexpectedly large fraction of regulatory regions. Strikingly, for a large majority of the locations of stimulated STAT1 binding, the dominant H3K4me1/me3 combinations were established before activation, suggesting mechanisms independent of IFNG stimulation and high-affinity STAT1 binding.


Subject(s)
Genome, Human , Hepatocyte Nuclear Factor 3-beta/metabolism , Histones/metabolism , Lysine/metabolism , Transcription Factors/metabolism , Animals , Base Sequence , Binding Sites/genetics , Cell Line, Transformed , Chromatin Immunoprecipitation , Female , Gene Expression Regulation , HeLa Cells , Hepatocyte Nuclear Factor 3-beta/genetics , Histones/genetics , Humans , Interferon-gamma/pharmacology , Lysine/genetics , Methylation , Mice , Mice, Inbred C57BL , Protein Binding/genetics , Regulatory Sequences, Nucleic Acid , STAT1 Transcription Factor/metabolism , Sequence Homology, Nucleic Acid , Transcription Factors/genetics
15.
Biotechniques ; 45(1): 81-94, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18611170

ABSTRACT

Sequence-based methods for transcriptome characterization have typically relied on generation of either serial analysis of gene expression tags or expressed sequence tags. Although such approaches have the potential to enumerate transcripts by counting sequence tags derived from them, they typically do not robustly survey the majority of transcripts along their entire length. Here we show that massively parallel sequencing of randomly primed cDNAs, using a next-generation sequencing-by-synthesis technology, offers the potential to generate relative measures of mRNA and individual exon abundance while simultaneously profiling the prevalence of both annotated and novel exons and exon-splicing events. This technique identifies known single nucleotide polymorphisms (SNPs) as well as novel single-base variants. Analysis of these variants, and previously unannotated splicing events in the HeLa S3 cell line, reveals an overrepresentation of gene categories including those previously implicated in cancer.


Subject(s)
DNA, Complementary/genetics , Gene Expression Profiling , Exons , HeLa Cells , Humans , Polymorphism, Single Nucleotide , RNA Splicing , RNA, Messenger/analysis , Transcription Initiation Site
16.
Bioinformatics ; 24(15): 1729-30, 2008 Aug 01.
Article in English | MEDLINE | ID: mdl-18599518

ABSTRACT

SUMMARY: Next-generation sequencing can provide insight into protein-DNA association events on a genome-wide scale, and is being applied in an increasing number of applications in genomics and meta-genomics research. However, few software applications are available for interpreting these experiments. We present here an efficient application for use with chromatin-immunoprecipitation (ChIP-Seq) experimental data that includes novel functionality for identifying areas of gene enrichment and transcription factor binding site locations, as well as for estimating DNA fragment size distributions in enriched areas. The FindPeaks application can generate UCSC compatible custom 'WIG' track files from aligned-read files for short-read sequencing technology. The software application can be executed on any platform capable of running a Java Runtime Environment. Memory requirements are proportional to the number of sequencing reads analyzed; typically 4 GB permits processing of up to 40 million reads. AVAILABILITY: The FindPeaks 3.1 package and manual, containing algorithm descriptions, usage instructions and examples, are available at http://www.bcgsc.ca/platform/bioinfo/software/findpeaks. Source files for FindPeaks 3.1 are available for academic use.


Subject(s)
Algorithms , Chromatin Immunoprecipitation/methods , Chromosome Mapping/methods , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Software , Transcription Factors/genetics , Binding Sites
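The FindPeaks abstract above mentions generating UCSC-compatible 'WIG' track files from aligned reads. The sketch below shows, in Python, one simple way a per-base coverage track could be written in the UCSC `variableStep` WIG layout. It is a minimal illustration under stated assumptions, not the FindPeaks algorithm (which is a Java application and also estimates fragment sizes and calls peaks). The function name `coverage_to_wig`, the `(chrom, start, length)` read tuples, and the 1-based start convention are hypothetical choices made for this example.

```python
from collections import defaultdict


def coverage_to_wig(reads, track_name="coverage"):
    """Build a variableStep WIG track from aligned reads.

    reads: iterable of (chrom, start, length) tuples, with 1-based start
    positions (hypothetical input format chosen for this sketch).
    """
    # depth[chrom][pos] counts how many reads cover each base.
    depth = defaultdict(lambda: defaultdict(int))
    for chrom, start, length in reads:
        for pos in range(start, start + length):
            depth[chrom][pos] += 1

    lines = [f'track type=wiggle_0 name="{track_name}"']
    for chrom in sorted(depth):
        # variableStep: each data line is "<position> <value>".
        lines.append(f"variableStep chrom={chrom}")
        for pos in sorted(depth[chrom]):
            lines.append(f"{pos} {depth[chrom][pos]}")
    return "\n".join(lines) + "\n"


# Two overlapping toy reads on chr1.
wig_text = coverage_to_wig([("chr1", 100, 3), ("chr1", 102, 2)])
print(wig_text, end="")
```

A per-base dictionary keeps the sketch short. At the scale the abstract describes (tens of millions of reads), an interval-based or array-based representation would be a more realistic choice.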
17.
J Mol Biol ; 336(3): 607-24, 2004 Feb 20.
Article in English | MEDLINE | ID: mdl-15095976

ABSTRACT

The function of many RNAs depends crucially on their structure. Therefore, the design of RNA molecules with specific structural properties has many potential applications, e.g. in the context of investigating the function of biological RNAs, of creating new ribozymes, or of designing artificial RNA nanostructures. Here, we present a new algorithm for solving the following RNA secondary structure design problem: given a secondary structure, find an RNA sequence (if any) that is predicted to fold to that structure. Unlike the (pseudoknot-free) secondary structure prediction problem, this problem appears to be hard computationally. Our new algorithm, "RNA Secondary Structure Designer (RNA-SSD)", is based on stochastic local search, a prominent general approach for solving hard combinatorial problems. A thorough empirical evaluation on computationally predicted structures of biological sequences and artificially generated RNA structures as well as on empirically modelled structures from the biological literature shows that RNA-SSD substantially out-performs the best known algorithm for this problem, RNAinverse from the Vienna RNA Package. In particular, the new algorithm is able to solve structures, consistently, for which RNAinverse is unable to find solutions. The RNA-SSD software is publicly available under the name of RNA Designer at the RNASoft website (www.rnasoft.ca).


Subject(s)
Algorithms , Nucleic Acid Conformation , RNA/chemistry , Base Sequence , Computer Simulation , Databases, Genetic , Models, Genetic , Molecular Sequence Data
18.
Photosynth Res ; 78(3): 195-203, 2003.
Article in English | MEDLINE | ID: mdl-16245051

ABSTRACT

A proteomics approach was evaluated for analysis of photosynthesis-related proteins that are characteristic of chromatophores, particles derived from purple phototrophic bacterial intracytoplasmic membranes. Proteins of purified chromatophores from Rhodopseudomonas palustris were solubilized and digested with trypsin, to create a collection of peptides that were fractionated by liquid chromatography. Peptide sequences were determined and assigned to specific proteins by analysis of tandem mass spectra of peptides, and comparison to a library derived from the recently determined R. palustris genome sequence. A total of 300 proteins were detected with a probability value ≥0.9, and the number of proteins detected increased to 345 when the minimum probability value was reduced to 0.5. Membrane-integral proteins of the reaction center, cytochrome b/c1, light-harvesting and ATPase complexes were used as controls to assess how well this approach performs with hydrophobic proteins. New genes were identified, and tentatively designated as encoding photosynthesis-related proteins. We conclude that this approach is a powerful method to evaluate the possible existence of new photosynthesis-related proteins (and genes), although alternative methods are needed to evaluate the exact functions of newly discovered genes.
