Results 1 - 15 of 15
1.
Nat Commun ; 8: 15178, 2017 05 05.
Article in English | MEDLINE | ID: mdl-28474669

ABSTRACT

CRISPR-Cas9 screens are powerful tools for high-throughput interrogation of genome function, but can be confounded by nuclease-induced toxicity at both on- and off-target sites, likely due to DNA damage. Here, to test potential solutions to this issue, we design and analyse a CRISPR-Cas9 library with 10 variable-length guides per gene and thousands of negative controls targeting non-functional, non-genic regions (termed safe-targeting guides), in addition to non-targeting controls. We find this library has excellent performance in identifying genes affecting growth and sensitivity to the ricin toxin. The safe-targeting guides allow for proper control of toxicity from on-target DNA damage. Using this toxicity as a proxy to measure off-target cutting, we demonstrate with tens of thousands of guides both the nucleotide position-dependent sensitivity to single mismatches and the reduction of off-target cutting using truncated guides. Our results demonstrate a simple strategy for high-throughput evaluation of target specificity and nuclease toxicity in Cas9 screens.


Subject(s)
CRISPR-Cas Systems/genetics , Gene Targeting/methods , Genomic Library , High-Throughput Screening Assays/methods , RNA, Guide, Kinetoplastida/genetics , Cell Line , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , DNA Damage/genetics , Humans , Polysaccharides/biosynthesis , RNA Interference , Ricin/toxicity
2.
Nat Genet ; 48(2): 117-25, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26691984

ABSTRACT

Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ~15% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.


Subject(s)
Mutation , Neoplasms/genetics , Humans
4.
Genome Res ; 25(11): 1610-21, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26297486

ABSTRACT

Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin has been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy; many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.


Subject(s)
Polymorphism, Single Nucleotide , Protein Biosynthesis , RNA/metabolism , Chromatin/genetics , Chromatin/metabolism , Gene Expression Profiling , Gene Expression Regulation , Humans , Proteomics , Quantitative Trait Loci , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Sequence Alignment , Sequence Analysis, RNA
5.
Nature ; 512(7515): 400-5, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164749

ABSTRACT

Discovering the structure and dynamics of transcriptional regulatory events in the genome with cellular and temporal resolution is crucial to understanding the regulatory underpinnings of development and disease. We determined the genomic distribution of binding sites for 92 transcription factors and regulatory proteins across multiple stages of Caenorhabditis elegans development by performing 241 ChIP-seq (chromatin immunoprecipitation followed by sequencing) experiments. Integration of regulatory binding and cellular-resolution expression data produced a spatiotemporally resolved metazoan transcription factor binding map. Using this map, we explore developmental regulatory circuits that encode combinatorial logic at the levels of co-binding and co-expression of transcription factors, characterizing the genomic coverage and clustering of regulatory binding, the binding preferences of, and biological processes regulated by, transcription factors, the global transcription factor co-associations and genomic subdomains that suggest shared patterns of regulation, and identifying key transcription factors and transcription factor co-associations for fate specification of individual lineages and cell types.


Subject(s)
Caenorhabditis elegans/growth & development , Caenorhabditis elegans/genetics , Gene Expression Regulation, Developmental/genetics , Genome, Helminth/genetics , Spatio-Temporal Analysis , Transcription Factors/metabolism , Animals , Binding Sites , Caenorhabditis elegans/cytology , Caenorhabditis elegans/embryology , Caenorhabditis elegans Proteins/metabolism , Cell Lineage , Chromatin Immunoprecipitation , Genomics , Larva/cytology , Larva/genetics , Larva/growth & development , Larva/metabolism , Protein Binding
6.
Nature ; 512(7515): 453-6, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164757

ABSTRACT

Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Evolution, Molecular , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Transcription Factors/metabolism , Animals , Binding Sites , Caenorhabditis elegans/growth & development , Chromatin Immunoprecipitation , Conserved Sequence/genetics , Drosophila melanogaster/growth & development , Gene Expression Regulation, Developmental/genetics , Genome/genetics , Humans , Molecular Sequence Annotation , Nucleotide Motifs/genetics , Organ Specificity/genetics , Transcription Factors/genetics
7.
Bioinformatics ; 30(19): 2808-10, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24903420

ABSTRACT

MOTIVATION: Interpretation and communication of genomic data require flexible and quantitative tools to analyze and visualize diverse data types, and yet, a comprehensive tool to display all common genomic data types in publication quality figures does not exist to date. To address this shortcoming, we present Sushi.R, an R/Bioconductor package that allows flexible integration of genomic visualizations into highly customizable, publication-ready, multi-panel figures from common genomic data formats including Browser Extensible Data (BED), bedGraph and Browser Extensible Data Paired-End (BEDPE). Sushi.R is open source and made publicly available through GitHub (https://github.com/dphansti/Sushi) and Bioconductor (http://bioconductor.org/packages/release/bioc/html/Sushi.html).


Subject(s)
Genomics , Software , Algorithms , Computational Biology/methods , Genome-Wide Association Study , Internet , Programming Languages
8.
Nat Biotechnol ; 32(6): 562-8, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24727714

ABSTRACT

RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of a fluorescently labeled protein to >10^7 RNA targets generated on a flow cell surface by in situ transcription and intermolecular tethering of RNA to DNA. Studying the MS2 coat protein, we decompose the binding energy contributions from primary and secondary RNA structure, and observe that differences in affinity are often driven by sequence-specific changes in both association and dissociation rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis and a long-hypothesized, structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNA-MaP) provides generalizable insight into the biophysical basis and evolutionary consequences of sequence-function relationships.


Subject(s)
Evolution, Molecular , High-Throughput Nucleotide Sequencing/methods , Protein Interaction Mapping/methods , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/physiology , RNA/chemistry , RNA/physiology , Animals , Binding Sites , Humans , Protein Binding
9.
Proc Natl Acad Sci U S A ; 109(42): 16858-63, 2012 Oct 16.
Article in English | MEDLINE | ID: mdl-23035249

ABSTRACT

The ability of a protein to carry out a given function results from fundamental physicochemical properties that include the protein's structure, mechanism of action, and thermodynamic stability. Traditional approaches to study these properties have typically required the direct measurement of the property of interest, oftentimes a laborious undertaking. Although protein properties can be probed by mutagenesis, this approach has been limited by its low throughput. Recent technological developments have enabled the rapid quantification of a protein's function, such as binding to a ligand, for numerous variants of that protein. Here, we measure the ability of 47,000 variants of a WW domain to bind to a peptide ligand and use these functional measurements to identify stabilizing mutations without directly assaying stability. Our approach is rooted in the well-established concept that protein function is closely related to stability. Protein function is generally reduced by destabilizing mutations, but this decrease can be rescued by stabilizing mutations. Based on this observation, we introduce partner potentiation, a metric that uses this rescue ability to identify stabilizing mutations, and identify 15 candidate stabilizing mutations in the WW domain. We tested six candidates by thermal denaturation and found two highly stabilizing mutations, one more stabilizing than any previously known mutation. Thus, physicochemical properties such as stability are latent within these large-scale protein functional data and can be revealed by systematic analysis. This approach should allow other protein properties to be discovered.


Subject(s)
Epistasis, Genetic/genetics , Models, Molecular , Protein Stability , High-Throughput Nucleotide Sequencing/methods , Mutation/genetics , Structure-Activity Relationship , Thermodynamics
10.
Bioinformatics ; 27(24): 3430-1, 2011 Dec 15.
Article in English | MEDLINE | ID: mdl-22006916

ABSTRACT

SUMMARY: Measuring the consequences of mutation in proteins is critical to understanding their function. These measurements are essential in such applications as protein engineering, drug development, protein design and genome sequence analysis. Recently, high-throughput sequencing has been coupled to assays of protein activity, enabling the analysis of large numbers of mutations in parallel. We present Enrich, a tool for analyzing such deep mutational scanning data. Enrich identifies all unique variants (mutants) of a protein in high-throughput sequencing datasets and can correct for sequencing errors using overlapping paired-end reads. Enrich uses the frequency of each variant before and after selection to calculate an enrichment ratio, which is used to estimate fitness. Enrich provides an interactive interface to guide users. It generates user-accessible output for downstream analyses as well as several visualizations of the effects of mutation on function, thereby allowing the user to rapidly quantify and comprehend sequence-function relationships. AVAILABILITY AND IMPLEMENTATION: Enrich is implemented in Python and is available under a FreeBSD license at http://depts.washington.edu/sfields/software/enrich/. Enrich includes detailed documentation as well as a small example dataset. CONTACT: dfowler@uw.edu; fields@uw.edu SUPPLEMENTARY INFORMATION: Supplementary data is available at Bioinformatics online.
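
The enrichment ratio mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the simplest reading of the description: each variant's frequency after selection divided by its frequency before selection. The function name, the input layout and the read counts are hypothetical, and the actual Enrich implementation may differ (for example in normalization, log transformation or handling of low counts).

```python
def enrichment_ratios(before_counts, after_counts):
    """Return {variant: frequency_after / frequency_before}.

    Illustrative only; not taken from the Enrich source code.
    Variants with zero reads before selection are skipped because
    their ratio is undefined.
    """
    total_before = sum(before_counts.values())
    total_after = sum(after_counts.values())
    if total_before == 0 or total_after == 0:
        raise ValueError("both libraries need at least one read")

    ratios = {}
    for variant, n_before in before_counts.items():
        if n_before == 0:
            continue
        freq_before = n_before / total_before
        # A variant absent after selection has frequency 0.
        freq_after = after_counts.get(variant, 0) / total_after
        ratios[variant] = freq_after / freq_before
    return ratios


# Hypothetical read counts for two variants.
before = {"WT": 50, "A1G": 50}
after = {"WT": 25, "A1G": 75}
ratios = enrichment_ratios(before, after)
```

Under this reading, a ratio above 1 means the variant became more frequent during selection and a ratio below 1 means it was depleted.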


Subject(s)
Proteins/genetics , Proteins/metabolism , Software , Computational Biology , DNA Mutational Analysis , Genome , High-Throughput Nucleotide Sequencing , Mutation , Proteins/chemistry
11.
Trends Biotechnol ; 29(9): 435-42, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21561674

ABSTRACT

Analysis of protein mutants is an effective means to understand their function. Protein display is an approach that allows large numbers of mutants of a protein to be selected based on their activity, but only a handful with maximal activity have been traditionally identified for subsequent functional analysis. However, the recent application of high-throughput sequencing (HTS) to protein display and selection has enabled simultaneous assessment of the function of hundreds of thousands of mutants that span the activity range from high to low. Such deep mutational scanning approaches are rapid and inexpensive with the potential for broad utility. In this review, we discuss the emergence of deep mutational scanning, the challenges associated with its use and some of its exciting applications.


Subject(s)
High-Throughput Screening Assays , Mutagenesis, Site-Directed , Mutation , Proteins/genetics , Amino Acid Sequence , Base Sequence , Computational Biology , Molecular Sequence Data , Proteins/chemistry
12.
G3 (Bethesda) ; 1(7): 549-58, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22384366

ABSTRACT

The assessment of transcriptional regulation requires a genome-wide survey of active RNA polymerases. Thus, we combined the nuclear run-on assay, which labels and captures nascent transcripts, with high-throughput DNA sequencing to examine transcriptional activity in exponentially growing Saccharomyces cerevisiae. Sequence read data from these nuclear run-on libraries revealed that transcriptional regulation in yeast occurs not only at the level of RNA polymerase recruitment to promoters but also at postrecruitment steps. Nascent synthesis signals are strongly enriched at TSS throughout the yeast genome, particularly at histone loci. Nascent transcripts reveal antisense transcription for more than 300 genes, with the read data providing support for the activity of distinct promoters driving transcription in opposite directions rather than bidirectional transcription from single promoters. By monitoring total RNA in parallel, we found that transcriptional activity accounts for 80% of the variance in transcript abundance. We computed RNA stabilities from nascent and steady-state transcripts for each gene and found that the most stable and unstable transcripts encode proteins whose functional roles are consistent with these stabilities. We also surveyed transcriptional activity after heat shock and found that most, but not all, heat shock-inducible genes increase their abundance by increasing their RNA synthesis. In summary, this study provides a genome-wide view of RNA polymerase activity in yeast, identifies regulatory steps in the synthesis of transcripts, and analyzes transcript stabilities.

13.
Nat Methods ; 7(9): 741-6, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20711194

ABSTRACT

We present a large-scale approach to investigate the functional consequences of sequence variation in a protein. The approach entails the display of hundreds of thousands of protein variants, moderate selection for activity and high-throughput DNA sequencing to quantify the performance of each variant. Using this strategy, we tracked the performance of >600,000 variants of a human WW domain after three and six rounds of selection by phage display for binding to its peptide ligand. Binding properties of these variants defined a high-resolution map of mutational preference across the WW domain; each position had unique features that could not be captured by a few representative mutations. Our approach could be applied to many in vitro or in vivo protein assays, providing a general means for understanding how protein function relates to sequence.


Subject(s)
High-Throughput Screening Assays/methods , Protein Array Analysis/methods , Proteins/chemistry , Proteins/metabolism , DNA/genetics , Databases, Nucleic Acid , Humans , Peptide Library , Proteins/genetics , Sequence Analysis, DNA , Structure-Activity Relationship
14.
BMC Genomics ; 11: 88, 2010 Feb 03.
Article in English | MEDLINE | ID: mdl-20128923

ABSTRACT

BACKGROUND: Experimental evolution of microbial populations provides a unique opportunity to study evolutionary adaptation in response to controlled selective pressures. However, until recently it has been difficult to identify the precise genetic changes underlying adaptation at a genome-wide scale. New DNA sequencing technologies now allow the genome of parental and evolved strains of microorganisms to be rapidly determined. RESULTS: We sequenced >93.5% of the genome of a laboratory-evolved strain of the yeast Saccharomyces cerevisiae and its ancestor at >28x depth. Both single nucleotide polymorphisms and copy number amplifications were found, with specific gains over array-based methodologies previously used to analyze these genomes. Applying a segmentation algorithm to quantify structural changes, we determined the approximate genomic boundaries of a 5x gene amplification. These boundaries guided the recovery of breakpoint sequences, which provide insights into the nature of a complex genomic rearrangement. CONCLUSIONS: This study suggests that whole-genome sequencing can provide a rapid approach to uncover the genetic basis of evolutionary adaptations, with further applications in the study of laboratory selections and mutagenesis screens. In addition, we show how single-end, short read sequencing data can provide detailed information about structural rearrangements, and generate predictions about the genomic features and processes that underlie genome plasticity.


Subject(s)
Genome, Fungal , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA/methods , Algorithms , Chromosome Breakpoints , DNA, Fungal/genetics , Evolution, Molecular , Gene Dosage , Genomic Library , Point Mutation , Polymorphism, Single Nucleotide
15.
BMC Bioinformatics ; 7: 275, 2006 Jun 01.
Article in English | MEDLINE | ID: mdl-16740163

ABSTRACT

BACKGROUND: The invariant lineage of the nematode Caenorhabditis elegans has potential as a powerful tool for the description of mutant phenotypes and gene expression patterns. We previously described procedures for the imaging and automatic extraction of the cell lineage from C. elegans embryos. That method uses time-lapse confocal imaging of a strain expressing histone-GFP fusions; a software package, StarryNite, processes the thousands of images and produces output files that describe the location and lineage relationship of each nucleus at each time point. RESULTS: We have developed a companion software package, AceTree, which links the images and the annotations using tree representations of the lineage. This facilitates curation and editing of the lineage. AceTree also contains powerful visualization and interpretive tools, such as space filling models and tree-based expression patterning, that can be used to extract biological significance from the data. CONCLUSION: By pairing a fast lineaging program written in C with a user interface program written in Java we have produced a powerful software suite for exploring embryonic development.


Subject(s)
Caenorhabditis elegans/embryology , Computational Biology/methods , Embryo, Nonmammalian/pathology , Animals , Cell Lineage , Green Fluorescent Proteins/chemistry , Histones/chemistry , Mutation , Phenotype , Phylogeny , Programming Languages , Software , User-Computer Interface