Results 1 - 20 of 23
1.
Science ; 376(6592): eabi8175, 2022 04 29.
Article in English | MEDLINE | ID: mdl-35482859

ABSTRACT

Establishing causal relationships between genetic alterations of human cancers and specific phenotypes of malignancy remains a challenge. We sequentially introduced mutations into healthy human melanocytes in up to five genes spanning six commonly disrupted melanoma pathways, forming nine genetically distinct cellular models of melanoma. We connected mutant melanocyte genotypes to malignant cell expression programs in vitro and in vivo, replicative immortality, malignancy, rapid tumor growth, pigmentation, metastasis, and histopathology. Mutations in malignant cells also affected tumor microenvironment composition and cell states. Our melanoma models shared genotype-associated expression programs with patient melanomas, and a deep learning model showed that these models partially recapitulated genotype-associated histopathological features as well. Thus, a progressive series of genome-edited human cancer models can causally connect genotypes carrying multiple mutations to phenotype.


Subject(s)
Melanoma , Skin Neoplasms , Humans , Melanocytes/metabolism , Melanoma/pathology , Mutation , Skin Neoplasms/genetics , Skin Neoplasms/pathology , Tumor Microenvironment/genetics
2.
Elife ; 8, 2019 07 08.
Article in English | MEDLINE | ID: mdl-31282856

ABSTRACT

Identifying gene expression programs underlying both cell-type identity and cellular activities (e.g. life-cycle processes, responses to environmental cues) is crucial for understanding the organization of cells and tissues. Although single-cell RNA-Seq (scRNA-Seq) can quantify transcripts in individual cells, each cell's expression profile may be a mixture of both types of programs, making them difficult to disentangle. Here, we benchmark and enhance the use of matrix factorization to solve this problem. We show with simulations that a method we call consensus non-negative matrix factorization (cNMF) accurately infers identity and activity programs, including their relative contributions in each cell. To illustrate the insights this approach enables, we apply it to published brain organoid and visual cortex scRNA-Seq datasets; cNMF refines cell types and identifies both expected (e.g. cell cycle and hypoxia) and novel activity programs, including programs that may underlie a neurosecretory phenotype and synaptogenesis.


Subject(s)
Brain/metabolism , Gene Expression Profiling/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Visual Cortex/metabolism , Algorithms , Animals , Brain/cytology , Computer Simulation , High-Throughput Nucleotide Sequencing/methods , Humans , Mice , Models, Genetic , Organoids/cytology , Organoids/metabolism , Reproducibility of Results , Visual Cortex/cytology
3.
PLoS One ; 12(6): e0178189, 2017.
Article in English | MEDLINE | ID: mdl-28594900

ABSTRACT

To further our understanding of the somatic genetic basis of uveal melanoma, we sequenced the protein-coding regions of 52 primary tumors and 3 liver metastases together with paired normal DNA. Known recurrent mutations were identified in GNAQ, GNA11, BAP1, EIF1AX, and SF3B1. The role of mutated EIF1AX was tested using loss of function approaches including viability and translational efficiency assays. Knockdown of both wild type and mutant EIF1AX was lethal to uveal melanoma cells. We probed the function of N-terminal tail EIF1AX mutations by performing RNA sequencing of polysome-associated transcripts in cells expressing endogenous wild type or mutant EIF1AX. Ribosome occupancy of the global translational apparatus was sensitive to suppression of wild type but not mutant EIF1AX. Together, these studies suggest that cells expressing mutant EIF1AX may exhibit aberrant translational regulation, which may provide clonal selective advantage in the subset of uveal melanoma that harbors this mutation.


Subject(s)
Genome, Human , Melanoma/genetics , Protein Biosynthesis/genetics , Uveal Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Eukaryotic Initiation Factor-1/genetics , Female , Humans , Male , Melanoma/pathology , Middle Aged , Mutation , Uveal Neoplasms/pathology , Young Adult
4.
Nat Genet ; 48(8): 848-55, 2016 08.
Article in English | MEDLINE | ID: mdl-27348297

ABSTRACT

Recent studies have detailed the genomic landscape of primary endometrial cancers, but the evolution of these cancers into metastases has not been characterized. We performed whole-exome sequencing of 98 tumor biopsies including complex atypical hyperplasias, primary tumors and paired abdominopelvic metastases to survey the evolutionary landscape of endometrial cancer. We expanded and reanalyzed The Cancer Genome Atlas (TCGA) data, identifying new recurrent alterations in primary tumors, including mutations in the estrogen receptor cofactor gene NRIP1 in 12% of patients. We found that likely driver events were present in both primary and metastatic tissue samples, with notable exceptions such as ARID1A mutations. Phylogenetic analyses indicated that the sampled metastases typically arose from a common ancestral subclone that was not detected in the primary tumor biopsy. These data demonstrate extensive genetic heterogeneity in endometrial cancers and relative homogeneity across metastatic sites.


Subject(s)
Abdominal Neoplasms/genetics , Biomarkers, Tumor/genetics , Endometrial Hyperplasia/genetics , Endometrial Neoplasms/genetics , Evolution, Molecular , Mutation/genetics , Pelvic Neoplasms/genetics , Abdominal Neoplasms/secondary , Disease Progression , Endometrial Hyperplasia/pathology , Endometrial Neoplasms/pathology , Exome/genetics , Female , Genomics , High-Throughput Nucleotide Sequencing , Humans , Pelvic Neoplasms/secondary , Phylogeny
5.
Genome Res ; 24(12): 1991-9, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25294245

ABSTRACT

Recent studies have shown a surprising phenomenon, whereby orthologous regulatory regions from different species drive similar expression levels despite being highly diverged in sequence. Here, we investigated this phenomenon by genomically integrating hundreds of ribosomal protein (RP) promoters from nine different yeast species into S. cerevisiae and accurately measuring their activity. We found that orthologous RP promoters have extreme expression conservation even across evolutionarily distinct yeast species. Notably, our measurements reveal two distinct mechanisms that underlie this conservation and which act in different regions of the promoter. In the core promoter region, we found compensatory changes, whereby effects of sequence variations in one part of the core promoter were reversed by variations in another part. In contrast, we observed robustness in Rap1 transcription factor binding sites, whereby significant sequence variations had little effect on promoter activity. Finally, cases in which orthologous promoter activities were not conserved could largely be explained by the sequence variation within the core promoter. Together, our results provide novel insights into the mechanisms by which expression is conserved throughout evolution across diverged promoter sequences.


Subject(s)
Promoter Regions, Genetic , Ribosomal Proteins/genetics , Saccharomyces cerevisiae/genetics , Binding Sites , Evolution, Molecular , Gene Expression Regulation, Fungal , Genetic Variation , Mutation , Protein Binding , Saccharomyces cerevisiae/metabolism , Transcription Factors/metabolism
6.
Nat Genet ; 46(12): 1264-6, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25344691

ABSTRACT

We report somatic mutations of RNF43 in over 18% of colorectal adenocarcinomas and endometrial carcinomas. RNF43 encodes an E3 ubiquitin ligase that negatively regulates Wnt signaling. Truncating mutations of RNF43 are more prevalent in microsatellite-unstable tumors and show mutual exclusivity with inactivating APC mutations in colorectal adenocarcinomas. These results indicate that RNF43 is one of the most commonly mutated genes in colorectal and endometrial cancers.


Subject(s)
Adenocarcinoma/genetics , Colorectal Neoplasms/genetics , DNA-Binding Proteins/genetics , Endometrial Neoplasms/genetics , Mutation , Oncogene Proteins/genetics , Adenomatous Polyposis Coli Protein/genetics , Exome , Female , Gene Expression Regulation, Neoplastic , Humans , Microsatellite Repeats/genetics , Phenotype , Sequence Analysis, DNA , Signal Transduction , Ubiquitin-Protein Ligases
7.
Cancer Discov ; 4(5): 546-53, 2014 May.
Article in English | MEDLINE | ID: mdl-24625776

ABSTRACT

Understanding the genetic mechanisms of sensitivity to targeted anticancer therapies may improve patient selection, response to therapy, and rational treatment designs. One approach to increase this understanding involves detailed studies of exceptional responders: rare patients with unexpected exquisite sensitivity or durable responses to therapy. We identified an exceptional responder in a phase I study of pazopanib and everolimus in advanced solid tumors. Whole-exome sequencing of a patient with a 14-month complete response on this trial revealed two concurrent mutations in mTOR, the target of everolimus. In vitro experiments demonstrate that both mutations are activating, suggesting a biologic mechanism for exquisite sensitivity to everolimus in this patient. The use of precision (or "personalized") medicine approaches to screen patients with cancer for alterations in the mTOR pathway may help to identify subsets of patients who may benefit from targeted therapies directed against mTOR.


Subject(s)
Antineoplastic Agents/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Carcinoma, Transitional Cell/drug therapy , Everolimus/administration & dosage , Pyrimidines/administration & dosage , Sulfonamides/administration & dosage , TOR Serine-Threonine Kinases/genetics , Urinary Bladder Neoplasms/drug therapy , Aged , Antineoplastic Agents/pharmacokinetics , Antineoplastic Agents/therapeutic use , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Carcinoma, Transitional Cell/genetics , Drug Administration Schedule , Everolimus/pharmacokinetics , Everolimus/therapeutic use , Female , High-Throughput Nucleotide Sequencing , Humans , Indazoles , Lymphatic Metastasis/diagnostic imaging , Lymphatic Metastasis/genetics , Male , Middle Aged , Mutation , Precision Medicine , Pyrimidines/pharmacokinetics , Pyrimidines/therapeutic use , Radionuclide Imaging , Sequence Analysis, DNA , Sulfonamides/pharmacokinetics , Sulfonamides/therapeutic use , TOR Serine-Threonine Kinases/antagonists & inhibitors , TOR Serine-Threonine Kinases/chemistry , Urinary Bladder Neoplasms/genetics
8.
Cancer Discov ; 4(1): 94-109, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265153

ABSTRACT

Most patients with BRAF(V600)-mutant metastatic melanoma develop resistance to selective RAF kinase inhibitors. The spectrum of clinical genetic resistance mechanisms to RAF inhibitors and options for salvage therapy are incompletely understood. We performed whole-exome sequencing on formalin-fixed, paraffin-embedded tumors from 45 patients with BRAF(V600)-mutant metastatic melanoma who received vemurafenib or dabrafenib monotherapy. Genetic alterations in known or putative RAF inhibitor resistance genes were observed in 23 of 45 patients (51%). Besides previously characterized alterations, we discovered a "long tail" of new mitogen-activated protein kinase (MAPK) pathway alterations (MAP2K2, MITF) that confer RAF inhibitor resistance. In three cases, multiple resistance gene alterations were observed within the same tumor biopsy. Overall, RAF inhibitor therapy leads to diverse clinical genetic resistance mechanisms, mostly involving MAPK pathway reactivation. Novel therapeutic combinations may be needed to achieve durable clinical control of BRAF(V600)-mutant melanoma. Integrating clinical genomics with preclinical screens may model subsequent resistance studies.


Subject(s)
Antineoplastic Agents/therapeutic use , Drug Resistance, Neoplasm/genetics , Melanoma/genetics , Protein Kinase Inhibitors/therapeutic use , Proto-Oncogene Proteins B-raf/antagonists & inhibitors , Skin Neoplasms/genetics , Cell Line, Tumor , Exome , Female , HEK293 Cells , Humans , Imidazoles/therapeutic use , Indoles/therapeutic use , MAP Kinase Kinase 1/genetics , MAP Kinase Kinase 2/genetics , Male , Melanoma/drug therapy , Middle Aged , Mutation , Neoplasm Metastasis , Oximes/therapeutic use , Phosphatidylinositol 3-Kinases/metabolism , Phosphoinositide-3 Kinase Inhibitors , Proto-Oncogene Proteins B-raf/genetics , Sequence Analysis, DNA , Skin Neoplasms/drug therapy , Sulfonamides/therapeutic use , Vemurafenib
9.
Cancer Discov ; 4(1): 61-8, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265154

ABSTRACT

Treatment of BRAF-mutant melanoma with combined dabrafenib and trametinib, which target RAF and the downstream MAP-ERK kinase (MEK)1 and MEK2 kinases, respectively, improves progression-free survival and response rates compared with dabrafenib monotherapy. Mechanisms of clinical resistance to combined RAF/MEK inhibition are unknown. We performed whole-exome sequencing (WES) and whole-transcriptome sequencing (RNA-seq) on pretreatment and drug-resistant tumors from five patients with acquired resistance to dabrafenib/trametinib. In three of these patients, we identified additional mitogen-activated protein kinase (MAPK) pathway alterations in the resistant tumor that were not detected in the pretreatment tumor, including a novel activating mutation in MEK2 (MEK2(Q60P)). MEK2(Q60P) conferred resistance to combined RAF/MEK inhibition in vitro, but remained sensitive to inhibition of the downstream kinase extracellular signal-regulated kinase (ERK). The continued MAPK signaling-based resistance identified in these patients suggests that alternative dosing of current agents, more potent RAF/MEK inhibitors, and/or inhibition of the downstream kinase ERK may be needed for durable control of BRAF-mutant melanoma.


Subject(s)
Drug Resistance, Neoplasm/physiology , Melanoma/genetics , Proto-Oncogene Proteins B-raf/genetics , Skin Neoplasms/genetics , Aged , Antineoplastic Agents/therapeutic use , Humans , Imidazoles/therapeutic use , Male , Melanoma/drug therapy , Melanoma/metabolism , Middle Aged , Mitogen-Activated Protein Kinase Kinases/antagonists & inhibitors , Mitogen-Activated Protein Kinase Kinases/genetics , Mitogen-Activated Protein Kinases/genetics , Mitogen-Activated Protein Kinases/metabolism , Mutation , Oximes/therapeutic use , Protein Kinase Inhibitors/therapeutic use , Proto-Oncogene Proteins B-raf/antagonists & inhibitors , Pyridones/therapeutic use , Pyrimidinones/therapeutic use , Signal Transduction , Skin Neoplasms/drug therapy , Skin Neoplasms/metabolism , raf Kinases/antagonists & inhibitors , raf Kinases/genetics
10.
Nature ; 499(7457): 214-218, 2013 Jul 11.
Article in English | MEDLINE | ID: mdl-23770567

ABSTRACT

Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer. These studies involve the sequencing of matched tumour-normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour-normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.


Subject(s)
Genetic Heterogeneity , Mutation/genetics , Neoplasms/genetics , Oncogenes/genetics , Artifacts , DNA Replication Timing , Exome/genetics , False Positive Reactions , Gene Expression , Genome, Human/genetics , Humans , Lung Neoplasms/genetics , Mutation Rate , Neoplasms/classification , Neoplasms/pathology , Neoplasms, Squamous Cell/genetics , Reproducibility of Results , Sample Size
11.
Science ; 339(6122): 957-9, 2013 Feb 22.
Article in English | MEDLINE | ID: mdl-23348506

ABSTRACT

Systematic sequencing of human cancer genomes has identified many recurrent mutations in the protein-coding regions of genes but rarely in gene regulatory regions. Here, we describe two independent mutations within the core promoter of telomerase reverse transcriptase (TERT), the gene coding for the catalytic subunit of telomerase, which collectively occur in 50 of 70 (71%) melanomas examined. These mutations generate de novo consensus binding motifs for E-twenty-six (ETS) transcription factors, and in reporter assays, the mutations increased transcriptional activity from the TERT promoter by two- to fourfold. Examination of 150 cancer cell lines derived from diverse tumor types revealed the same mutations in 24 cases (16%), with preliminary evidence of elevated frequency in bladder and hepatocellular cancer cells. Thus, somatic mutations in regulatory regions of the genome may represent an important tumorigenic mechanism.


Subject(s)
Gene Expression Regulation, Neoplastic , Melanoma/genetics , Mutation , Promoter Regions, Genetic , Telomerase/genetics , Binding Sites , Carcinoma, Hepatocellular/genetics , Cell Line, Tumor , Cell Transformation, Neoplastic , Humans , Liver Neoplasms/genetics , Proto-Oncogene Proteins c-ets/metabolism , Telomerase/chemistry , Telomerase/metabolism , Transcription, Genetic
12.
Biochem Mol Biol Educ ; 40(6): 400-1, 2012.
Article in English | MEDLINE | ID: mdl-23166030

ABSTRACT

3D visualization assists in identifying diverse mechanisms of protein-DNA recognition that can be observed for transcription factors and other DNA binding proteins. We used Proteopedia to illustrate transcription factor-DNA readout modes with a focus on DNA shape, which can be a function of either nucleotide sequence (Hox proteins) or base pairing geometry (p53). © 2012 by The International Union of Biochemistry and Molecular Biology.


Subject(s)
Biochemistry/education , DNA/chemistry , Homeodomain Proteins/chemistry , Imaging, Three-Dimensional , Models, Molecular , Molecular Sequence Annotation , Tumor Suppressor Protein p53/chemistry , Animals , Biochemistry/methods , DNA/metabolism , Homeodomain Proteins/metabolism , Humans , Tumor Suppressor Protein p53/metabolism
14.
Cell ; 150(6): 1107-20, 2012 Sep 14.
Article in English | MEDLINE | ID: mdl-22980975

ABSTRACT

Lung adenocarcinoma, the most common subtype of non-small cell lung cancer, is responsible for more than 500,000 deaths per year worldwide. Here, we report exome and genome sequences of 183 lung adenocarcinoma tumor/normal DNA pairs. These analyses revealed a mean exonic somatic mutation rate of 12.0 events/megabase and identified the majority of genes previously reported as significantly mutated in lung adenocarcinoma. In addition, we identified statistically recurrent somatic mutations in the splicing factor gene U2AF1 and truncating mutations affecting RBM10 and ARID1A. Analysis of nucleotide context-specific mutation signatures grouped the sample set into distinct clusters that correlated with smoking history and alterations of reported lung adenocarcinoma genes. Whole-genome sequence analysis revealed frequent structural rearrangements, including in-frame exonic alterations within EGFR and SIK2 kinases. The candidate genes identified in this study are attractive targets for biological characterization and therapeutic targeting of lung adenocarcinoma.


Subject(s)
Adenocarcinoma/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Genes, Neoplasm , High-Throughput Nucleotide Sequencing , Lung Neoplasms/genetics , Adenocarcinoma/pathology , Adenocarcinoma of Lung , Adult , Aged , Aged, 80 and over , Carcinoma, Non-Small-Cell Lung/pathology , Cohort Studies , Exome , Female , Genome-Wide Association Study , Humans , Lung Neoplasms/pathology , Male , Middle Aged , Mutation , Mutation Rate
15.
Cell ; 150(2): 251-63, 2012 Jul 20.
Article in English | MEDLINE | ID: mdl-22817889

ABSTRACT

Despite recent insights into melanoma genetics, systematic surveys for driver mutations are challenged by an abundance of passenger mutations caused by carcinogenic UV light exposure. We developed a permutation-based framework to address this challenge, employing mutation data from intronic sequences to control for passenger mutational load on a per gene basis. Analysis of large-scale melanoma exome data by this approach discovered six novel melanoma genes (PPP6C, RAC1, SNX31, TACC1, STK19, and ARID2), three of which (RAC1, PPP6C, and STK19) harbored recurrent and potentially targetable mutations. Integration with chromosomal copy number data contextualized the landscape of driver mutations, providing oncogenic insights in BRAF- and NRAS-driven melanoma as well as those without known NRAS/BRAF mutations. The landscape also clarified a mutational basis for RB and p53 pathway deregulation in this malignancy. Finally, the spectrum of driver mutations provided unequivocal genomic evidence for a direct mutagenic role of UV light in melanoma pathogenesis.


Subject(s)
Genome-Wide Association Study , Melanoma/genetics , Mutagenesis , Ultraviolet Rays , Amino Acid Sequence , Cells, Cultured , Exome , Humans , Melanocytes/metabolism , Models, Molecular , Molecular Sequence Data , Proto-Oncogene Proteins B-raf/genetics , Sequence Alignment , rac1 GTP-Binding Protein/genetics
16.
Nature ; 485(7399): 502-6, 2012 May 09.
Article in English | MEDLINE | ID: mdl-22622578

ABSTRACT

Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-ultraviolet-exposed hairless skin of the extremities (3 and 14 per megabase (Mb) of genome), intermediate in those originating from hair-bearing skin of the trunk (5-55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 (phosphatidylinositol-3,4,5-trisphosphate-dependent Rac exchange factor 2), a PTEN-interacting protein and negative regulator of PTEN in breast cancer, as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumour formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumours revealed genomic evidence of ultraviolet pathogenesis and discovered a new recurrently mutated gene in melanoma.


Subject(s)
Genome, Human/genetics , Guanine Nucleotide Exchange Factors/genetics , Melanoma/genetics , Mutation/genetics , Sunlight/adverse effects , Chromosome Breakpoints/radiation effects , DNA Damage , DNA Mutational Analysis , Gene Expression Regulation, Neoplastic , Guanine Nucleotide Exchange Factors/metabolism , Humans , Melanocytes/metabolism , Melanocytes/pathology , Melanoma/pathology , Mutagenesis/radiation effects , Mutation/radiation effects , Oncogenes/genetics , Ultraviolet Rays/adverse effects
17.
J Struct Biol ; 175(2): 244-52, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21536137

ABSTRACT

Proteopedia is a collaborative, 3D web-encyclopedia of protein, nucleic acid and other biomolecule structures. Created as a means for communicating biomolecule structures to a diverse scientific audience, Proteopedia (http://www.proteopedia.org) presents structural annotation in an intuitive, interactive format and allows members of the scientific community to easily contribute their own annotations. Here, we provide a status report on Proteopedia by describing advances in the web resource since its inception three and a half years ago, focusing on features of potential direct use to the scientific community. We discuss its progress as a collaborative 3D-encyclopedia of structures as well as its use as a complement to scientific publications and PowerPoint presentations. We also describe Proteopedia's use for 3D visualization in structure-related pedagogy.


Subject(s)
Encyclopedias as Topic , Online Systems , Protein Conformation , Proteins/chemistry , Information Dissemination/methods , Information Management , Information Services , Models, Molecular , Molecular Biology/education , User-Computer Interface
18.
Genome Res ; 20(11): 1582-9, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20841429

ABSTRACT

Genomes encode multiple signals, raising the question of how these different codes are organized along the linear genome sequence. Within protein-coding regions, the redundancy of the genetic code can, in principle, allow for the overlapping encoding of signals in addition to the amino acid sequence, but it is not known to what extent genomes exploit this potential and, if so, for what purpose. Here, we systematically explore whether protein-coding regions accommodate overlapping codes, by comparing the number of occurrences of each possible short sequence within the protein-coding regions of over 700 species from viruses to plants, to the same number in randomizations that preserve amino acid sequence and codon bias. We find that coding regions across all phyla encode additional information, with bacteria carrying more information than eukaryotes. The detailed signals consist of both known and potentially novel codes, including position-dependent secondary RNA structure, bacteria-specific depletion of transcription and translation initiation signals, and eukaryote-specific enrichment of microRNA target sites. Our results suggest that genomes may have evolved to encode extensive overlapping information within protein-coding regions.


Subject(s)
Amino Acid Sequence/genetics , Base Sequence , Genetic Code/physiology , Open Reading Frames/genetics , Sequence Alignment , Animals , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Eukaryotic Cells/metabolism , Genetic Code/genetics , Humans , Markov Chains , Models, Biological , Monte Carlo Method , Nucleic Acid Conformation , Open Reading Frames/physiology , Phylogeny , Proteins/genetics , Proteins/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism