Results 1 - 20 of 27
1.
Prostate Cancer Prostatic Dis ; 22(4): 531-538, 2019 12.
Article in English | MEDLINE | ID: mdl-30804427

ABSTRACT

BACKGROUND: Metastatic castration resistant prostate cancer (mCRPC) is incurable and progression after drugs that target the androgen receptor-signaling axis is inevitable. Thus, there is an urgent need to develop more effective treatments beyond hormonal manipulation. We sought to identify activated kinases in mCRPC as therapeutic targets for existing, approved agents, with the goal of identifying candidate drugs for rapid translation into proof of concept Phase II trials in mCRPC. METHODS: To identify evidence of activation of druggable kinases in these patients, we compared mRNA expression from metastatic biopsies of patients with mCRPC (n = 101) to mRNA expression in localized prostate cancer from TCGA and used this analysis to infer differential kinase activity. In addition, we assessed the differential phosphorylation levels for key MAPK pathway kinases between mCRPC and localized prostate cancers. RESULTS: Transcriptomic profiling of 101 patients with mCRPC as compared to patients with localized prostate cancer identified evidence of hyperactive ERK1, and whole genome sequencing revealed frequent amplifications of members of the MAPK pathway in 32% of this cohort. Next, we confirmed elevated levels of phosphorylated ERK1/2 in castration resistant prostate cancer as compared to untreated primary prostate cancer. We observed that the presence of detectable phosphorylated ERK1/2 in the primary tumor is associated with biochemical failure after radical prostatectomy independent of clinicopathologic features. ERK1 is the immediate downstream target of MEK1/2, which is druggable with trametinib, an approved therapeutic for melanoma. Trametinib elicited a profound biochemical and clinical response in a patient who had failed multiple prior treatments for mCRPC. CONCLUSIONS: We conclude that pharmacologic targeting of the MEK/ERK pathway may be a viable treatment strategy for patients with refractory metastatic prostate cancer. An ongoing Phase II trial tests this hypothesis.


Subject(s)
MAP Kinase Signaling System/drug effects , Mitogen-Activated Protein Kinase 3/antagonists & inhibitors , Prostatic Neoplasms, Castration-Resistant/drug therapy , Protein Kinase Inhibitors/therapeutic use , Aged , Antineoplastic Agents , Biopsy , Disease-Free Survival , Gene Amplification , Gene Expression Regulation, Neoplastic , Humans , MAP Kinase Signaling System/genetics , Male , Middle Aged , Mitogen-Activated Protein Kinase 3/genetics , Mitogen-Activated Protein Kinase 3/metabolism , Molecular Targeted Therapy/methods , Phosphorylation/drug effects , Prospective Studies , Prostate/pathology , Prostatic Neoplasms, Castration-Resistant/mortality , Prostatic Neoplasms, Castration-Resistant/pathology , Protein Kinase Inhibitors/pharmacology , Pyridones/pharmacology , Pyridones/therapeutic use , Pyrimidinones/pharmacology , Pyrimidinones/therapeutic use , RNA-Seq
2.
BMC Bioinformatics ; 19(1): 341, 2018 Sep 26.
Article in English | MEDLINE | ID: mdl-30257653

ABSTRACT

BACKGROUND: We describe a prototype implementation of a platform that could underlie a Precision Oncology Rapid Learning system. RESULTS: We describe the prototype platform, and examine some important issues and details. In the Appendix we provide a complete walk-through of the prototype platform. CONCLUSIONS: The design choices made in this implementation rest upon ten constitutive hypotheses, which, taken together, define a particular view of how a rapid learning medical platform might be defined, organized, and implemented.


Subject(s)
Medical Oncology , Precision Medicine , Software , Algorithms , Education, Medical , Humans , Publications
3.
Cancer Res ; 77(21): e111-e114, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29092953

ABSTRACT

Vast amounts of molecular data are being collected on tumor samples, which provide unique opportunities for discovering trends within and between cancer subtypes. Such cross-cancer analyses require computational methods that enable intuitive and interactive browsing of thousands of samples based on their molecular similarity. We created a portal called TumorMap to assist in exploration and statistical interrogation of high-dimensional complex "omics" data in an interactive and easily interpretable way. In the TumorMap, samples are arranged on a hexagonal grid based on their similarity to one another in the original genomic space and are rendered with Google's Map technology. While the important feature of this public portal is the ability for the users to build maps from their own data, we pre-built genomic maps from several previously published projects. We demonstrate the utility of this portal by presenting results obtained from The Cancer Genome Atlas project data. Cancer Res; 77(21); e111-4. ©2017 AACR.


Subject(s)
Computational Biology/methods , Genomics/methods , Neoplasms/genetics , Software , Chromosome Mapping/methods , Gene Regulatory Networks/genetics , Genetic Predisposition to Disease/genetics , Genome, Human/genetics , Humans , Mutation , Neoplasms/pathology , Reproducibility of Results , User-Computer Interface
4.
Genome Res ; 27(6): 997-1003, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28298429

ABSTRACT

Rapid species radiation due to adaptive changes or occupation of new ecospaces challenges our understanding of ancestral speciation and the relationships of modern species. At the molecular level, rapid radiation with successive speciations over short time periods (too short to fix polymorphic alleles) is described as incomplete lineage sorting. Incomplete lineage sorting leads to random fixation of genetic markers and hence, random signals of relationships in phylogenetic reconstructions. The situation is further complicated when you consider that the genome is a mosaic of ancestral and modern incompletely sorted sequence blocks that leads to reconstructed affiliations to one or the other relative, depending on the fixation of their shared ancestral polymorphic alleles. The laurasiatherian relationships among Chiroptera, Perissodactyla, Cetartiodactyla, and Carnivora present a prime example for such enigmatic affiliations. We performed whole-genome screenings for phylogenetically diagnostic retrotransposon insertions involving the representatives bat (Chiroptera), horse (Perissodactyla), cow (Cetartiodactyla), and dog (Carnivora), and extracted among 162,000 preselected cases 102 virtually homoplasy-free, phylogenetically informative retroelements to draw a complete picture of the highly complex evolutionary relations within Laurasiatheria. All possible evolutionary scenarios received considerable retrotransposon support, leaving us with a network of affiliations. However, the Cetartiodactyla-Carnivora relationship as well as the basal position of Chiroptera and an ancestral laurasiatherian hybridization process did exhibit some very clear, distinct signals. The significant accordance of retrotransposon presence/absence patterns and flanking nucleotide changes suggests an important influence of mosaic genome structures in the reconstruction of species histories.


Subject(s)
Chiroptera/genetics , Genetic Speciation , Genome , Horses/genetics , Phylogeny , Retroelements , Animals , Cattle , Chiroptera/classification , Chromosome Mapping , Dogs , Genetic Markers , Horses/classification , Hybridization, Genetic , Mutagenesis, Insertional , Sequence Analysis, DNA , Software
5.
Nat Commun ; 7: 12997, 2016 10 06.
Article in English | MEDLINE | ID: mdl-27708261

ABSTRACT

Tarsiers are phylogenetically located between the most basal strepsirrhines and the most derived anthropoid primates. While they share morphological features with both groups, they also possess uncommon primate characteristics, rendering their evolutionary history somewhat obscure. To investigate the molecular basis of such attributes, we present here a new genome assembly of the Philippine tarsier (Tarsius syrichta), and provide extended analyses of the genome and detailed history of transposable element insertion events. We describe the silencing of Alu monomers on the lineage leading to anthropoids, and recognize an unexpected abundance of long terminal repeat-derived and LINE1-mobilized transposed elements (Tarsius interspersed elements; TINEs). For the first time in mammals, we identify a complete mitochondrial genome insertion within the nuclear genome, then reveal tarsier-specific, positive gene selection and posit population size changes over time. The genomic resources and analyses presented here will aid efforts to more fully understand the ancient characteristics of primate genomes.


Subject(s)
Gene Silencing , Genome, Mitochondrial , Genome , Long Interspersed Nucleotide Elements , Tarsiidae/genetics , Animals , Brain/metabolism , Cell Nucleus/metabolism , DNA Transposable Elements , Female , Markov Chains , MicroRNAs/metabolism , Mitochondria/metabolism , Muscles/metabolism , Phylogeny , RNA, Small Nucleolar/metabolism
6.
Cancer Cell ; 29(4): 536-547, 2016 Apr 11.
Article in English | MEDLINE | ID: mdl-27050099

ABSTRACT

MYCN amplification and overexpression are common in neuroendocrine prostate cancer (NEPC). However, the impact of aberrant N-Myc expression in prostate tumorigenesis and the cellular origin of NEPC have not been established. We define N-Myc and activated AKT1 as oncogenic components sufficient to transform human prostate epithelial cells to prostate adenocarcinoma and NEPC with phenotypic and molecular features of aggressive, late-stage human disease. We directly show that prostate adenocarcinoma and NEPC can arise from a common epithelial clone. Further, N-Myc is required for tumor maintenance, and destabilization of N-Myc through Aurora A kinase inhibition reduces tumor burden. Our findings establish N-Myc as a driver of NEPC and a target for therapeutic intervention.


Subject(s)
Adenocarcinoma/genetics , Cell Transformation, Neoplastic/genetics , Epithelial Cells/pathology , Neoplasm Proteins/physiology , Neuroendocrine Tumors/genetics , Prostatic Neoplasms/genetics , Proto-Oncogene Proteins c-myc/physiology , Adenocarcinoma/pathology , Animals , Antineoplastic Agents/therapeutic use , Aurora Kinase A/antagonists & inhibitors , Aurora Kinase A/physiology , Azepines/therapeutic use , Cell Line, Tumor , Enzyme Activation , Epithelial Cells/metabolism , Exome , Gene Expression Regulation, Neoplastic , Genes, myc , Humans , Laser Capture Microdissection , Male , Mice, Inbred NOD , Mice, SCID , Molecular Targeted Therapy , Neoplasm Invasiveness , Neoplasm Metastasis , Neoplasm Proteins/genetics , Neoplastic Stem Cells/metabolism , Neoplastic Stem Cells/pathology , Neuroendocrine Tumors/pathology , Orchiectomy , Phenylurea Compounds/therapeutic use , Prostatic Neoplasms/pathology , Protein Kinase Inhibitors/therapeutic use , Proto-Oncogene Proteins c-akt/physiology , Pyrimidines/therapeutic use , Recombinant Fusion Proteins/metabolism , Transduction, Genetic , Xenograft Model Antitumor Assays
7.
PLoS Comput Biol ; 12(3): e1004790, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26960204

ABSTRACT

We present a novel regularization scheme called The Generalized Elastic Net (GELnet) that incorporates gene pathway information into feature selection. The proposed formulation is applicable to a wide variety of problems in which the interpretation of predictive features using known molecular interactions is desired. The method naturally steers solutions toward sets of mechanistically interlinked genes. Using experiments on synthetic data, we demonstrate that pathway-guided results maintain, and often improve, the accuracy of predictors even in cases where the full gene network is unknown. We apply the method to predict the drug response of breast cancer cell lines. GELnet is able to reveal genetic determinants of sensitivity and resistance for several compounds. In particular, for an EGFR/HER2 inhibitor, it finds a possible trans-differentiation resistance mechanism missed by the corresponding pathway agnostic approach.
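The abstract above does not spell out the penalty, so the following is only a minimal sketch of one common way a network-guided elastic net penalty is written: an L1 term with per-gene weights plus a quadratic term built from a gene-gene penalty matrix. The names `gelnet_penalty`, `network_laplacian`, `d`, `P`, `l1`, and `l2` are illustrative assumptions, and using a graph Laplacian for `P` is one possible choice rather than the authors' stated formulation.

```python
def network_laplacian(edges, n):
    """Unnormalized Laplacian (degree minus adjacency) of an undirected gene network.

    Hypothetical helper: `edges` is a list of (i, j) index pairs, `n` the gene count.
    """
    lap = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        lap[i][i] += 1.0
        lap[j][j] += 1.0
    return lap


def gelnet_penalty(w, d, P, l1, l2):
    """Assumed penalty form: l1 * sum_j d[j]*|w[j]| + (l2 / 2) * w^T P w."""
    n = len(w)
    sparsity = sum(d[j] * abs(w[j]) for j in range(n))
    smoothness = sum(w[i] * P[i][j] * w[j] for i in range(n) for j in range(n))
    return l1 * sparsity + 0.5 * l2 * smoothness


# Three genes; genes 0 and 1 interact, gene 2 is isolated.
P = network_laplacian([(0, 1)], 3)
d = [1.0, 1.0, 1.0]

# Same number of selected genes, but only the first pair is linked in the network.
linked = gelnet_penalty([1.0, 1.0, 0.0], d, P, l1=0.1, l2=1.0)
unlinked = gelnet_penalty([1.0, 0.0, 1.0], d, P, l1=0.1, l2=1.0)
print(linked < unlinked)
```

With a Laplacian as the penalty matrix, equal weights on interacting genes incur no quadratic cost, which illustrates how such a penalty can steer solutions toward sets of interlinked genes.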


Subject(s)
Chromosome Mapping/methods , Models, Genetic , Pattern Recognition, Automated/methods , Protein Interaction Mapping/methods , Proteome/genetics , Signal Transduction/genetics , Animals , Computer Simulation , Humans
8.
Proc Natl Acad Sci U S A ; 112(47): E6544-52, 2015 Nov 24.
Article in English | MEDLINE | ID: mdl-26460041

ABSTRACT

Evidence from numerous cancers suggests that increased aggressiveness is accompanied by up-regulation of signaling pathways and acquisition of properties common to stem cells. It is unclear if different subtypes of late-stage cancer vary in stemness properties and whether or not these subtypes are transcriptionally similar to normal tissue stem cells. We report a gene signature specific for human prostate basal cells that is differentially enriched in various phenotypes of late-stage metastatic prostate cancer. We FACS-purified and transcriptionally profiled basal and luminal epithelial populations from the benign and cancerous regions of primary human prostates. High-throughput RNA sequencing showed the basal population to be defined by genes associated with stem cell signaling programs and invasiveness. Application of a 91-gene basal signature to gene expression datasets from patients with organ-confined or hormone-refractory metastatic prostate cancer revealed that metastatic small cell neuroendocrine carcinoma was molecularly more stem-like than either metastatic adenocarcinoma or organ-confined adenocarcinoma. Bioinformatic analysis of the basal cell and two human small cell gene signatures identified a set of E2F target genes common between prostate small cell neuroendocrine carcinoma and primary prostate basal cells. Taken together, our data suggest that aggressive prostate cancer shares a conserved transcriptional program with normal adult prostate basal stem cells.


Subject(s)
Gene Expression Profiling , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Stem Cells/metabolism , Antigens, CD/metabolism , Epithelial Cells/metabolism , Female , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Male , Mammary Glands, Human/cytology , Neoplasm Metastasis , Neuroendocrine Tumors/genetics , Neuroendocrine Tumors/pathology , Phenotype , Proto-Oncogene Proteins c-myc/metabolism , Sequence Analysis, RNA , Signal Transduction/genetics , Transcription Factors/metabolism
9.
Mol Biol Evol ; 32(12): 3194-204, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26337548

ABSTRACT

Freed from the competition of large raptors, Paleocene carnivores could expand their newly acquired habitats in search of prey. Such changing conditions might have led to their successful distribution and rapid radiation. Today, molecular evolutionary biologists are faced, however, with the consequences of such accelerated adaptive radiations, because they led to sequential speciation more rapidly than phylogenetic markers could be fixed. The repercussion is that current genealogies based on such markers are incongruent with species trees. Our aim was to explore such conflicting phylogenetic zones of evolution during the early arctoid radiation, especially to distinguish diagnostic from misleading phylogenetic signals, and to examine other carnivore-related speciation events. We applied a combination of high-throughput computational strategies to screen carnivore and related genomes in silico for randomly inserted retroposed elements that we then used to identify inconsistent phylogenetic patterns in the Arctoidea group, which is well known for phylogenetic discordances. Our combined retrophylogenomic and in vitro wet lab approach detected hundreds of carnivore-specific insertions, many of them confirming well-established splits or identifying and solving conflicting species distributions. Our systematic genome-wide screens for Long INterspersed Elements detected homoplasy-free markers with insertion-specific truncation points that we used to distinguish phylogenetically informative markers from conflicting signals. The results were independently confirmed by phylogenetic diagnostic Short INterspersed Elements. As statistical analysis ruled out ancestral hybridization, these doubly verified but still conflicting patterns were statistically determined to be genomic remnants from a time of ancestral incomplete lineage sorting that especially accompanied large parts of Arctoidea evolution.


Subject(s)
Carnivora/genetics , Animals , Biological Evolution , Evolution, Molecular , Genetic Speciation , Genomics , Hybridization, Genetic , Long Interspersed Nucleotide Elements , Molecular Sequence Data , Phylogeny , Short Interspersed Nucleotide Elements
10.
Nucleic Acids Res ; 42(Database issue): D865-72, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217909

ABSTRACT

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.


Subject(s)
Databases, Genetic , Proteins/genetics , Animals , Exons , Genomics , Humans , Internet , Mice , Molecular Sequence Annotation , Sequence Analysis
11.
Genome Res ; 19(12): 2324-33, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.


Subject(s)
Cloning, Molecular/methods , Computational Biology/methods , DNA, Complementary/genetics , Gene Library , Genes/genetics , Mammals/genetics , Animals , DNA/biosynthesis , Humans , Mice , National Institutes of Health (U.S.) , Rats , Reverse Transcriptase Polymerase Chain Reaction , United States
12.
Genome Res ; 19(7): 1316-23, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19498102

ABSTRACT

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.


Subject(s)
Consensus Sequence , Genome , Open Reading Frames/genetics , Animals , Humans , Mice , Sequence Alignment
13.
Genome Res ; 19(5): 868-75, 2009 May.
Article in English | MEDLINE | ID: mdl-19261842

ABSTRACT

One and a half centuries after Charles Darwin and Alfred Russel Wallace outlined our current understanding of evolution, a new scientific era is dawning that enables direct observations of genetic variation. However, pure sequence-based molecular attempts to resolve the basal origin of placental mammals have so far resulted only in apparently conflicting hypotheses. By contrast, in the mammalian genomes where they were highly active, the insertion of retroelements and their comparative insertion patterns constitute a neutral, virtually homoplasy-free archive of evolutionary histories. The "presence" of a retroelement at an orthologous genomic position in two species indicates their common ancestry in contrast to its "absence" in more distant species. To resolve the placental origin controversy we extracted approximately 2 million potentially phylogenetically informative, retroposon-containing loci from representatives of the major placental mammalian lineages and found highly significant evidence challenging all current single hypotheses of their basal origin. The Exafroplacentalia hypothesis (Afrotheria as the sister group to all remaining placentals) is significantly supported by five retroposon insertions, the Epitheria hypothesis (Xenarthra as the sister group to all remaining placentals) by nine insertion patterns, and the Atlantogenata hypothesis (a monophyletic clade comprising Xenarthra and Afrotheria as the sister group to Boreotheria comprising all remaining placentals) by eight insertion patterns. These findings provide significant support for a "soft" polytomy of the major mammalian clades. Ancestral successive hybridization events and/or incomplete lineage sorting associated with short speciation intervals are viable explanations for the mosaic retroposon insertion patterns of recent placental mammals and for the futile search for a clear root dichotomy.
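The presence/absence logic described in the abstract above can be sketched as a lookup from the set of major lineages sharing an insertion to the hypothesis that insertion supports. This is a simplified illustration under stated assumptions: the function name `supported_hypothesis` and the three-lineage encoding are hypothetical, and a real screen would also verify absence in outgroups and orthology of the locus. The tallies reproduce only the counts reported in the abstract.

```python
from collections import Counter

# Which pair of lineages must share an insertion for each hypothesis,
# following the definitions given in the abstract.
PATTERN_TO_HYPOTHESIS = {
    frozenset({"Xenarthra", "Boreotheria"}): "Exafroplacentalia",  # Afrotheria is sister
    frozenset({"Afrotheria", "Boreotheria"}): "Epitheria",         # Xenarthra is sister
    frozenset({"Xenarthra", "Afrotheria"}): "Atlantogenata",       # Boreotheria is sister
}


def supported_hypothesis(present_in):
    """Return the hypothesis supported by an insertion, or None if uninformative."""
    return PATTERN_TO_HYPOTHESIS.get(frozenset(present_in))


# Informative loci encoded by the lineages that carry the retroposon,
# using the counts reported in the abstract (5, 9, and 8).
loci = (
    [("Xenarthra", "Boreotheria")] * 5
    + [("Afrotheria", "Boreotheria")] * 9
    + [("Xenarthra", "Afrotheria")] * 8
)

support = Counter(supported_hypothesis(locus) for locus in loci)
print(dict(support))
```

Because all three mutually exclusive hypotheses collect several markers each, no single root dichotomy wins, which is the mosaic pattern the abstract interprets as a "soft" polytomy.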


Subject(s)
Mammals/genetics , Retroelements/genetics , Animals , Base Sequence , Evolution, Molecular , Female , Genome , Humans , Mammals/classification , Molecular Sequence Data , Mutagenesis, Insertional , Phylogeny , Placenta/metabolism , Sequence Alignment , Xenarthra/classification , Xenarthra/genetics
14.
BMC Genomics ; 9: 466, 2008 Oct 08.
Article in English | MEDLINE | ID: mdl-18842134

ABSTRACT

BACKGROUND: Evolution via point mutations is a relatively slow process and is unlikely to completely explain the differences between primates and other mammals. By contrast, 45% of the human genome is composed of retroposed elements, many of which were inserted in the primate lineage. A subset of retroposed mRNAs (retrocopies) shows strong evidence of expression in primates, often yielding functional retrogenes. RESULTS: To identify and analyze the relatively recently evolved retrogenes, we carried out BLASTZ alignments of all human mRNAs against the human genome and scored a set of features indicative of retroposition. Of over 12,000 putative retrocopy-derived genes that arose mainly in the primate lineage, 726 with strong evidence of transcript expression were examined in detail. These mRNA retroposition events fall into three categories: I) 34 retrocopies and antisense retrocopies that added potential protein coding space and UTRs to existing genes; II) 682 complete retrocopy duplications inserted into new loci; and III) an unexpected set of 13 retrocopies that contributed out-of-frame, or antisense sequences in combination with other types of transposed elements (SINEs, LINEs, LTRs), even unannotated sequence to form potentially novel genes with no homologs outside primates. In addition to their presence in human, several of the gene candidates also had potentially viable ORFs in chimpanzee, orangutan, and rhesus macaque, underscoring their potential of function. CONCLUSION: mRNA-derived retrocopies provide raw material for the evolution of genes in a wide variety of ways, duplicating and amending the protein coding region of existing genes as well as generating the potential for new protein coding space, or non-protein coding RNAs, by unexpected contributions out of frame, in reverse orientation, or from previously non-protein coding sequence.


Subject(s)
Evolution, Molecular , Genome, Human , Retroelements , Animals , Exons , Gene Duplication , Humans , Primates/genetics
15.
Bioinformatics ; 24(5): 637-44, 2008 Mar 01.
Article in English | MEDLINE | ID: mdl-18218656

ABSTRACT

MOTIVATION: Computational annotation of protein coding genes in genomic DNA is a widely used and essential tool for analyzing newly sequenced genomes. However, current methods suffer from inaccuracy and do poorly with certain types of genes. Including additional sources of evidence of the existence and structure of genes can improve the quality of gene predictions. For many eukaryotic genomes, expressed sequence tags (ESTs) are available as evidence for genes. Related genomes that have been sequenced, annotated, and aligned to the target genome provide evidence of existence and structure of genes. RESULTS: We incorporate several different evidence sources into the gene finder AUGUSTUS. The sources of evidence are gene and transcript annotations from related species syntenically mapped to the target genome using TransMap, evolutionary conservation of DNA, mRNA and ESTs of the target species, and retroposed genes. The predictions include alternative splice variants where evidence supports it. Using only ESTs, we were able to predict at least one splice form exactly in 57% of human genes. With additional evidence from other species and human mRNAs, this number rises to 77%. Syntenic mapping is well-suited to annotate genomes closely related to genomes that are already annotated or for which extensive transcript evidence is available. Native cDNA evidence is most helpful when the alignments are used as compound information rather than independent positionwise information. AVAILABILITY: AUGUSTUS is open source and available at http://augustus.gobics.de. The gene predictions for human can be browsed and downloaded at the UCSC Genome Browser (http://genome.ucsc.edu).


Subject(s)
DNA, Complementary/genetics , Sequence Alignment , Alternative Splicing , Animals , Expressed Sequence Tags , Humans
16.
Genome Res ; 17(12): 1797-808, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17984227

ABSTRACT

This article describes a set of alignments of 28 vertebrate genome sequences that is provided by the UCSC Genome Browser. The alignments can be viewed on the Human Genome Browser (March 2006 assembly) at http://genome.ucsc.edu, downloaded in bulk by anonymous FTP from http://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz28way, or analyzed with the Galaxy server at http://g2.bx.psu.edu. This article illustrates the power of this resource for exploring vertebrate and mammalian evolution, using three examples. First, we present several vignettes involving insertions and deletions within protein-coding regions, including a look at some human-specific indels. Then we study the extent to which start codons and stop codons in the human sequence are conserved in other species, showing that start codons are in general more poorly conserved than stop codons. Finally, an investigation of the phylogenetic depth of conservation for several classes of functional elements in the human genome reveals striking differences in the rates and modes of decay in alignability. Each functional class has a distinctive period of stringent constraint, followed by decays that allow (for the case of regulatory regions) or reject (for coding regions and ultraconserved elements) insertions and deletions.
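The start- and stop-codon comparison mentioned in the abstract above can be illustrated with a small sketch that measures, for one human codon, the fraction of aligned species that keep a valid codon at the same position. The function name `conserved_fraction` and the toy codons are hypothetical, and skipping gapped or ambiguous codons is an assumption of this sketch, not the article's stated procedure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_start(codon):
    """True for the canonical start codon."""
    return codon == "ATG"


def is_stop(codon):
    """True for any of the three standard stop codons."""
    return codon in STOP_CODONS


def conserved_fraction(aligned_codons, is_conserved):
    """Fraction of species whose aligned codon passes `is_conserved`.

    Codons containing gaps ('-') or ambiguous bases ('N') are skipped.
    Returns None when no species has an alignable codon.
    """
    observed = [
        c.upper()
        for c in aligned_codons
        if c and "-" not in c and "N" not in c.upper()
    ]
    if not observed:
        return None
    return sum(1 for c in observed if is_conserved(c)) / len(observed)


# Toy alignment columns for four non-human species (made-up data).
start_column = ["ATG", "ATG", "CTG", "---"]
stop_column = ["TAA", "TGA", "TAG", "CAA"]

print(conserved_fraction(start_column, is_start))
print(conserved_fraction(stop_column, is_stop))
```

A stop codon can change to another stop codon and still count as conserved here, whereas a start codon has only one canonical form; that asymmetry is built into the two predicates.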


Subject(s)
Conserved Sequence , Databases, Genetic , Sequence Alignment/methods , Animals , Base Sequence , Cats , Cattle , Codon, Initiator/genetics , Codon, Terminator/genetics , Dogs , Genome, Human , Guinea Pigs , Humans , Mice , Molecular Sequence Data , Mutagenesis, Insertional , Rabbits , Rats , Sequence Deletion
17.
Nucleic Acids Res ; 35(Web Server issue): W152-8, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17545196

ABSTRACT

GeneHub-GEPIS is a web application that performs digital expression analysis in human and mouse tissues based on an integrated gene database. Using aggregated expressed sequence tag (EST) library information and EST counts, the application calculates the normalized gene expression levels across a large panel of normal and tumor tissues, thus providing rapid expression profiling for a given gene. The backend GeneHub component of the application contains pre-defined gene structures derived from mRNA transcript sequences from major databases and includes extensive cross references for commonly used gene identifiers. ESTs are then linked to genes based on their precise genomic locations as determined by GMAP. This genome-based approach reduces incorrect matches between ESTs and genes, thus minimizing the noise seen with previous tools. In addition, the gene-centric design makes it possible to add several important features, including text searching capabilities, the ability to accept diverse input values, expression analysis for microRNAs, basic gene annotation, batch analysis and linking between mouse and human genes. GeneHub-GEPIS is available at http://www.cgl.ucsf.edu/Research/genentech/genehub-gepis/ or http://www.gepis.org/.
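The abstract above says expression levels are normalized from EST counts and library information but does not give the formula. As a hedged illustration only, the sketch below scales a gene's EST count in each tissue by the total number of ESTs sampled from that tissue (counts per million); the function name `normalized_expression`, the scaling constant, and the example numbers are assumptions, not the tool's documented method.

```python
def normalized_expression(gene_est_counts, library_sizes, scale=1_000_000):
    """Per-tissue EST counts for one gene, scaled to counts per `scale` ESTs.

    gene_est_counts: tissue -> number of ESTs assigned to the gene
    library_sizes:   tissue -> total ESTs pooled across that tissue's libraries
    Tissues with no sampled ESTs are omitted rather than reported as zero.
    """
    result = {}
    for tissue, total in library_sizes.items():
        if total <= 0:
            continue  # nothing was sampled, so expression is unknown
        result[tissue] = gene_est_counts.get(tissue, 0) * scale / total
    return result


# Made-up counts for a single gene across three tissues.
counts = {"liver": 5, "liver tumor": 30}
sizes = {"liver": 100_000, "liver tumor": 200_000, "brain": 50_000}

for tissue, level in normalized_expression(counts, sizes).items():
    print(tissue, level)
```

Dividing by library size keeps a deeply sequenced tissue from appearing to express every gene more strongly, which is what makes comparison across a panel of normal and tumor tissues meaningful.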


Subject(s)
Algorithms , Chromosome Mapping/methods , Gene Expression Profiling/methods , Neoplasms/genetics , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Biomarkers, Tumor/genetics , Expressed Sequence Tags , Genetic Testing/methods , Humans , Internet , Neoplasms/diagnosis , Online Systems , Sequence Alignment/methods
18.
Genome Res ; 17(6): 839-51, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17568002

ABSTRACT

Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.


Subject(s)
Evolution, Molecular , Gene Duplication , Pseudogenes , Transcription, Genetic , Animals , Cell Line , Humans , Primates/genetics , Retroelements , Sequence Analysis, DNA , Species Specificity
19.
PLoS Genet ; 2(10): e168, 2006 Oct 13.
Article in English | MEDLINE | ID: mdl-17040131

ABSTRACT

Comparative genomics allows us to search the human genome for segments that were extensively changed in the last approximately 5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated elements. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.


Subject(s)
Evolution, Molecular , Genome, Human/genetics , Selection, Genetic , Animals , Base Pairing , Base Sequence , Conserved Sequence , Humans , Molecular Sequence Data , Recombination, Genetic , Regulatory Elements, Transcriptional/genetics , Sequence Analysis, DNA , Species Specificity
20.
J Biol Chem ; 281(42): 31885-93, 2006 Oct 20.
Article in English | MEDLINE | ID: mdl-16912037

ABSTRACT

The vacuolar ATPase has been implicated in a variety of physiological processes in eukaryotic cells. Bafilomycin and concanamycin, highly potent and specific inhibitors of the vacuolar ATPase, have been widely used to investigate the enzyme. Derivatives have been developed as possible therapeutic drugs. We have used random mutagenesis and site-directed mutagenesis to identify 23 residues in the c subunit involved in binding these drugs. We generated a model for the structure of the ring of c subunits in Neurospora crassa by using data from the crystal structure of the homologous subunits of the bacterium Enterococcus hirae (Murata, T., Yamato, I., Kakinuma, Y., Leslie, A. G., and Walker, J. E. (2005) Science 308, 654-659). In the model, 10 of the 11 mutation sites that confer the highest degree of resistance are closely clustered. They form a putative drug-binding pocket at the interface between helices 1 and 2 on one c subunit and helix 4 of the adjacent c subunit. The excellent fit of the N. crassa sequence to the E. hirae structure and the degree to which the structural model predicts the clustering of these residues suggest that the folding of the bacterial and eukaryotic polypeptides is very similar.


Subject(s)
Macrolides/chemistry , Neurospora crassa/enzymology , Vacuolar Proton-Translocating ATPases/chemistry , Adenosine Triphosphate/chemistry , Amino Acid Sequence , Binding Sites , Enterococcus/metabolism , Enzyme Inhibitors/pharmacology , Models, Molecular , Molecular Sequence Data , Neurospora crassa/metabolism , Protein Folding , Sequence Homology, Amino Acid