1.

Development of a clinical polygenic risk score assay and reporting workflow.

Hao, Limin; Kraft, Peter; Berriz, Gabriel F; Hynes, Elizabeth D; Koch, Christopher; Korategere V Kumar, Prathik; Parpattedar, Shruti S; Steeves, Marcie; Yu, Wanfeng; Antwi, Ashley A; Brunette, Charles A; Danowski, Morgan; Gala, Manish K; Green, Robert C; Jones, Natalie E; Lewis, Anna C F; Lubitz, Steven A; Natarajan, Pradeep; Vassy, Jason L; Lebo, Matthew S.

Nat Med ; 28(5): 1006-1013, 2022 05.

Article in English | MEDLINE | ID: mdl-35437332

ABSTRACT

Implementation of polygenic risk scores (PRS) may improve disease prevention and management but poses several challenges: the construction of clinically valid assays, interpretation for individual patients, and the development of clinical workflows and resources to support their use in patient care. For the ongoing Veterans Affairs Genomic Medicine at Veterans Affairs (GenoVA) Study we developed a clinical genotype array-based assay for six published PRS. We used data from 36,423 Mass General Brigham Biobank participants and adjustment for population structure to replicate known PRS-disease associations and published PRS thresholds for a disease odds ratio (OR) of 2 (ranging from 1.75 (95% CI: 1.57-1.95) for type 2 diabetes to 2.38 (95% CI: 2.07-2.73) for breast cancer). After confirming the high performance and robustness of the pipeline for use as a clinical assay for individual patients, we analyzed the first 227 prospective samples from the GenoVA Study and found that the frequency of PRS corresponding to published OR > 2 ranged from 13/227 (5.7%) for colorectal cancer to 23/150 (15.3%) for prostate cancer. In addition to the PRS laboratory report, we developed physician- and patient-oriented informational materials to support decision-making about PRS results. Our work illustrates the generalizable development of a clinical PRS assay for multiple conditions and the technical, reporting and clinical workflow challenges for implementing PRS information in the clinic.

Subject(s)

Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Genetic Predisposition to Disease , Humans , Male , Prospective Studies , Risk Factors , Workflow

2.

Genome-wide functional analysis of human 5' untranslated region introns.

Cenik, Can; Derti, Adnan; Mellor, Joseph C; Berriz, Gabriel F; Roth, Frederick P.

Genome Biol ; 11(3): R29, 2010.

Article in English | MEDLINE | ID: mdl-20222956

ABSTRACT

BACKGROUND: Approximately 35% of human genes contain introns within the 5' untranslated region (UTR). Introns in 5'UTRs differ from those in coding regions and 3'UTRs with respect to nucleotide composition, length distribution and density. Despite their presumed impact on gene regulation, the evolution and possible functions of 5'UTR introns remain largely unexplored. RESULTS: We performed a genome-scale computational analysis of 5'UTR introns in humans. We discovered that the most highly expressed genes tended to have short 5'UTR introns rather than having long 5'UTR introns or lacking 5'UTR introns entirely. Although we found no correlation in 5'UTR intron presence or length with variance in expression across tissues, which might have indicated a broad role in expression-regulation, we observed an uneven distribution of 5'UTR introns amongst genes in specific functional categories. In particular, genes with regulatory roles were surprisingly enriched in having 5'UTR introns. Finally, we analyzed the evolution of 5'UTR introns in non-receptor protein tyrosine kinases (NRTK), and identified a conserved DNA motif enriched within the 5'UTR introns of human NRTKs. CONCLUSIONS: Our results suggest that human 5'UTR introns enhance the expression of some genes in a length-dependent manner. While many 5'UTR introns are likely to be evolving neutrally, their relationship with gene expression and overrepresentation among regulatory genes, taken together, suggest that complex evolutionary forces are acting on this distinct class of introns.

Subject(s)

5' Untranslated Regions/genetics , Evolution, Molecular , Gene Expression Regulation/genetics , Genomics/methods , Introns/genetics , Gene Expression Profiling , Humans , Models, Genetic , Protein-Tyrosine Kinases/genetics

3.

Next generation software for functional trend analysis.

Berriz, Gabriel F; Beaver, John E; Cenik, Can; Tasan, Murat; Roth, Frederick P.

Bioinformatics ; 25(22): 3043-4, 2009 Nov 15.

Article in English | MEDLINE | ID: mdl-19717575

ABSTRACT

UNLABELLED: FuncAssociate is a web application that discovers properties enriched in lists of genes or proteins that emerge from large-scale experimentation. Here we describe an updated application with a new interface and several new features. For example, enrichment analysis can now be performed within multiple gene- and protein-naming systems. This feature avoids potentially serious translation artifacts to which other enrichment analysis strategies are subject. AVAILABILITY: The FuncAssociate web application is freely available to all users at http://llama.med.harvard.edu/funcassociate.

Subject(s)

Computational Biology/methods , Software , Databases, Factual , Proteins/chemistry , User-Computer Interface

4.

The Synergizer service for translating gene, protein and other biological identifiers.

Berriz, Gabriel F; Roth, Frederick P.

Bioinformatics ; 24(19): 2272-3, 2008 Oct 01.

Article in English | MEDLINE | ID: mdl-18697767

ABSTRACT

UNLABELLED: The Synergizer is a database and web service that provides translations of biological database identifiers. It is accessible both programmatically and interactively. AVAILABILITY: The Synergizer is freely available to all users interactively via a web application (http://llama.med.harvard.edu/synergizer/translate) and programmatically via a web service. Clients implementing the Synergizer application programming interface (API) are also freely available. Please visit http://llama.med.harvard.edu/synergizer/doc for details.

Subject(s)

Databases, Factual , Databases, Genetic , Databases, Protein , Algorithms , Information Storage and Retrieval , Proteins/chemistry , Proteins/genetics

5.

A critical assessment of Mus musculus gene function prediction using integrated genomic evidence.

Peña-Castillo, Lourdes; Tasan, Murat; Myers, Chad L; Lee, Hyunju; Joshi, Trupti; Zhang, Chao; Guan, Yuanfang; Leone, Michele; Pagnani, Andrea; Kim, Wan Kyu; Krumpelman, Chase; Tian, Weidong; Obozinski, Guillaume; Qi, Yanjun; Mostafavi, Sara; Lin, Guan Ning; Berriz, Gabriel F; Gibbons, Francis D; Lanckriet, Gert; Qiu, Jian; Grant, Charles; Barutcuoglu, Zafer; Hill, David P; Warde-Farley, David; Grouios, Chris; Ray, Debajyoti; Blake, Judith A; Deng, Minghua; Jordan, Michael I; Noble, William S; Morris, Quaid; Klein-Seetharaman, Judith; Bar-Joseph, Ziv; Chen, Ting; Sun, Fengzhu; Troyanskaya, Olga G; Marcotte, Edward M; Xu, Dong; Hughes, Timothy R; Roth, Frederick P.

Genome Biol ; 9 Suppl 1: S2, 2008.

Article in English | MEDLINE | ID: mdl-18613946

ABSTRACT

BACKGROUND: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. RESULTS: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. CONCLUSION: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.

Subject(s)

Algorithms , Mice/genetics , Proteins/genetics , Proteins/metabolism , Animals , Mice/metabolism

6.

Metabolomic identification of novel biomarkers of myocardial ischemia.

Sabatine, Marc S; Liu, Emerson; Morrow, David A; Heller, Eric; McCarroll, Robert; Wiegand, Roger; Berriz, Gabriel F; Roth, Frederick P; Gerszten, Robert E.

Circulation ; 112(25): 3868-75, 2005 Dec 20.

Article in English | MEDLINE | ID: mdl-16344383

ABSTRACT

BACKGROUND: Recognition of myocardial ischemia is critical both for the diagnosis of coronary artery disease and the selection and evaluation of therapy. Recent advances in proteomic and metabolic profiling technologies may offer the possibility of identifying novel biomarkers and pathways activated in myocardial ischemia. METHODS AND RESULTS: Blood samples were obtained before and after exercise stress testing from 36 patients, 18 of whom demonstrated inducible ischemia (cases) and 18 of whom did not (controls). Plasma was fractionated by liquid chromatography, and profiling of analytes was performed with a high-sensitivity electrospray triple-quadrupole mass spectrometer under selected reaction monitoring conditions. Lactic acid and metabolites involved in skeletal muscle AMP catabolism increased after exercise in both cases and controls. In contrast, there was significant discordant regulation of multiple metabolites that either increased or decreased in cases but remained unchanged in controls. Functional pathway trend analysis with the use of novel software revealed that 6 members of the citric acid pathway were among the 23 most changed metabolites in cases (adjusted P=0.04). Furthermore, changes in 6 metabolites, including citric acid, differentiated cases from controls with a high degree of accuracy (P<0.0001; cross-validated c-statistic=0.83). CONCLUSIONS: We report the novel application of metabolomics to acute myocardial ischemia, in which we identified novel biomarkers of ischemia, and from pathway trend analysis, coordinate changes in groups of functionally related metabolites.

Subject(s)

Metabolism/physiology , Myocardial Ischemia/diagnosis , Adenosine Monophosphate/metabolism , Aged , Biomarkers/blood , Biomarkers/metabolism , Case-Control Studies , Chromatography, High Pressure Liquid , Citric Acid/metabolism , Exercise Test , Female , Humans , Lactic Acid/blood , Male , Middle Aged , Muscle, Skeletal/metabolism , Risk , Spectrometry, Mass, Electrospray Ionization

7.

Towards a proteome-scale map of the human protein-protein interaction network.

Rual, Jean-François; Venkatesan, Kavitha; Hao, Tong; Hirozane-Kishikawa, Tomoko; Dricot, Amélie; Li, Ning; Berriz, Gabriel F; Gibbons, Francis D; Dreze, Matija; Ayivi-Guedehoussou, Nono; Klitgord, Niels; Simon, Christophe; Boxem, Mike; Milstein, Stuart; Rosenberg, Jennifer; Goldberg, Debra S; Zhang, Lan V; Wong, Sharyl L; Franklin, Giovanni; Li, Siming; Albala, Joanna S; Lim, Janghoo; Fraughton, Carlene; Llamosas, Estelle; Cevik, Sebiha; Bex, Camille; Lamesch, Philippe; Sikorski, Robert S; Vandenhaute, Jean; Zoghbi, Huda Y; Smolyar, Alex; Bosak, Stephanie; Sequerra, Reynaldo; Doucette-Stamm, Lynn; Cusick, Michael E; Hill, David E; Roth, Frederick P; Vidal, Marc.

Nature ; 437(7062): 1173-8, 2005 Oct 20.

Article in English | MEDLINE | ID: mdl-16189514

ABSTRACT

Systematic mapping of protein-protein interactions, or 'interactome' mapping, was initiated in model organisms, starting with defined biological processes and then expanding to the scale of the proteome. Although far from complete, such maps have revealed global topological and dynamic features of interactome networks that relate to known biological properties, suggesting that a human interactome map will provide insight into development and disease mechanisms at a systems level. Here we describe an initial version of a proteome-scale map of human binary protein-protein interactions. Using a stringent, high-throughput yeast two-hybrid system, we tested pairwise interactions among the products of approximately 8,100 currently available Gateway-cloned open reading frames and detected approximately 2,800 interactions. This data set, called CCSB-HI1, has a verification rate of approximately 78% as revealed by an independent co-affinity purification assay, and correlates significantly with other biological attributes. The CCSB-HI1 data set increases by approximately 70% the set of available binary interactions within the tested space and reveals more than 300 new connections to over 100 disease-associated proteins. This work represents an important step towards a systematic and comprehensive human interactome project.

Subject(s)

Proteome/metabolism , Cloning, Molecular , Humans , Open Reading Frames/genetics , Protein Binding , Proteome/genetics , RNA/genetics , RNA/metabolism , Saccharomyces cerevisiae/genetics , Two-Hybrid System Techniques

8.

Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis.

Gunsalus, Kristin C; Ge, Hui; Schetter, Aaron J; Goldberg, Debra S; Han, Jing-Dong J; Hao, Tong; Berriz, Gabriel F; Bertin, Nicolas; Huang, Jerry; Chuang, Ling-Shiang; Li, Ning; Mani, Ramamurthy; Hyman, Anthony A; Sönnichsen, Birte; Echeverri, Christophe J; Roth, Frederick P; Vidal, Marc; Piano, Fabio.

Nature ; 436(7052): 861-5, 2005 Aug 11.

Article in English | MEDLINE | ID: mdl-16094371

ABSTRACT

Although numerous fundamental aspects of development have been uncovered through the study of individual genes and proteins, system-level models are still missing for most developmental processes. The first two cell divisions of Caenorhabditis elegans embryogenesis constitute an ideal test bed for a system-level approach. Early embryogenesis, including processes such as cell division and establishment of cellular polarity, is readily amenable to large-scale functional analysis. A first step toward a system-level understanding is to provide 'first-draft' models both of the molecular assemblies involved and of the functional connections between them. Here we show that such models can be derived from an integrated gene/protein network generated from three different types of functional relationship: protein interaction, expression profiling similarity and phenotypic profiling similarity, as estimated from detailed early embryonic RNA interference phenotypes systematically recorded for hundreds of early embryogenesis genes. The topology of the integrated network suggests that C. elegans early embryogenesis is achieved through coordination of a limited set of molecular machines. We assessed the overall predictive value of such molecular machine models by dynamic localization of ten previously uncharacterized proteins within the living embryo.

Subject(s)

Caenorhabditis elegans/embryology , Caenorhabditis elegans/metabolism , Embryonic Development , Models, Biological , Systems Biology/methods , Algorithms , Animals , Caenorhabditis elegans/cytology , Caenorhabditis elegans/genetics , Cell Division , Cell Polarity , Embryonic Development/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental , Phenotype , Protein Binding , RNA Interference , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism

9.

Evidence for dynamically organized modularity in the yeast protein-protein interaction network.

Han, Jing-Dong J; Bertin, Nicolas; Hao, Tong; Goldberg, Debra S; Berriz, Gabriel F; Zhang, Lan V; Dupuy, Denis; Walhout, Albertha J M; Cusick, Michael E; Roth, Frederick P; Vidal, Marc.

Nature ; 430(6995): 88-93, 2004 Jul 01.

Article in English | MEDLINE | ID: mdl-15190252

ABSTRACT

In apparently scale-free protein-protein interaction networks, or 'interactome' networks, most proteins interact with few partners, whereas a small but significant proportion of proteins, the 'hubs', interact with many partners. Both biological and non-biological scale-free networks are particularly resistant to random node removal but are extremely sensitive to the targeted removal of hubs. A link between the potential scale-free topology of interactome networks and genetic robustness seems to exist, because knockouts of yeast genes encoding hubs are approximately threefold more likely to confer lethality than those of non-hubs. Here we investigate how hubs might contribute to robustness and other cellular properties for protein-protein interactions dynamically regulated both in time and in space. We uncovered two types of hub: 'party' hubs, which interact with most of their partners simultaneously, and 'date' hubs, which bind their different partners at different times or locations. Both in silico studies of network connectivity and genetic interactions described in vivo support a model of organized modularity in which date hubs organize the proteome, connecting biological processes--or modules--to each other, whereas party hubs function inside modules.

Subject(s)

Fungal Proteins/metabolism , Models, Biological , Yeasts/metabolism , Computer Simulation , Fungal Proteins/genetics , Genes, Fungal/genetics , Protein Binding , Yeasts/genetics

10.

Global mapping of the yeast genetic interaction network.

Tong, Amy Hin Yan; Lesage, Guillaume; Bader, Gary D; Ding, Huiming; Xu, Hong; Xin, Xiaofeng; Young, James; Berriz, Gabriel F; Brost, Renee L; Chang, Michael; Chen, YiQun; Cheng, Xin; Chua, Gordon; Friesen, Helena; Goldberg, Debra S; Haynes, Jennifer; Humphries, Christine; He, Grace; Hussein, Shamiza; Ke, Lizhu; Krogan, Nevan; Li, Zhijian; Levinson, Joshua N; Lu, Hong; Ménard, Patrice; Munyana, Christella; Parsons, Ainslie B; Ryan, Owen; Tonikian, Raffi; Roberts, Tania; Sdicu, Anne-Marie; Shapiro, Jesse; Sheikh, Bilal; Suter, Bernhard; Wong, Sharyl L; Zhang, Lan V; Zhu, Hongwei; Burd, Christopher G; Munro, Sean; Sander, Chris; Rine, Jasper; Greenblatt, Jack; Peter, Matthias; Bretscher, Anthony; Bell, Graham; Roth, Frederick P; Brown, Grant W; Andrews, Brenda; Bussey, Howard; Boone, Charles.

Science ; 303(5659): 808-13, 2004 Feb 06.

Article in English | MEDLINE | ID: mdl-14764870

ABSTRACT

A genetic interaction network containing approximately 1000 genes and approximately 4000 interactions was mapped by crossing mutations in 132 different query genes into a set of approximately 4700 viable yeast gene deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.

Subject(s)

Genes, Fungal , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Amino Acid Sequence , Computational Biology , Cystic Fibrosis/genetics , Gene Deletion , Genes, Essential , Genetic Diseases, Inborn/genetics , Genotype , Humans , Molecular Sequence Data , Multifactorial Inheritance , Mutation , Phenotype , Polymorphism, Genetic , Retinitis Pigmentosa/genetics , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics

11.

A map of the interactome network of the metazoan C. elegans.

Li, Siming; Armstrong, Christopher M; Bertin, Nicolas; Ge, Hui; Milstein, Stuart; Boxem, Mike; Vidalain, Pierre-Olivier; Han, Jing-Dong J; Chesneau, Alban; Hao, Tong; Goldberg, Debra S; Li, Ning; Martinez, Monica; Rual, Jean-François; Lamesch, Philippe; Xu, Lai; Tewari, Muneesh; Wong, Sharyl L; Zhang, Lan V; Berriz, Gabriel F; Jacotot, Laurent; Vaglio, Philippe; Reboul, Jérôme; Hirozane-Kishikawa, Tomoko; Li, Qianru; Gabel, Harrison W; Elewa, Ahmed; Baumgartner, Bridget; Rose, Debra J; Yu, Haiyuan; Bosak, Stephanie; Sequerra, Reynaldo; Fraser, Andrew; Mango, Susan E; Saxton, William M; Strome, Susan; Van Den Heuvel, Sander; Piano, Fabio; Vandenhaute, Jean; Sardet, Claude; Gerstein, Mark; Doucette-Stamm, Lynn; Gunsalus, Kristin C; Harper, J Wade; Cusick, Michael E; Roth, Frederick P; Hill, David E; Vidal, Marc.

Science ; 303(5657): 540-3, 2004 Jan 23.

Article in English | MEDLINE | ID: mdl-14704431

ABSTRACT

To initiate studies on how protein-protein interaction (or "interactome") networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains approximately 5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.

Subject(s)

Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/metabolism , Proteome/metabolism , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans Proteins/genetics , Computational Biology , Evolution, Molecular , Genes, Helminth , Genomics , Open Reading Frames , Phenotype , Protein Binding , Transcription, Genetic , Two-Hybrid System Techniques

12.

Characterizing gene sets with FuncAssociate.

Berriz, Gabriel F; King, Oliver D; Bryant, Barbara; Sander, Chris; Roth, Frederick P.

Bioinformatics ; 19(18): 2502-4, 2003 Dec 12.

Article in English | MEDLINE | ID: mdl-14668247

ABSTRACT

SUMMARY: FuncAssociate is a web-based tool to help researchers use Gene Ontology attributes to characterize large sets of genes derived from experiment. Distinguishing features of FuncAssociate include the ability to handle ranked input lists, and a Monte Carlo simulation approach that is more appropriate to determine significance than other methods, such as Bonferroni or Šidák p-value correction. FuncAssociate currently supports 10 organisms (Vibrio cholerae, Shewanella oneidensis, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Rattus norvegicus and Homo sapiens). AVAILABILITY: FuncAssociate is freely accessible at http://llama.med.harvard.edu/Software.html. Source code (in Perl and C) is freely available to academic users 'as is'.

Subject(s)

Algorithms , Databases, Protein , Gene Expression Profiling/methods , Natural Language Processing , Proteins/genetics , Proteins/metabolism , Software , Animals , Database Management Systems , Information Storage and Retrieval/methods , Proteins/chemistry , Proteins/classification , Rats , Reproducibility of Results , Sensitivity and Specificity , Sequence Analysis, Protein/methods , Terminology as Topic

13.

GoFish finds genes with combinations of Gene Ontology attributes.

Berriz, Gabriel F; White, James V; King, Oliver D; Roth, Frederick P.

Bioinformatics ; 19(6): 788-9, 2003 Apr 12.

Article in English | MEDLINE | ID: mdl-12691998

ABSTRACT

SUMMARY: GoFish is a Java application that allows users to search for gene products with particular Gene Ontology (GO) attributes, or combinations of attributes. GoFish ranks gene products by the degree to which they satisfy a Boolean query. Four organisms are currently supported: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Mus musculus.

Subject(s)

Gene Expression Profiling/methods , Hypermedia , Phylogeny , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , User-Computer Interface , Amino Acid Sequence , Molecular Sequence Data , Proteomics/methods
