Search | VHL Regional Portal

1.

Communicating computational workflows in a regulatory environment.

Keeney, Jonathon G; Gulzar, Naila; Baker, Jack B; Klempir, Ondrej; Hannigan, Geoffrey D; Bitton, Danny A; Maritz, Julia M; King, Charles H S; Patel, Janisha A; Duncan, Paul; Mazumder, Raja.

Drug Discov Today ; 29(3): 103884, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38219969

ABSTRACT

The volume of nucleic acid sequence data has exploded recently, amplifying the challenge of transforming data into meaningful information. Processing data can require an increasingly complex ecosystem of customized tools, which makes it harder to communicate analyses in a way that is understandable yet sufficiently detailed to enable informed decisions or repeat analyses. This can be of particular interest to institutions and companies communicating computations in a regulatory environment. BioCompute Objects (BCOs; an instance of pipeline documentation that conforms to the IEEE 2791-2020 standard) were developed as a standardized mechanism for analysis reporting. A suite of BCOs is presented, representing interconnected elements of a computation modeled after those that might be found in a regulatory submission but shared publicly: in this case, a pipeline designed to identify viral contaminants in biological manufacturing, such as for vaccines.

Subject(s)

Computational Biology , Vaccines , High-Throughput Nucleotide Sequencing , Workflow

2.

Communicating regulatory high-throughput sequencing data using BioCompute Objects.

King, Charles Hadley S; Keeney, Jonathon; Guimera, Nuria; Das, Souvik; Weber, Michiel; Fochtman, Brian; Walderhaug, Mark O; Talwar, Sneh; Patel, Janisha A; Mazumder, Raja; Donaldson, Eric F.

Drug Discov Today ; 27(4): 1108-1114, 2022 04.

Article in English | MEDLINE | ID: mdl-35077912

ABSTRACT

This project demonstrates the use of the IEEE 2791-2020 Standard (BioCompute Objects [BCO]) to enable the complete and concise communication of results from next generation sequencing (NGS) analysis. One arm of a clinical trial was replicated using synthetically generated data made to resemble real biological data and then two independent analyses were performed. The first simulated a pharmaceutical regulatory submission to the US Food and Drug Administration (FDA) including analysis of results and a BCO. The second simulated an FDA review that included an independent analysis of the submitted data. Of the 118 simulated patient samples generated, 117 (99.15%) were in agreement in the two analyses. This process exemplifies how a template BCO (tBCO), including a verification kit, facilitates transparency and reproducibility, thereby reinforcing confidence in the regulatory submission process.
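The agreement figure quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `percent_agreement` is a hypothetical helper, not something taken from the paper, and it assumes agreement is simply the count of concordant samples divided by the total.

```python
def percent_agreement(concordant: int, total: int) -> float:
    """Return the share of samples on which two analyses agree, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= concordant <= total:
        raise ValueError("concordant must be between 0 and total")
    return 100.0 * concordant / total


# Figures from the abstract: 117 of 118 simulated patient samples agreed.
agreement = percent_agreement(117, 118)
print(f"{agreement:.2f}%")  # prints "99.15%"
```

Rounded to two decimals, this reproduces the 99.15% reported for the two independent analyses.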

Subject(s)

High-Throughput Nucleotide Sequencing , Humans , Pharmaceutical Preparations , Reproducibility of Results , United States , United States Food and Drug Administration

3.

Bioinformatics tools developed to support BioCompute Objects.

Patel, Janisha A; Dean, Dennis A; King, Charles Hadley; Xiao, Nan; Koc, Soner; Minina, Ekaterina; Golikov, Anton; Brooks, Phillip; Kahsay, Robel; Navelkar, Rahi; Ray, Manisha; Roberson, Dave; Armstrong, Chris; Mazumder, Raja; Keeney, Jonathon.

Database (Oxford) ; 2021, 2021 03 30.

Article in English | MEDLINE | ID: mdl-33784373

ABSTRACT

Developments in high-throughput sequencing (HTS) result in an exponential increase in the amount of data generated by sequencing experiments, an increase in the complexity of bioinformatics analysis reporting and an increase in the types of data generated. These increases in volume, diversity and complexity of the data generated and their analysis expose the necessity of a structured and standardized reporting template. BioCompute Objects (BCOs) provide the requisite support for communication of HTS data analysis that includes support for workflow, as well as data, curation, accessibility and reproducibility of communication. BCOs standardize how researchers report provenance and the established verification and validation protocols used in workflows while also being robust enough to convey content integration or curation in knowledge bases. BCOs that encapsulate tools, platforms, datasets and workflows are FAIR (findable, accessible, interoperable and reusable) compliant. Providing operational workflow and data information facilitates interoperability between platforms and incorporation of future datasets within an HTS analysis for use within industrial, academic and regulatory settings. Cloud-based platforms, including High-performance Integrated Virtual Environment (HIVE), Cancer Genomics Cloud (CGC) and Galaxy, support BCO generation for users. Given the 100K+ userbase between these platforms, BioCompute can be leveraged for workflow documentation. In this paper, we report the availability of platform-dependent and platform-independent BCO tools: HIVE BCO App, CGC BCO App, Galaxy BCO API Extension and BCO Portal. Community engagement was utilized to evaluate tool efficacy. We demonstrate that these tools further advance BCO creation from text editing approaches used in earlier releases of the standard. Moreover, we demonstrate that integrating BCO generation within existing analysis platforms greatly streamlines BCO creation while capturing granular workflow details. We also demonstrate that the BCO tools described in the paper provide an approach to solve the long-standing challenge of standardizing workflow descriptions that are both human and machine readable while accommodating manual and automated curation with evidence tagging. Database URL: https://www.biocomputeobject.org/resources.

Subject(s)

Computational Biology , Genomics , High-Throughput Nucleotide Sequencing , Humans , Reproducibility of Results , Software , Workflow

4.

Recent Advances in Systems and Network Medicine: Meeting Report from the First International Conference in Systems and Network Medicine.

Kurnat-Thoma, Emma; Baranova, Ancha; Baird, Pat; Brodsky, Elia; Butte, Atul J; Cheema, Amrita K; Cheng, Feixiong; Dutta, Shuchismita; Grant, Christina; Giordano, James; Maitland-van der Zee, Anke H; Fridsma, Douglas B; Jarrin, Robert; Kann, Maricel G; Keeney, Jonathon; Loscalzo, Joseph; Madhavan, Guru; Maron, Bradley A; McBride, Dennis K; McKean, Maeve; Mun, Seong K; Palmer, James C; Patel, Bakul; Parakh, Kapil; Pariser, Anne R; Pristipino, Christian; Radstake, Timothy R D J; Rajasimha, Harsha K; Rouse, William B; Rozman, Damjana; Saleh, Alif; Schmidt, Harald H H W; Schultz, Nikolaus; Sethi, Tavpritesh; Silverman, Edwin K; Skopac, Jessica; Svab, Igor; Trujillo, Sylvia; Valentine, James E; Verma, Dinesh; West, Bruce J; Vasudevan, Sona.

Syst Med (New Rochelle) ; 3(1): 22-35, 2020.

Article in English | MEDLINE | ID: mdl-32226924

ABSTRACT

The First International Conference in Systems and Network Medicine gathered together 200 global thought leaders, scientists, clinicians, academicians, industry and government experts, medical and graduate students, postdoctoral scholars and policymakers. Held at Georgetown University Conference Center in Washington D.C. on September 11-13, 2019, the event featured a day of pre-conference lectures and hands-on bioinformatic computational workshops followed by two days of deep and diverse scientific talks, panel discussions with eminent thought leaders, and scientific poster presentations. Topics ranged from: Systems and Network Medicine in Clinical Practice; the role of -omics technologies in Health Care; the role of Education and Ethics in Clinical Practice, Systems Thinking, and Rare Diseases; and the role of Artificial Intelligence in Medicine. The conference served as a unique nexus for interdisciplinary discovery and dialogue and fostered formation of new insights and possibilities for health care systems advances.

5.

OncoMX: A Knowledgebase for Exploring Cancer Biomarkers in the Context of Related Cancer and Healthy Data.

Dingerdissen, Hayley M; Bastian, Frederic; Vijay-Shanker, K; Robinson-Rechavi, Marc; Bell, Amanda; Gogate, Nikhita; Gupta, Samir; Holmes, Evan; Kahsay, Robel; Keeney, Jonathon; Kincaid, Heather; King, Charles Hadley; Liu, David; Crichton, Daniel J; Mazumder, Raja.

JCO Clin Cancer Inform ; 4: 210-220, 2020 03.

Article in English | MEDLINE | ID: mdl-32142370

ABSTRACT

PURPOSE: The purpose of OncoMX knowledgebase development was to integrate cancer biomarker and relevant data types into a meta-portal, enabling the research of cancer biomarkers side by side with other pertinent multidimensional data types. METHODS: Cancer mutation, cancer differential expression, cancer expression specificity, healthy gene expression from human and mouse, literature mining for cancer mutation and cancer expression, and biomarker data were integrated, unified by relevant biomedical ontologies, and subjected to rule-based automated quality control before ingestion into the database. RESULTS: OncoMX provides integrated data encompassing more than 1,000 unique biomarker entries (939 from the Early Detection Research Network [EDRN] and 96 from the US Food and Drug Administration) mapped to 20,576 genes that have either mutation or differential expression in cancer. Sentences reporting mutation or differential expression in cancer were extracted from more than 40,000 publications, and healthy gene expression data with samples mapped to organs are available for both human genes and their mouse orthologs. CONCLUSION: OncoMX has prioritized user feedback as a means of guiding development priorities. By mapping to and integrating data from several cancer genomics resources, it is hoped that OncoMX will foster a dynamic engagement between bioinformaticians and cancer biomarker researchers. This engagement should culminate in a community resource that substantially improves the ability and efficiency of exploring cancer biomarker data and related multidimensional data.

Subject(s)

Biomarkers, Tumor/analysis , Computational Biology/methods , Data Mining/methods , Databases, Genetic/standards , Knowledge Bases , Neoplasms/diagnosis , Software , Animals , Biological Ontologies , Humans , Mice , Neoplasms/therapy , User-Computer Interface

6.

Enabling precision medicine via standard communication of HTS provenance, analysis, and results.

Alterovitz, Gil; Dean, Dennis; Goble, Carole; Crusoe, Michael R; Soiland-Reyes, Stian; Bell, Amanda; Hayes, Anais; Suresh, Anita; Purkayastha, Anjan; King, Charles H; Taylor, Dan; Johanson, Elaine; Thompson, Elaine E; Donaldson, Eric; Morizono, Hiroki; Tsang, Hsinyi; Vora, Jeet K; Goecks, Jeremy; Yao, Jianchao; Almeida, Jonas S; Keeney, Jonathon; Addepalli, KanakaDurga; Krampis, Konstantinos; Smith, Krista M; Guo, Lydia; Walderhaug, Mark; Schito, Marco; Ezewudo, Matthew; Guimera, Nuria; Walsh, Paul; Kahsay, Robel; Gottipati, Srikanth; Rodwell, Timothy C; Bloom, Toby; Lai, Yuching; Simonyan, Vahan; Mazumder, Raja.

PLoS Biol ; 16(12): e3000099, 2018 12.

Article in English | MEDLINE | ID: mdl-30596645

ABSTRACT

A personalized approach based on a patient's or pathogen's unique genomic sequence is the foundation of precision medicine. Genomic findings must be robust and reproducible, and experimental data capture should adhere to findable, accessible, interoperable, and reusable (FAIR) guiding principles. Moreover, effective precision medicine requires standardized reporting that extends beyond wet-lab procedures to computational methods. The BioCompute framework (https://w3id.org/biocompute/1.3.0) enables standardized reporting of genomic sequence data provenance, including provenance domain, usability domain, execution domain, verification kit, and error domain. This framework facilitates communication and promotes interoperability. Bioinformatics computation instances that employ the BioCompute framework are easily relayed, repeated if needed, and compared by scientists, regulators, test developers, and clinicians. Easing the burden of performing the aforementioned tasks greatly extends the range of practical application. Large clinical trials, precision medicine, and regulatory submissions require a set of agreed upon standards that ensures efficient communication and documentation of genomic analyses. The BioCompute paradigm and the resulting BioCompute Objects (BCOs) offer that standard and are freely accessible as a GitHub organization (https://github.com/biocompute-objects) following the "Open-Stand.org principles for collaborative open standards development." With high-throughput sequencing (HTS) studies communicated using a BCO, regulatory agencies (e.g., Food and Drug Administration [FDA]), diagnostic test developers, researchers, and clinicians can expand collaboration to drive innovation in precision medicine, potentially decreasing the time and cost associated with next-generation sequencing workflow exchange, reporting, and regulatory reviews.
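To make the domain list in the abstract above concrete, the following is a minimal sketch of a record organized around those named parts, plus a completeness check. It is an assumption-laden illustration: the snake_case keys, the field contents, and the helper `missing_domains` are hypothetical and are not the normative BioCompute or IEEE 2791-2020 schema, which should be consulted directly for real submissions.

```python
from typing import Dict, List

# Domains named in the abstract. The snake_case keys are illustrative
# assumptions, not the normative schema keys.
REQUIRED_DOMAINS = (
    "provenance_domain",
    "usability_domain",
    "execution_domain",
    "verification_kit",
    "error_domain",
)


def missing_domains(record: Dict[str, object]) -> List[str]:
    """List the required domains that are absent or empty in a record."""
    return [name for name in REQUIRED_DOMAINS if not record.get(name)]


# Hypothetical record; every value here is made up for illustration.
example_record = {
    "provenance_domain": {"name": "Example HTS analysis", "version": "1.0"},
    "usability_domain": ["Describe the intended use of the workflow."],
    "execution_domain": {"script": "run_pipeline.sh"},
    "verification_kit": {"expected_output": "results.vcf"},
    "error_domain": {"empirical_error": {}, "algorithmic_error": {"note": "none recorded"}},
}

print(missing_domains(example_record))  # prints []
```

A check of this kind only confirms that each part is present; it says nothing about whether the content inside each domain is valid.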

Subject(s)

Computational Biology/methods , Sequence Analysis, DNA/methods , Animals , Communication , Computational Biology/standards , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Precision Medicine/trends , Reproducibility of Results , Sequence Analysis, DNA/standards , Software , Workflow

7.

DUF1220 copy number is linearly associated with increased cognitive function as measured by total IQ and mathematical aptitude scores.

Davis, Jonathon M; Searles, Veronica B; Anderson, Nathan; Keeney, Jonathon; Raznahan, Armin; Horwood, L John; Fergusson, David M; Kennedy, Martin A; Giedd, Jay; Sikela, James M.

Hum Genet ; 134(1): 67-75, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25287832

ABSTRACT

DUF1220 protein domains exhibit the greatest human lineage-specific copy number expansion of any protein-coding sequence in the genome, and variation in DUF1220 copy number has been linked to both brain size in humans and brain evolution among primates. Given these findings, we examined associations between DUF1220 subtypes CON1 and CON2 and cognitive aptitude. We identified a linear association between CON2 copy number and cognitive function in two independent populations of European descent. In North American males, an increase in CON2 copy number corresponded with an increase in WISC IQ (R² = 0.13, p = 0.02), which may be driven by males aged 6-11 (R² = 0.42, p = 0.003). We utilized ddPCR in a subset as a confirmatory measurement. This group had 26-33 copies of CON2 with a mean of 29, and each copy increase of CON2 was associated with a 3.3-point increase in WISC IQ (R² = 0.22, p = 0.045). In individuals from New Zealand, an increase in CON2 copy number was associated with an increase in math aptitude ability (R² = 0.10, p = 0.018). These were not confounded by brain size. To our knowledge, this is the first study to report a replicated association between copy number of a gene coding sequence and cognitive aptitude. Remarkably, dosage variations involving DUF1220 sequences have now been linked to human brain expansion, autism severity and cognitive aptitude, suggesting that such processes may be genetically and mechanistically inter-related. The findings presented here warrant expanded investigations in larger, well-characterized cohorts.

Subject(s)

Aptitude/physiology , Brain/metabolism , Carrier Proteins/genetics , Chromosomes, Human, Pair 1/genetics , Cognition/physiology , DNA Copy Number Variations/genetics , Intelligence/physiology , Adolescent , Adult , Brain/pathology , Child , Comparative Genomic Hybridization/methods , Female , Follow-Up Studies , Humans , Male , Mathematics , Organ Size , Polymerase Chain Reaction/methods , Protein Structure, Tertiary , Young Adult

8.

The case for DUF1220 domain dosage as a primary contributor to anthropoid brain expansion.

Keeney, Jonathon G; Dumas, Laura; Sikela, James M.

Front Hum Neurosci ; 8: 427, 2014.

Article in English | MEDLINE | ID: mdl-25009482

ABSTRACT

Here we present the hypothesis that increasing copy number (dosage) of sequences encoding DUF1220 protein domains is a major contributor to the evolutionary increase in brain size, neuron number, and cognitive capacity that is associated with the primate order. We further propose that this relationship is restricted to the anthropoid sub-order of primates, with DUF1220 copy number markedly increasing in monkeys, further in apes, and most extremely in humans where the greatest number of copies (~272 haploid copies) is found. We show that this increase closely parallels the increase in brain size and neuron number that has occurred among anthropoid primate species. We also provide evidence linking DUF1220 copy number to brain size within the human species, both in normal populations and in individuals associated with brain size pathologies (1q21-associated microcephaly and macrocephaly). While we believe these and other findings presented here strongly suggest increase in DUF1220 copy number is a key contributor to anthropoid brain expansion, the data currently available rely largely on correlative measures that, though considerable, do not yet provide direct evidence for a causal connection. Nevertheless, we believe the evidence presented is sufficient to provide the basis for a testable model which proposes that DUF1220 protein domain dosage increase is a main contributor to the increase in brain size and neuron number found among the anthropoid primate species and that is at its most extreme in human.

9.

DUF1220 dosage is linearly associated with increasing severity of the three primary symptoms of autism.

Davis, Jonathan M; Searles, Veronica B; Anderson, Nathan; Keeney, Jonathon; Dumas, Laura; Sikela, James M.

PLoS Genet ; 10(3): e1004241, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24651471

ABSTRACT

One of the three most frequently documented copy number variations associated with autism spectrum disorder (ASD) is a 1q21.1 duplication that encompasses sequences encoding DUF1220 protein domains, the dosage of which we previously implicated in increased human brain size. Further, individuals with ASD frequently display accelerated brain growth and a larger brain size that is also associated with increased symptom severity. Given these findings, we investigated the relationship between DUF1220 copy number and ASD severity, and here show that in individuals with ASD (n = 170), the copy number (dosage) of DUF1220 subtype CON1 is highly variable, ranging from 56 to 88 copies following a Gaussian distribution. More remarkably, in individuals with ASD CON1 copy number is also linearly associated, in a dose-response manner, with increased severity of each of the three primary symptoms of ASD: social deficits (p = 0.021), communicative impairments (p = 0.030), and repetitive behaviors (p = 0.047). These data indicate that DUF1220 protein domain (CON1) dosage has an ASD-wide effect and, as such, is likely to be a key component of a major pathway underlying ASD severity. Finally, these findings, by implicating the dosage of a previously unexamined, copy number polymorphic and brain evolution-related gene coding sequence in ASD severity, provide an important new direction for further research into the genetic factors underlying ASD.

Subject(s)

Autistic Disorder/genetics , DNA Copy Number Variations/genetics , Gene Dosage , Adolescent , Adult , Autistic Disorder/pathology , Brain , Child , Child, Preschool , Chromosomes, Human, Pair 1/genetics , Female , Gene Duplication , Genetic Predisposition to Disease , Humans , Infant , Male , Protein Structure, Tertiary

10.

Mode of genetic inheritance modifies the association of head circumference and autism-related symptoms: a cross-sectional study.

Davis, Jonathan M; Keeney, Jonathon G; Sikela, James M; Hepburn, Susan.

PLoS One ; 8(9): e74940, 2013.

Article in English | MEDLINE | ID: mdl-24058641

ABSTRACT

BACKGROUND: Individuals with autism spectrum disorder (ASD) have frequently been noted to have a larger head circumference (HC) than their typically developing peers. Biologic hypotheses suggest that overly rapid brain growth leads to the core symptoms of ASD by impairing connectivity. The literature is divided, however: deleterious, protective and null associations of HC with ASD symptoms in individuals with ASD have been found. METHOD: Individuals (n = 1,416) with ASD from the Autism Genetic Resource Exchange were examined for associations of HC with ASD-like symptoms. Mixed models controlling for sex, age, race/ethnicity, simplex/multiplex status and accounting for correlations between siblings were used. Interactions by simplex/multiplex status were explored. Adjustments for height in a sub-population with available data were explored as well. RESULTS: A significant interaction term (p = 0.03) suggested that the effect of HC was dependent on whether the individual was simplex or multiplex. In simplex individuals at mean age (8.9 years), a 1 cm increase in head circumference was associated with a 24% increase in the odds of a high social diagnostic score from the Autism Diagnostic Interview-Revised (odds ratio = 1.24, p = 0.01). There was no association in multiplex individuals. Additionally, individuals classified with a non-verbal IQ <70 were 90% simplex and had a significantly increased head circumference (0.7 cm, p = 0.03) relative to a mid-range non-verbal IQ group. Interestingly, children classified with a >110 non-verbal IQ also had an increased HC (0.4 cm, p = 0.04), relative to a mid-range non-verbal IQ group, and were 90% multiplex. HC effects do not appear to be confounded by height; however, larger samples with height information are needed. CONCLUSION: The potential link between brain growth and autism-like symptoms is complex and could depend on specific etiologies. Further investigations accounting for a likely mode of inheritance will help identify an ASD subtype related to HC.

Subject(s)

Autistic Disorder/genetics , Autistic Disorder/pathology , Head/pathology , Models, Genetic , Adolescent , Adult , Child , Child, Preschool , Cross-Sectional Studies , Female , Humans , Infant , Male , Organ Size/genetics

11.

Impact of cigarette smoke exposure on innate immunity: a Caenorhabditis elegans model.

Green, Rebecca M; Gally, Fabienne; Keeney, Jonathon G; Alper, Scott; Gao, Bifeng; Han, Min; Martin, Richard J; Weinberger, Andrew R; Case, Stephanie R; Minor, Maisha N; Chu, Hong Wei.

PLoS One ; 4(8): e6860, 2009 Aug 31.

Article in English | MEDLINE | ID: mdl-19718433

ABSTRACT

BACKGROUND: Cigarette smoking is the major cause of chronic obstructive pulmonary disease (COPD) and lung cancer. Respiratory bacterial infections have been shown to be involved in the development of COPD along with impaired airway innate immunity. METHODOLOGY/PRINCIPAL FINDINGS: To address the in vivo impact of cigarette smoke (CS) exclusively on host innate defense mechanisms, we took advantage of Caenorhabditis elegans (C. elegans), which has an innate immune system but lacks adaptive immune function. Pseudomonas aeruginosa (PA) clearance from intestines of C. elegans was dampened by CS. Microarray analysis identified 6 candidate genes with a 2-fold or greater reduction after CS exposure, that have a human orthologue, and that may participate in innate immunity. To confirm a role of CS-down-regulated genes in the innate immune response to PA, RNA interference (RNAi) by feeding was carried out in C. elegans to inhibit the gene of interest, followed by PA infection to determine if the gene affected innate immunity. Inhibition of lbp-7, which encodes a lipid binding protein, resulted in increased levels of intestinal PA. Primary human bronchial epithelial cells were shown to express mRNA of human Fatty Acid Binding Protein 5 (FABP-5), the human orthologue of lbp-7. Interestingly, FABP-5 mRNA levels from human smokers with COPD were significantly lower (p = 0.036) than those from smokers without COPD. Furthermore, FABP-5 mRNA levels were up-regulated (7-fold) after bacterial (i.e., Mycoplasma pneumoniae) infection in primary human bronchial epithelial cell culture (air-liquid interface culture). CONCLUSIONS: Our results suggest that the C. elegans model offers a novel in vivo approach to specifically study innate immune deficiencies resulting from exposure to cigarette smoke, and that results from the nematode may provide insight into human airway epithelial cell biology and cigarette smoke exposure.

Subject(s)

Caenorhabditis elegans/immunology , Immunity, Innate , Models, Animal , Nicotiana , Smoke , Animals , Caenorhabditis elegans/microbiology , Cotinine/metabolism , Nicotine/metabolism , Pseudomonas aeruginosa/isolation & purification , RNA Interference
