1.
PLoS One ; 10(8): e0135832, 2015.
Article in English | MEDLINE | ID: mdl-26285210

ABSTRACT

Various attempts have been made to predict individual disease risk based on genotype data from genome-wide association studies (GWAS). However, most studies investigated only one or two classification algorithms and feature encoding schemes. In this study, we applied seven different classification algorithms to GWAS case-control data sets for seven different diseases to create models for disease risk prediction. Further, we used three different encoding schemes for the genotypes of single nucleotide polymorphisms (SNPs) and investigated their influence on the predictive performance of these models. Our study suggests that an additive encoding of the SNP data should be the preferred encoding scheme, as it yielded the best predictive performance for all algorithms and data sets. Furthermore, our results showed that the differences between most state-of-the-art classification algorithms are not statistically significant. Consequently, we recommend preferring algorithms with simple models, such as the linear support vector machine (SVM), as they allow for better subsequent interpretation without significant loss of accuracy.
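
As a rough illustration of the additive encoding recommended in the abstract above, the sketch below maps each biallelic genotype to its count of minor-allele copies (0, 1 or 2). The function names, the two-character genotype strings and the toy data are assumptions made for this example; the abstract does not describe the authors' implementation or name the other two encoding schemes.

```python
def additive_encoding(genotype, minor_allele):
    """Encode a biallelic genotype as the number of minor-allele copies (0, 1 or 2)."""
    if len(genotype) != 2:
        raise ValueError("expected a two-character genotype such as 'AG'")
    return sum(1 for allele in genotype if allele == minor_allele)


def encode_samples(genotypes, minor_alleles):
    """Encode one row per sample; column j is scored against minor_alleles[j]."""
    return [
        [additive_encoding(g, m) for g, m in zip(row, minor_alleles)]
        for row in genotypes
    ]


# Hypothetical toy data: 3 samples genotyped at 2 SNPs.
samples = [["AA", "CT"], ["AG", "TT"], ["GG", "CC"]]
minor = ["G", "T"]
encoded = encode_samples(samples, minor)
print(encoded)
```

The resulting integer matrix could then be passed to any classifier that accepts numeric features, for example a linear SVM.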


Subject(s)
Computational Biology/methods , Disease/genetics , Genome-Wide Association Study , Algorithms , Genotype , Humans , Polymorphism, Single Nucleotide , Risk Assessment , Statistics, Nonparametric , Support Vector Machine
2.
Bioinformatics ; 31(20): 3383-6, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26079347

ABSTRACT

JSBML, the official pure Java programming library for the Systems Biology Markup Language (SBML) format, has evolved with the advent of different modeling formalisms in systems biology and their ability to be exchanged and represented via extensions of SBML. JSBML has matured into a major, active open-source project with contributions from a growing, international team of developers who not only maintain compatibility with SBML, but also drive steady improvements to the Java interface and promote ease-of-use with end users. AVAILABILITY AND IMPLEMENTATION: Source code, binaries and documentation for JSBML can be freely obtained under the terms of the LGPL 2.1 from the website http://sbml.org/Software/JSBML. More information about JSBML can be found in the user guide at http://sbml.org/Software/JSBML/docs/. CONTACT: jsbml-development@googlegroups.com or andraeger@eng.ucsd.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Models, Biological , Software , Systems Biology , Computer Simulation , Programming Languages
3.
PLoS One ; 8(10): e78577, 2013.
Article in English | MEDLINE | ID: mdl-24205270

ABSTRACT

Genome-wide association studies (GWAS) have led to the identification of numerous novel loci for a number of complex diseases. Pathway-based approaches using genotypic data provide tangible leads which cannot be identified by single-marker approaches as implemented in GWAS. The available pathway analysis approaches mainly differ in the databases employed and in the statistics applied to determine the significance of the associated disease markers. So far, pathway-based approaches using GWAS data have failed to consider the overlap of genes among different pathways or the influence of protein interactions. We performed a multistage integrative pathway (MIP) analysis on three common diseases, namely Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D), incorporating genotypic, pathway, protein- and domain-interaction data to identify novel associations between these diseases and pathways. Additionally, we assessed the sensitivity of our method by removing the most significant SNPs and comparing the resulting pathway analysis results with the original ones. Apart from confirming many previously published associations between pathways and RA, CD and T1D, our MIP approach was able to identify three new associations between disease phenotypes and pathways. These include a relation between the influenza-A pathway and RA, as well as a relation between T1D and the phagosome and toxoplasmosis pathways. These results provide new leads to understand the molecular underpinnings of these diseases. The software developed and used in this study is available at http://www.cogsys.cs.uni-tuebingen.de/software/GWASPathwayIdentifier/index.htm.
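
The gene overlap among pathways mentioned in the abstract above can be made concrete with a small sketch. It computes the Jaccard index for every pair of pathway gene sets; the choice of the Jaccard index, the function names and the placeholder gene and pathway identifiers are assumptions for illustration only and are not taken from the MIP method.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pathway_overlaps(pathways):
    """Return the Jaccard index for every unordered pair of pathways."""
    return {
        (p, q): jaccard(pathways[p], pathways[q])
        for p, q in combinations(sorted(pathways), 2)
    }


# Hypothetical pathway-to-gene mapping with placeholder identifiers.
pathways = {
    "pathway_A": {"GENE1", "GENE2", "GENE3"},
    "pathway_B": {"GENE2", "GENE3", "GENE4"},
    "pathway_C": {"GENE5"},
}
overlaps = pathway_overlaps(pathways)
for pair, score in overlaps.items():
    print(pair, score)
```

A high score for a pair suggests that an association signal in one pathway may be driven by genes it shares with the other.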


Subject(s)
Arthritis, Rheumatoid/genetics , Computational Biology/methods , Diabetes Mellitus, Type 1/genetics , Genome-Wide Association Study/methods , Crohn Disease/genetics , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Software
4.
BMC Syst Biol ; 7: 116, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-24180668

ABSTRACT

BACKGROUND: Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data. RESULTS: To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps. CONCLUSIONS: To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized.


Subject(s)
Computer Simulation , Systems Biology/methods , Genomics , Humans , Kinetics , Metabolic Networks and Pathways , Software
5.
Hum Mol Genet ; 22(5): 1039-49, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23223016

ABSTRACT

Parkinson's disease (PD) is the second most common neurodegenerative disease, affecting 1-2% of people aged >60 and 3-4% of people aged >80. Genome-wide association (GWA) studies have now provided significant evidence for association in at least 18 genomic regions. We studied a large PD meta-analysis and identified a significant excess of SNPs (P < 1 × 10^-16) that are associated with PD but fall short of the genome-wide significance threshold. This result was independent of variants at the 18 previously implicated regions and implies the presence of additional polygenic risk alleles. To understand how these loci increase risk of PD, we applied a pathway-based analysis, testing for biological functions that were significantly enriched for genes containing variants associated with PD. Analysing two independent GWA studies, we found that both had a significant excess in the number of functional categories enriched for PD-associated genes (minimum P = 0.014 and P = 0.006, respectively). Moreover, 58 categories were significantly enriched for associated genes in both GWA studies (P < 0.001), implicating genes involved in the 'regulation of leucocyte/lymphocyte activity' and also 'cytokine-mediated signalling' as conferring an increased susceptibility to PD. These results were unaltered by the exclusion of all 178 genes that were present at the 18 genomic regions previously reported to be strongly associated with PD (including the HLA locus). Our findings, therefore, provide independent support for the strong association signal at the HLA locus and imply that the immune-related genetic susceptibility to PD is likely to be more widespread in the genome than previously appreciated.
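
The abstract above does not name the statistic used to test functional categories for enrichment. One common choice is a one-sided hypergeometric test, sketched below under that assumption; the function name, parameter names and toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, category_genes, associated_genes, overlap):
    """One-sided P(X >= overlap) when drawing associated_genes from total_genes,
    of which category_genes belong to the functional category."""
    upper = min(category_genes, associated_genes)
    denominator = comb(total_genes, associated_genes)
    tail = sum(
        comb(category_genes, k)
        * comb(total_genes - category_genes, associated_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator


# Hypothetical toy counts: 10 genes in total, 5 in the category,
# 5 associated with the trait, all 5 of them inside the category.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
print(p_value)
```

In practice the resulting P values would need correction for the number of categories tested.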


Subject(s)
HLA Antigens/genetics , Metabolic Networks and Pathways , Parkinson Disease/genetics , Parkinson Disease/immunology , Alleles , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Parkinson Disease/metabolism , Polymorphism, Single Nucleotide , Risk
6.
Hum Mol Genet ; 21(22): 4996-5009, 2012 Nov 15.
Article in English | MEDLINE | ID: mdl-22892372

ABSTRACT

Genome-wide association studies (GWASs) have been successful at identifying single-nucleotide polymorphisms (SNPs) highly associated with common traits; however, a great deal of the heritable variation associated with common traits remains unaccounted for within the genome. Genome-wide complex trait analysis (GCTA) is a statistical method that applies a linear mixed model to estimate phenotypic variance of complex traits explained by genome-wide SNPs, including those not associated with the trait in a GWAS. We applied GCTA to 8 cohorts containing 7096 case and 19 455 control individuals of European ancestry in order to examine the missing heritability present in Parkinson's disease (PD). We meta-analyzed our initial results to produce robust heritability estimates for PD types across cohorts. Our results identify 27% (95% CI 17-38, P = 8.08E-08) phenotypic variance associated with all types of PD, 15% (95% CI -0.2 to 33, P = 0.09) phenotypic variance associated with early-onset PD and 31% (95% CI 17-44, P = 1.34E-05) phenotypic variance associated with late-onset PD. This is a substantial increase from the genetic variance identified by top GWAS hits alone (between 3 and 5%) and indicates there are substantially more risk loci to be identified. Our results suggest that although GWASs are a useful tool in identifying the most common variants associated with complex disease, a great many common variants of small effect remain to be discovered.
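
The abstract above states that cohort-level results were meta-analyzed but does not say how. A fixed-effect, inverse-variance weighted pooling is one standard option and is sketched below purely as an assumption; the cohort estimates and standard errors are made-up values, not figures from the study.

```python
from math import sqrt


def fixed_effect_meta(estimates, std_errors):
    """Pool estimates with inverse-variance weights; return (pooled, pooled_se)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return pooled, sqrt(1.0 / total_weight)


def ci95(estimate, std_error):
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * std_error, estimate + 1.96 * std_error


# Hypothetical heritability estimates from two cohorts (not from the paper).
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(pooled, pooled_se, ci95(pooled, pooled_se))
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate is dominated by the most precise cohorts.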


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Parkinson Disease/genetics , Quantitative Trait, Heritable , Adult , Aged , Aged, 80 and over , Female , Genetic Predisposition to Disease , Genetic Variation , Humans , Male , Middle Aged , White People/genetics
7.
Bioinformatics ; 28(20): 2648-53, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22923304

ABSTRACT

MOTIVATION: The biological pathway exchange language (BioPAX) and the systems biology markup language (SBML) are among the most popular modeling and data exchange languages in systems biology. The focus of SBML is quantitative modeling and dynamic simulation of models, whereas the BioPAX specification concentrates mainly on visualization and qualitative analysis of pathway maps. BioPAX describes reactions and relations. In contrast, SBML core exclusively describes quantitative processes such as reactions. With the SBML qualitative models extension (qual), it has recently also become possible to describe relations in SBML. Before the development of SBML qual, relations could not be properly translated into SBML. Until now, no BioPAX-to-SBML converter has been fully capable of translating both reactions and relations. RESULTS: The entire Nature Pathway Interaction Database (PID) has been converted from BioPAX (Level 2 and Level 3) into SBML (Level 3 Version 1), including both reactions and relations, by using the new qual extension package. Additionally, we present the new web tool BioPAX2SBML for further BioPAX-to-SBML conversions. Compared with previous conversion tools, BioPAX2SBML is more comprehensive, more robust and more exact. AVAILABILITY: BioPAX2SBML is freely available at http://webservices.cs.uni-tuebingen.de/ and the complete collection of the PID models is available at http://www.cogsys.cs.uni-tuebingen.de/downloads/Qualitative-Models/.


Subject(s)
Programming Languages , Software , Systems Biology , Databases, Factual , Humans , Internet
8.
Hum Mutat ; 33(12): 1708-18, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22777693

ABSTRACT

The success of genome-wide association studies (GWAS) in deciphering the genetic architecture of complex diseases has fueled expectations that individual risk can also be quantified based on the genetic architecture. So far, disease risk prediction based on top-validated single-nucleotide polymorphisms (SNPs) has shown little predictive value. Here, we applied a support vector machine (SVM) to Parkinson disease (PD) and type 1 diabetes (T1D) to show that, apart from the magnitude of the effect size of risk variants, the heritability of the disease also plays an important role in disease risk prediction. Furthermore, we performed a simulation study to show the role of uncommon (frequency 1-5%) as well as rare variants (frequency <1%) in the etiology of complex diseases. Using a cross-validation model, we were able to achieve predictions with an area under the receiver operating characteristic curve (AUC) of ~0.88 for T1D, highlighting the strong heritable component (~90%). This is in contrast to PD, where we were unable to achieve a satisfactory prediction (AUC ~0.56; heritability ~38%). Our simulations showed that simultaneous inclusion of uncommon and rare variants in GWAS would eventually lead to feasible disease risk prediction for complex diseases such as PD. The software used is available at http://www.ra.cs.uni-tuebingen.de/software/MACLEAPS/.
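
The AUC values reported in the abstract above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The sketch below computes the AUC directly from that definition, counting ties as one half; the function name and the toy labels and scores are assumptions for illustration and do not reproduce the MACLEAPS software.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical classifier scores: 1 = case, 0 = control.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking, which puts the reported ~0.56 for PD only slightly above chance.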


Subject(s)
Computer Simulation , Genome-Wide Association Study/methods , Models, Genetic , Support Vector Machine , Area Under Curve , Bipolar Disorder/diagnosis , Bipolar Disorder/genetics , Case-Control Studies , Diabetes Mellitus, Type 1/diagnosis , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Humans , Parkinson Disease/diagnosis , Parkinson Disease/genetics , Polymorphism, Single Nucleotide , ROC Curve , Risk , Software
9.
PLoS One ; 7(4): e35327, 2012.
Article in English | MEDLINE | ID: mdl-22558141

ABSTRACT

DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. To date, the process by which DNA methyltransferases determine which regions should be methylated is not fully understood. There are many hypotheses about which genomic features are correlated with the epigenome that have not yet been evaluated. Furthermore, many explorative approaches to measuring DNA methylation are limited to a subset of the genome and thus cannot be employed, for example, for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features with DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with an MCC of 0.8 using ten-fold cross-validation and ten repetitions. With these models, we extended the existing dataset to the whole genome and thus predicted the methylation landscape for the given cell types. The method used for these predictions is also validated on another external whole-genome dataset. Our results reveal features correlated with DNA methylation and confirm or disprove various hypotheses about DNA methylation-related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web service to compare methylation states of CpG islands, are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/.
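
To show how the accuracy and the Matthews correlation coefficient (MCC) quoted in the abstract above relate to a confusion matrix, the sketch below computes both from true/false positive and negative counts. The counts are hypothetical and chosen only to give round numbers; they are not taken from the study.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when a marginal is empty."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical balanced confusion matrix (methylated vs. unmethylated islands).
tp, tn, fp, fn = 45, 45, 5, 5
print(accuracy(tp, tn, fp, fn), mcc(tp, tn, fp, fn))
```

For a balanced matrix such as this one, 90% accuracy corresponds to an MCC of 0.8, which is in the same range as the values reported in the abstract.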


Subject(s)
Algorithms , CpG Islands/genetics , DNA Methylation/genetics , Gene Expression Regulation/genetics , Models, Genetic , Promoter Regions, Genetic/genetics , Software , Artificial Intelligence , Cell Line , DNA/chemistry , DNA/genetics , Histones/metabolism , Humans , Internet