Results 1 - 4 of 4
1.
PLoS Comput Biol ; 19(3): e1010963, 2023 03.
Article in English | MEDLINE | ID: mdl-36917581

ABSTRACT

Estimating feature importance, that is, the contribution of a feature to a prediction or to several predictions, is an essential aspect of explaining data-based models. Besides explaining the model itself, an equally relevant question is which features are important in the underlying data generating process. We present a Shapley-value-based framework for inferring the importance of individual features, including uncertainty in the estimator. We build upon the recently published model-agnostic feature importance score of SAGE (Shapley additive global importance) and introduce Sub-SAGE. For tree-based models, it has the advantage that it can be estimated without computationally expensive resampling. We argue that for all model types the uncertainties in our Sub-SAGE estimator can be estimated using bootstrapping and demonstrate the approach for tree ensemble methods. The framework is exemplified on synthetic data as well as large genotype data for predicting feature importance with respect to obesity.


Subject(s)
Genotyping Techniques, Uncertainty
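
The abstract above builds on Shapley values. As a rough illustration of that underlying idea only (this is not SAGE or Sub-SAGE, and not the authors' code), the sketch below computes exact Shapley values for a small set function by enumerating all feature subsets. The function name `shapley_values` and the value function `toy_value` are hypothetical, invented for this example; `toy_value` stands in for "how much the loss improves when this subset of features is known".

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function `value` over feature indices.

    Enumerates every subset, so it is only practical for a handful of features.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(subset):
    """Hypothetical value function (assumption, for illustration only)."""
    main_effects = {0: 1.0, 1: 0.5, 2: 0.0}
    gain = sum(main_effects[j] for j in subset)
    if 0 in subset and 1 in subset:
        gain += 0.4  # extra gain only when features 0 and 1 are both known
    return gain


phi = shapley_values(3, toy_value)
```

In this toy case the interaction gain is split equally between features 0 and 1, the irrelevant feature 2 receives zero, and the values sum to the gain of the full feature set.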
2.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9323-9336, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35316196

ABSTRACT

Low complexity of a system model is essential for its use in real-time applications. However, sparse identification methods commonly have stringent requirements that exclude them from being applied in an industrial setting. In this article, we introduce a flexible method for the sparse identification of dynamical systems described by ordinary differential equations. Our method relieves many of the requirements imposed by other methods that relate to the structure of the model and the dataset, such as fixed sampling rates, full state measurements, and linearity of the model. The Levenberg-Marquardt algorithm is used to solve the identification problem. We show that the Levenberg-Marquardt algorithm can be written in a form that enables parallel computing, which greatly diminishes the time required to solve the identification problem. An efficient backward elimination strategy is presented to construct a lean system model.
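
For readers unfamiliar with the Levenberg-Marquardt algorithm named in the abstract above, here is a minimal textbook-style sketch. It is not the authors' sparse identification method and has no parallelism or backward elimination. It fits the two parameters of y = a·exp(-k·t), the closed-form solution of dy/dt = -k·y, to noise-free samples taken at non-uniform times. The function name, the damping schedule (halve on success, double on failure), and the data are all assumptions made for this example.

```python
import math


def levenberg_marquardt(ts, ys, a, k, lam=1e-2, iters=100):
    """Fit y = a * exp(-k * t) with damped Gauss-Newton (Levenberg-Marquardt) steps."""

    def sse(a_, k_):
        # Sum of squared residuals for candidate parameters.
        return sum((y - a_ * math.exp(-k_ * t)) ** 2 for t, y in zip(ts, ys))

    cost = sse(a, k)
    for _ in range(iters):
        # Accumulate the entries of J^T J and J^T r for parameters (a, k).
        haa = hak = hkk = ga = gk = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-k * t)
            r = y - a * e            # residual
            ja, jk = e, -a * t * e   # d(model)/da, d(model)/dk
            haa += ja * ja
            hak += ja * jk
            hkk += jk * jk
            ga += ja * r
            gk += jk * r
        # Solve (J^T J + lam * diag(J^T J)) delta = J^T r for the 2x2 case.
        m11, m22 = haa * (1.0 + lam), hkk * (1.0 + lam)
        det = m11 * m22 - hak * hak
        if det == 0.0:
            break
        da = (m22 * ga - hak * gk) / det
        dk = (m11 * gk - hak * ga) / det
        new_cost = sse(a + da, k + dk)
        if new_cost < cost:
            # Step improved the fit: accept it and reduce the damping.
            a, k, cost = a + da, k + dk, new_cost
            lam *= 0.5
        else:
            # Step made the fit worse: reject it and increase the damping.
            lam *= 2.0
    return a, k


# Non-uniformly sampled, noise-free data generated with a = 2.0, k = 0.8.
ts = [0.0, 0.3, 0.7, 1.5, 2.2, 4.0]
ys = [2.0 * math.exp(-0.8 * t) for t in ts]
a_hat, k_hat = levenberg_marquardt(ts, ys, a=1.0, k=0.3)
```

The damping term `lam` interpolates between a Gauss-Newton step (small `lam`) and a short, scaled gradient step (large `lam`), which is what makes the iteration robust to a poor starting guess.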

3.
BMC Bioinformatics ; 22(1): 230, 2021 May 04.
Article in English | MEDLINE | ID: mdl-33947323

ABSTRACT

BACKGROUND: The identification of gene-gene and gene-environment interactions in genome-wide association studies is challenging due to the unknown nature of the interactions and the overwhelmingly large number of possible combinations. Parametric regression models are suitable to look for prespecified interactions. Nonparametric models such as tree ensemble models, with the ability to detect any unspecified interaction, have previously been difficult to interpret. However, with the development of methods for model explainability, it is now possible to interpret tree ensemble models efficiently and with a strong theoretical basis.

RESULTS: We propose a tree ensemble- and SHAP-based method for identifying as well as interpreting potential gene-gene and gene-environment interactions on large-scale biobank data. A set of independent cross-validation runs is used to implicitly investigate the whole genome. We apply and evaluate the method using data from the UK Biobank with obesity as the phenotype. The results are in line with previous research on obesity as we identify top SNPs previously associated with obesity. We further demonstrate how to interpret and visualize interaction candidates.

CONCLUSIONS: The new method identifies interaction candidates otherwise not detected with parametric regression models. However, further research is needed to evaluate the uncertainties of these candidates. The method can be applied to large-scale biobanks with high-dimensional data.


Subject(s)
Gene-Environment Interaction, Genome-Wide Association Study, Algorithms, Polymorphism, Single Nucleotide, Trees
4.
Phys Rev Lett ; 99(13): 131301, 2007 Sep 28.
Article in English | MEDLINE | ID: mdl-17930573

ABSTRACT

We constrain the lifetime of radiatively decaying dark matter in clusters of galaxies inspired by generic Kaluza-Klein axions, which have been invoked as a possible explanation for the solar coronal x-ray emission. These particles can be produced inside stars and remain confined by the gravitational potential of clusters. By analyzing x-ray observations of merging clusters, where gravitational lensing observations have identified massive, baryon poor structures, we derive the first cosmological lifetime constraint on this kind of particle of τ ≥ 10^23 s.
