1.
Nat Commun; 14(1): 1043, 2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36823107

ABSTRACT

Given a finite and noisy dataset generated with a closed-form mathematical model, when is it possible to learn the true generating model from the data alone? This is the question we investigate here. We show that this model-learning problem displays a transition from a low-noise phase in which the true model can be learned, to a phase in which the observation noise is too high for the true model to be learned by any method. Both in the low-noise phase and in the high-noise phase, probabilistic model selection leads to optimal generalization to unseen data. This is in contrast to standard machine learning approaches, including artificial neural networks, which in this particular problem are limited, in the low-noise phase, by their ability to interpolate. In the transition region between the learnable and unlearnable phases, generalization is hard for all approaches including probabilistic model selection.
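
A minimal sketch of the idea, not the paper's actual method: fit the true generating model and a rival to increasingly noisy data, using the Bayesian information criterion as a crude stand-in for the marginal posterior over models. The models, parameters, and noise levels below are hypothetical.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

def true_model(x, a, b):                  # hypothetical generating model
    return a * np.sin(x) + b * x

def rival_model(x, a, b):                 # rival that can absorb the trend
    return a * x + b

def bic(model, x, y, k):                  # BIC ~ -2 log(marginal likelihood)
    popt, _ = curve_fit(model, x, y, p0=np.ones(k))
    rss = np.sum((y - model(x, *popt)) ** 2)
    n = len(x)
    return n * np.log(rss / n) + k * np.log(n)

x = np.linspace(0.0, 6.0, 50)
for sigma in (0.05, 0.5, 5.0):            # increasing observation noise
    y = true_model(x, 1.0, 0.3) + sigma * rng.standard_normal(x.size)
    winner = "true" if bic(true_model, x, y, 2) < bic(rival_model, x, y, 2) else "rival"
    print(f"sigma={sigma}: {winner} model selected")

At low noise the true model is reliably selected; as the noise grows, the score gap closes and the rival can win, mirroring the transition from a learnable to an unlearnable phase described above.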

2.
Phys Rev Lett; 124(8): 084503, 2020 Feb 28.
Article in English | MEDLINE | ID: mdl-32167370

ABSTRACT

Ever since Nikuradse's experiments on turbulent friction in 1933, there have been theoretical attempts to describe his measurements by collapsing the data into single-variable functions. However, this approach, which is common in other areas of physics and in other fields, is limited by the lack of rigorous quantitative methods to compare alternative data collapses. Here, we address this limitation by using an unsupervised method to find analytic functions that optimally describe each of the data collapses for the Nikuradse dataset. By descaling these analytic functions, we show that a low dispersion of the scaled data does not guarantee that a data collapse is a good description of the original data. In fact, we find that, out of all the proposed data collapses, the original one proposed by Prandtl and Nikuradse over 80 years ago provides the best description of the data so far, and that it also agrees well with recent experimental data, provided that some model parameters are allowed to vary across experiments.
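
A toy illustration of what a data collapse means (this is not the Nikuradse analysis; the scaling law and master curve below are invented): curves of the form f(x; k) = sqrt(k) g(x/k) coincide when replotted as f/sqrt(k) against x/k, and the dispersion of the scaled curves is the usual, though as the abstract notes insufficient, measure of collapse quality.

import numpy as np

rng = np.random.default_rng(0)

def g(u):                                  # hypothetical master curve
    return np.sqrt(u) * np.exp(-u)

u = np.linspace(0.1, 10.0, 200)            # common scaled abscissa x / k
scaled = []
for k in (0.5, 1.0, 2.0, 4.0):             # family parameter, e.g. roughness
    x = u * k                              # raw abscissa for this curve
    f = np.sqrt(k) * g(x / k)              # raw curve obeying the scaling law
    f *= 1.0 + 0.02 * rng.standard_normal(u.size)  # measurement noise
    scaled.append(f / np.sqrt(k))          # rescale onto the master curve

# Dispersion of the scaled curves: small if the collapse is tight. Per the
# abstract, this alone does not validate the collapse; one must descale the
# fitted analytic function and compare against the data in raw coordinates.
spread = np.std(np.vstack(scaled), axis=0).max()
print(f"max dispersion across scaled curves: {spread:.3f}")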

3.
Sci Adv; 6(5): eaav6971, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32064326

ABSTRACT

Closed-form, interpretable mathematical models have been instrumental for advancing our understanding of the world; with the data revolution, we may now be in a position to uncover new such models for many systems from physics to the social sciences. However, to deal with increasing amounts of data, we need "machine scientists" that are able to extract these models automatically from data. Here, we introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions. It explores the space of models using Markov chain Monte Carlo. We show that this approach uncovers accurate models for synthetic and real data and provides out-of-sample predictions that are more accurate than those of existing approaches and of other nonparametric methods.
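
A minimal sketch of the general strategy, not the paper's machine scientist: explore a space of symbolic expression trees with Metropolis Monte Carlo, scoring each model by a rough description length (fit quality plus a per-node complexity penalty). The operator set, penalty weight, and proposal move are simplified stand-ins.

import math, random
import numpy as np

rng = random.Random(1)
OPS = {"+": 2, "*": 2, "sin": 1}           # toy operator set with arities

def random_tree(depth=0):                  # sample a small expression tree
    if depth > 2 or rng.random() < 0.3:
        return rng.choice(["x", round(rng.uniform(-2, 2), 2)])
    op = rng.choice(list(OPS))
    return [op] + [random_tree(depth + 1) for _ in range(OPS[op])]

def evaluate(tree, x):
    if not isinstance(tree, list):
        return x if tree == "x" else np.full_like(x, float(tree))
    args = [evaluate(c, x) for c in tree[1:]]
    if tree[0] == "+":
        return args[0] + args[1]
    if tree[0] == "*":
        return args[0] * args[1]
    return np.sin(args[0])

def size(tree):
    return 1 if not isinstance(tree, list) else 1 + sum(size(c) for c in tree[1:])

def score(tree, x, y):                     # crude description length of a model
    rss = float(np.sum((y - evaluate(tree, x)) ** 2))
    return len(x) * math.log(rss / len(x) + 1e-12) + 3.0 * size(tree)

x = np.linspace(0.0, 3.0, 40)
y = 1.5 * np.sin(x) + 0.05 * np.random.default_rng(0).standard_normal(40)

current = random_tree()
cur_s = score(current, x, y)
for _ in range(5000):                      # Metropolis walk over models
    proposal = random_tree()               # naive symmetric global proposal
    prop_s = score(proposal, x, y)
    if prop_s < cur_s or rng.random() < math.exp(cur_s - prop_s):
        current, cur_s = proposal, prop_s
print("sampled model:", current, "score:", round(cur_s, 1))

In the paper's formulation, the preference for simple models comes from a prior learned over a large empirical corpus of mathematical expressions rather than from a fixed per-node penalty as used here.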
