1.
Article in English | MEDLINE | ID: mdl-38648123

ABSTRACT

Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely-wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each NN sampled from the BNN posterior does not have vanishing gradients. The experimental results on the MNIST, Fashion MNIST, and a synthetic dataset with BNNs trained with Hamiltonian Monte Carlo and variational inference support this line of argument, empirically showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks.

2.
BMC Med Res Methodol ; 23(1): 169, 2023 07 22.
Article in English | MEDLINE | ID: mdl-37481514

ABSTRACT

BACKGROUND: Machine learning (ML) methods to build prediction models starting from electrocardiogram (ECG) signals are an emerging research field. The aim of the present study is to investigate the performance of two ML approaches based on ECGs for the prediction of new-onset atrial fibrillation (AF), in terms of discrimination, calibration and sample size dependence. METHODS: We trained two models to predict new-onset AF: a convolutional neural network (CNN) that takes the raw ECG signals as input, and an eXtreme Gradient Boosting model (XGB) that uses features extracted from the signal. A penalized logistic regression model (LR) was used as a benchmark. Discrimination was evaluated with the area under the ROC curve (AUC), and calibration with the integrated calibration index. We investigated the dependence of the models' performance on the sample size and on class imbalance corrections introduced with random undersampling. RESULTS: CNN's discrimination was the most affected by the sample size, outperforming XGB and LR only around n = 10,000 observations. Calibration showed only a small dependence on the sample size for all the models considered. Balancing the training set with random undersampling did not improve discrimination in any of the models. Instead, the main effect of imbalance corrections was to worsen the models' calibration (for CNN, integrated calibration index from 0.014 [0.01, 0.018] to 0.17 [0.16, 0.19]). The sample size emerged as a fundamental point for developing the CNN model, especially in terms of discrimination (AUC = 0.75 [0.73, 0.77] when n = 10,000, AUC = 0.80 [0.79, 0.81] when n = 150,000). The effect of the sample size on the other two models was weaker. Imbalance corrections led to poorly calibrated models, for all the approaches considered, reducing the clinical utility of the models. CONCLUSIONS: Our results suggest that the choice of approach in the analysis of ECG should be based on the amount of data available, preferring more standard models for small datasets. Moreover, imbalance correction methods should be avoided when developing clinical prediction models, where calibration is crucial.
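
One way to see why random undersampling can hurt calibration is the standard prior-shift argument, sketched below in Python. This is not taken from the study: the function name and the `keep_fraction` parameter are hypothetical, and the sketch assumes that only the majority (negative) class is subsampled and that the model would otherwise be perfectly calibrated on whatever data it is trained on.

```python
def undersampled_probability(p, keep_fraction):
    """Map a calibrated risk p to the risk implied by undersampled training data.

    Assumption (not from the source): every positive case is kept and each
    negative case is kept independently with probability keep_fraction.
    Under that scheme the odds of the positive class are multiplied by
    1 / keep_fraction, which inflates the predicted probabilities.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return p / (p + keep_fraction * (1.0 - p))


# A true risk of 10% looks like roughly 53% after keeping 1 negative in 10.
inflated = undersampled_probability(0.10, keep_fraction=0.10)
```

Because the transformation is monotone, it leaves the ranking of patients (and hence the AUC) unchanged while shifting the predicted risks away from the observed event rate, which is consistent in spirit with the pattern the abstract reports.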


Subject(s)
Atrial Fibrillation , Humans , Atrial Fibrillation/diagnosis , Calibration , Electrocardiography , Benchmarking , Machine Learning
3.
Philos Trans R Soc Lond B Biol Sci ; 376(1824): 20200197, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33745316

ABSTRACT

Can language relatedness be established without cognate words? This question has remained unresolved since the nineteenth century, leaving language prehistory beyond etymologically established families largely undefined. We address this problem through a theory of universal syntactic characters. We show that not only does syntax allow for comparison across distinct traditional language families, but that the probability of deeper historical relatedness between such families can be statistically tested through a dedicated algorithm which implements the concept of 'possible languages' suggested by a formal syntactic theory. Controversial clusters such as Altaic and Uralo-Altaic are significantly supported by our test, while other possible macro-groupings, e.g. Indo-Uralic or Basque-(Northeast) Caucasian, prove to be indistinguishable from a randomly generated distribution of language distances. These results suggest that syntactic diversity, modelled through a generative biolinguistic framework, can be used to provide a proof of historical relationship between different families irrespective of the presence of a common lexicon from which regular sound correspondences can be determined; therefore, we argue that syntax may expand the time limits imposed by the classical comparative method. This article is part of the theme issue 'Reconstructing prehistoric languages'.


Subject(s)
Cultural Evolution , Language , Speech , Humans , Linguistics
4.
PLoS One ; 15(10): e0241394, 2020.
Article in English | MEDLINE | ID: mdl-33125408

ABSTRACT

We study continuous-time multi-agent models, where agents interact according to a network topology. At any point in time, each agent occupies a specific local node state. Agents change their state at random through interactions with neighboring agents. The time until a transition happens can follow an arbitrary probability density. Stochastic (Monte-Carlo) simulations are often the preferred, and sometimes the only feasible, approach to study the complex emerging dynamical patterns of such systems. However, each simulation run comes with high computational costs, mostly due to updating the instantaneous rates of interconnected agents after each transition. This work proposes a stochastic rejection-based, event-driven simulation algorithm that scales extremely well with the size and connectivity of the underlying contact network and produces statistically correct samples. We demonstrate the effectiveness of our method on different information spreading models.
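
The idea of avoiding rate updates through rejection can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: it covers only the Markovian (exponential) special case of an SIS spreading model, and the function name, parameters and data layout are assumptions made for illustration. Each infected node fires infection attempts at an upper-bound rate that ignores the states of its neighbors; an attempt aimed at an already infected neighbor is simply rejected, so no neighbor rate ever has to be recomputed.

```python
import heapq
import random


def simulate_sis(neighbors, infected0, beta, mu, t_max, seed=0):
    """Rejection-based, event-driven SIS simulation (exponential clocks only).

    neighbors: dict mapping each node to a list of adjacent nodes.
    beta: per-edge infection rate; mu: recovery rate (both must be > 0).
    Returns a list of (time, number_of_infected_nodes) pairs.
    """
    rng = random.Random(seed)
    infected = set(infected0)
    epoch = {node: 0 for node in neighbors}  # bumped on every state change
    queue = []  # heap of (time, kind, node, epoch when scheduled)

    def schedule_attempt(node, now):
        degree = len(neighbors[node])
        if degree:
            # Upper-bound rate: assumes every neighbor might be susceptible.
            delay = rng.expovariate(beta * degree)
            heapq.heappush(queue, (now + delay, "attempt", node, epoch[node]))

    def schedule_infected(node, now):
        delay = rng.expovariate(mu)
        heapq.heappush(queue, (now + delay, "recover", node, epoch[node]))
        schedule_attempt(node, now)

    for node in sorted(infected):
        schedule_infected(node, 0.0)

    history = [(0.0, len(infected))]
    while queue:
        t, kind, node, stamp = heapq.heappop(queue)
        if t > t_max:
            break
        if stamp != epoch[node]:
            continue  # stale event: the node changed state after scheduling
        if kind == "recover":
            infected.discard(node)
            epoch[node] += 1
        else:
            target = rng.choice(neighbors[node])
            if target not in infected:  # accept the attempt
                infected.add(target)
                epoch[target] += 1
                schedule_infected(target, t)
            # otherwise reject: nothing changes and nothing is updated
            schedule_attempt(node, t)
        history.append((t, len(infected)))
    return history


# Usage on a ring of 10 nodes with a single initial infection.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
history = simulate_sis(ring, [0], beta=1.0, mu=0.5, t_max=5.0, seed=1)
```

The sketch trades a few wasted (rejected) events for constant work per event. Handling arbitrary inter-event time densities, as described in the abstract, would require more than this exponential-clock version.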


Subject(s)
Computer Simulation , Stochastic Processes , Algorithms , Informatics , Markov Chains , Monte Carlo Method
5.
Basic Clin Pharmacol Toxicol ; 124(3): 312-320, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30281896

ABSTRACT

Although continued use of antidepressants (AD) has been found to be associated with a lower risk of suicide, AD discontinuation is reported repeatedly. The aim of this case-control study, thus, was to assess whether discontinuation of AD was associated with an increased risk of suicide, by sex and age group. The Social and Health Information System of Friuli Venezia Giulia Region, Italy, was used to collect data on suicides, diagnoses and AD use from 2005 to 2014. We selected, as cases, all suicides that had at least one prescription of AD in the 730 days before death (N = 876), and we matched each case, by age and sex, with five controls from the general population. Conditional logistic regression analyses were used to assess the association between suicide and modifications of AD treatment. We found that 70% of suicides and controls from the general population discontinued AD in the 2 years before the index date. Two-thirds of them discontinued two or more times. Discontinuation of AD, however, did not represent a significant risk factor for suicide. More appropriate care of depression, particularly by primary care physicians who widely prescribe AD, should be fostered in order to prevent suicide. However, more research is needed to assess the extent to which AD discontinuation can affect suicidal risk.


Subject(s)
Antidepressive Agents/administration & dosage , Substance Withdrawal Syndrome/epidemiology , Suicide/statistics & numerical data , Adolescent , Adult , Age Factors , Aged , Antidepressive Agents/adverse effects , Case-Control Studies , Child , Child, Preschool , Depressive Disorder/drug therapy , Depressive Disorder/epidemiology , Female , Humans , Italy/epidemiology , Male , Middle Aged , Prescriptions , Risk Factors , Sex Factors , Young Adult
6.
IEEE/ACM Trans Comput Biol Bioinform ; 15(4): 1180-1192, 2018.
Article in English | MEDLINE | ID: mdl-29990108

ABSTRACT

Calibrating parameters is a crucial problem within quantitative modeling approaches to reaction networks. Existing methods for stochastic models rely either on statistical sampling or can only be applied to small systems. Here, we present an inference procedure for stochastic models in equilibrium that is based on a moment matching scheme with optimal weighting and that can be used with high-throughput data such as those collected by flow cytometry. Our method does not require an approximation of the underlying equilibrium probability distribution and, if reaction rate constants have to be learned, the optimal values can be computed by solving a linear system of equations. We discuss important practical issues such as the selection of the moments and evaluate the effectiveness of the proposed approach on three case studies.
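
The remark that rate constants can be obtained from a linear system can be illustrated on a toy case. The Python sketch below is an assumption-laden illustration, not the procedure of the paper: it uses a birth-death process (production at rate `k_birth`, degradation at rate `k_death * x`), treats `k_death` as known so that the scale is fixed, and solves the stationary moment equations by ordinary least squares with identity weights rather than the optimal weighting the abstract mentions. All names are hypothetical.

```python
import numpy as np


def estimate_birth_rate(samples, k_death):
    """Estimate k_birth of a birth-death process from equilibrium samples.

    At stationarity the time derivatives of E[X] and E[X^2] vanish, which
    gives two equations that are linear in the unknown k_birth:
        order 1:  k_birth * 1            = k_death * m1
        order 2:  k_birth * (2 * m1 + 1) = k_death * (2 * m2 - m1)
    where m1 and m2 are the first two empirical moments.
    """
    x = np.asarray(samples, dtype=float)
    m1 = x.mean()
    m2 = (x ** 2).mean()
    design = np.array([[1.0], [2.0 * m1 + 1.0]])
    target = k_death * np.array([m1, 2.0 * m2 - m1])
    # Unweighted least squares; a weighted scheme would rescale the rows.
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(solution[0])


# Synthetic check: the equilibrium distribution is Poisson with mean
# k_birth / k_death, so Poisson(5) samples with k_death = 2 imply k_birth = 10.
rng = np.random.default_rng(0)
estimate = estimate_birth_rate(rng.poisson(5.0, size=200_000), k_death=2.0)
```

Note that no estimate of the full equilibrium distribution is needed; only sample moments enter the equations, which is what makes the approach attractive for large sample sets.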


Subject(s)
Gene Regulatory Networks , Models, Biological , Systems Biology/methods , Gene Regulatory Networks/genetics , Gene Regulatory Networks/physiology , Stochastic Processes
7.
Phys Rev E ; 97(1-1): 012301, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29448315

ABSTRACT

Contact processes form a large and highly interesting class of dynamic processes on networks, including epidemic and information-spreading networks. While devising stochastic models of such processes is relatively easy, analyzing them is very challenging from a computational point of view, particularly for large networks appearing in real applications. One strategy to reduce the complexity of their analysis is to rely on approximations, often in terms of a set of differential equations capturing the evolution of a random node, distinguishing nodes with different topological contexts (i.e., different degrees of different neighborhoods), such as degree-based mean-field (DBMF), approximate-master-equation (AME), or pair-approximation (PA) approaches. The number of differential equations so obtained is typically proportional to the maximum degree k_{max} of the network, which is much smaller than the size of the master equation of the underlying stochastic model, yet numerically solving these equations can still be problematic for large k_{max}. In this paper, we consider AME and PA, extended to cope with multiple local states, and we provide an aggregation procedure that clusters together nodes having similar degrees, treating those in the same cluster as indistinguishable, thus reducing the number of equations while preserving an accurate description of global observables of interest. We also provide an automatic way to build such equations and to identify a small number of degree clusters that give accurate results. The method is tested on several case studies, where it shows a high level of compression and a reduction of computational time of several orders of magnitude for large networks, with minimal loss in accuracy.
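
As a rough illustration of clustering nodes by degree, the Python sketch below groups consecutive degrees into clusters that carry approximately equal probability mass, so that one set of equations could be written per cluster instead of per degree. The equal-mass criterion and the function name are assumptions chosen for simplicity; the abstract does not state how the paper selects its clusters.

```python
def cluster_degrees(degree_pmf, n_clusters):
    """Partition degrees into contiguous clusters of similar probability mass.

    degree_pmf: dict mapping degree k to its probability (or any weight).
    Returns a list of at most n_clusters lists of consecutive degrees.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    degrees = sorted(degree_pmf)
    total = sum(degree_pmf[k] for k in degrees)
    clusters, current, mass = [], [], 0.0
    for k in degrees:
        current.append(k)
        mass += degree_pmf[k]
        # Close the cluster once the cumulative mass reaches the next quantile,
        # keeping the last cluster open so it absorbs the tail.
        boundary = total * (len(clusters) + 1) / n_clusters
        if mass >= boundary and len(clusters) < n_clusters - 1:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


# A heavy-tailed degree distribution with k_max = 50 reduced to 5 clusters:
# low degrees get their own clusters, high degrees are lumped together.
weights = {k: k ** -2.0 for k in range(1, 51)}
clusters = cluster_degrees(weights, n_clusters=5)
```

With a heavy-tailed distribution most clusters are narrow at low degree and one wide cluster covers the tail, which is where treating nodes as indistinguishable costs the least probability mass per merged degree.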

8.
Comput Biol Chem ; 56: 98-108, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25909953

ABSTRACT

In this paper, we explore the impact of different forms of model abstraction and the role of discreteness on the dynamical behaviour of a simple model of gene regulation where a transcriptional repressor negatively regulates its own expression. We first investigate the relation between a minimal set of parameters and the system dynamics in a purely discrete stochastic framework, with the twofold purpose of providing an intuitive explanation of the different behavioural patterns exhibited and of identifying the main sources of noise. Then, we explore the effect of combining hybrid approaches and quasi-steady state approximations on model behaviour (and simulation time), to understand to what extent dynamics and quantitative features such as noise intensity can be preserved.
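
A purely discrete stochastic treatment of a self-repressing gene can be sketched with Gillespie's direct method, as below. The reaction set and parameter names are assumptions made for illustration, not the model of the paper: the gene switches off when a protein binds it (at a rate proportional to the protein count), switches back on at a constant rate, and binding does not consume the protein.

```python
import random


def gillespie_autorepressor(k_prod, k_deg, k_bind, k_unbind, t_max, seed=0):
    """Simulate a negatively autoregulated gene with Gillespie's direct method.

    Returns a list of (time, gene_on, protein_count) states.
    """
    rng = random.Random(seed)
    t, gene_on, protein = 0.0, True, 0
    trajectory = [(t, gene_on, protein)]
    while True:
        rates = [
            k_prod if gene_on else 0.0,            # protein production
            k_deg * protein,                       # protein degradation
            k_bind * protein if gene_on else 0.0,  # repressor binds, gene off
            0.0 if gene_on else k_unbind,          # repressor unbinds, gene on
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_max:
            break
        # Choose which reaction fires, with probability proportional to rate.
        r = rng.random() * total
        if r < rates[0]:
            protein += 1
        elif r < rates[0] + rates[1]:
            protein -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            gene_on = False
        else:
            gene_on = True
        trajectory.append((t, gene_on, protein))
    return trajectory


trajectory = gillespie_autorepressor(
    k_prod=10.0, k_deg=1.0, k_bind=0.5, k_unbind=2.0, t_max=20.0, seed=3
)
```

Hybrid or quasi-steady-state variants of the kind the abstract mentions would replace some of these discrete reactions with continuous or averaged dynamics; the sketch covers only the fully discrete baseline.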


Subject(s)
Gene Regulatory Networks , Models, Genetic , Animals , Computer Simulation , Gene Expression Regulation , Humans , Repressor Proteins/genetics , Stochastic Processes , Transcriptional Activation
9.
BMC Struct Biol ; 7: 15, 2007 Mar 23.
Article in English | MEDLINE | ID: mdl-17378941

ABSTRACT

BACKGROUND: Reduced representations of proteins have been playing a key role in the study of protein folding. Many such models are available, with different representation detail. Although the usefulness of many such models for structural bioinformatics applications has been demonstrated in recent years, there are few intermediate resolution models endowed with an energy model capable, for instance, of detecting native or native-like structures among decoy sets. The aim of the present work is to provide a discrete empirical potential for a reduced protein model termed here PC2CA, because it employs a PseudoCovalent structure with only 2 Centers of interactions per Amino acid, suitable for protein model quality assessment. RESULTS: All protein structures in the set top500H have been converted into reduced form. The distributions of pseudobonds, pseudoangles, pseudodihedrals and distances between centers of interactions have been converted into potentials of mean force. A suitable reference distribution has been defined for non-bonded interactions which takes into account excluded volume effects and protein finite size. The correlation between adjacent main chain pseudodihedrals has been converted into an additional energetic term which is able to account for cooperative effects in secondary structure elements. Local energy surface exploration is performed in order to increase the robustness of the energy function. CONCLUSION: The model and the energy definition proposed have been tested on all the multiple decoy sets in the Decoys'R'us database. The energetic model is able to recognize, for almost all sets, native-like structures (RMSD less than 2.0 Å). These results and those obtained in the blind CASP7 quality assessment experiment suggest that the model compares well with scoring potentials with finer granularity and could be useful for fast exploration of conformational space. Parameters are available at the URL: http://www.dstb.uniud.it/~ffogolari/download/.
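
Converting an observed distribution into a potential of mean force is commonly done by inverse Boltzmann weighting, E = -kT ln(P_obs / P_ref). The Python sketch below shows that generic recipe on histogrammed values; the binning, the pseudocount and the function name are assumptions for illustration, and the reference distribution actually used for PC2CA (with excluded volume and finite-size effects) is not reproduced here.

```python
import numpy as np


def potential_of_mean_force(observed, reference, bin_edges, kT=1.0, pseudocount=1.0):
    """Return one energy per bin from observed and reference samples.

    observed, reference: 1-D sequences of a geometric quantity
    (for example a distance or a pseudodihedral angle).
    The pseudocount keeps empty bins from producing infinite energies.
    """
    obs_counts, _ = np.histogram(observed, bins=bin_edges)
    ref_counts, _ = np.histogram(reference, bins=bin_edges)
    n_bins = len(obs_counts)
    p_obs = (obs_counts + pseudocount) / (obs_counts.sum() + pseudocount * n_bins)
    p_ref = (ref_counts + pseudocount) / (ref_counts.sum() + pseudocount * n_bins)
    # Bins that are over-represented relative to the reference get E < 0.
    return -kT * np.log(p_obs / p_ref)


# Two bins: the first is enriched in the observed data, so it is favorable.
energies = potential_of_mean_force(
    observed=[0.5] * 9 + [1.5],
    reference=[0.5] * 5 + [1.5] * 5,
    bin_edges=[0.0, 1.0, 2.0],
)
```

In this usage the first bin receives a negative energy and the second a positive one, reflecting enrichment and depletion relative to the reference.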


Subject(s)
Models, Statistical , Proteins/chemistry , Thermodynamics , Amino Acid Sequence , Amino Acids/chemistry , Binding Sites , Caspase 7/chemistry , Computer Simulation , Crystallography, X-Ray , Databases, Protein , Models, Chemical , Monte Carlo Method , Protein Binding , Protein Conformation , Protein Folding , Protein Structure, Secondary , Sequence Analysis, Protein , Software , Solvents