Results 1-20 of 33
1.
Sci Rep ; 13(1): 14879, 2023 Sep 09.
Article in English | MEDLINE | ID: mdl-37689770

ABSTRACT

We use an agnostic information-theoretic approach to investigate the statistical properties of natural images. We introduce the Multiscale Relevance (MSR) measure to assess the robustness of images to compression at all scales. Starting in a controlled environment, we characterize the MSR of synthetic random textures as a function of the image roughness [Formula: see text] and other relevant parameters. We then extend the analysis to natural images and find striking similarities with critical ([Formula: see text]) random textures. We show that the MSR is more robust and more informative of image content than classical methods such as power spectrum analysis. Finally, we compare the MSR with classical measures for the calibration of common procedures such as color mapping and denoising. Overall, the MSR approach appears to be a good candidate for advanced image analysis and image processing, while providing a good level of physical interpretability.
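As a companion to this abstract, the sketch below generates a synthetic random texture with a prescribed roughness exponent and recovers it with the classical power-spectrum analysis the MSR is compared against. The spectral-synthesis recipe, the exponent parametrization S(f) ~ f^-(2+2H), and all numerical choices are illustrative assumptions; the MSR measure itself is defined in the paper and is not reproduced here.

```python
import numpy as np

def synthetic_texture(n=256, hurst=0.7, seed=0):
    """Gaussian random texture with power-law spectrum S(f) ~ f^-(2+2H).
    The exponent parametrization is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = np.inf                                  # suppress the zero mode
    amplitude = f ** (-(2 + 2 * hurst) / 2)
    phases = np.exp(2j * np.pi * rng.random((n, n)))
    field = np.fft.ifft2(amplitude * phases).real
    return (field - field.mean()) / field.std()

def spectral_roughness(img):
    """Roughness exponent from a log-log fit of power vs radial frequency."""
    n = img.shape[0]
    power = np.abs(np.fft.fft2(img)) ** 2
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    f = np.sqrt(fx**2 + fy**2).ravel()
    p = power.ravel()
    band = (f > 1.0 / n) & (f < 0.25)                 # intermediate frequency band
    slope, _ = np.polyfit(np.log(f[band]), np.log(p[band]), 1)
    return -(slope + 2) / 2                           # invert S(f) ~ f^-(2+2H)

img = synthetic_texture(hurst=0.7)
print(f"prescribed roughness 0.70, recovered ~ {spectral_roughness(img):.2f}")
```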

2.
Phys Rev E ; 103(5-1): 052121, 2021 May.
Article in English | MEDLINE | ID: mdl-34134259

ABSTRACT

A 1929 Gedankenexperiment proposed by Szilárd, often referred to as "Szilárd's engine", has served as a foundation for computing fundamental thermodynamic bounds to information processing. While Szilárd's original box could be partitioned into two halves and contains one gas molecule, here we calculate the maximal average work that can be extracted in a system with N particles and q partitions, given an observer who counts the molecules in each partition, and given a work-extraction mechanism that is limited to pressure equalization. We find that the average extracted work is proportional to the mutual information between the one-particle position and the vector containing the counts of how many particles are in each partition. We optimize this quantity over the initial locations of the dividing walls and find that there exists a critical number of particles N*(q) below which the extracted work is maximized by a symmetric configuration of the q partitions, and above which the optimal partitioning is asymmetric. Overall, the average extracted work is maximized for a number of particles N̂(q)
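For the two-partition case (q = 2), the mutual information described in the abstract can be evaluated in closed form, since exchangeability gives P(tagged particle on the left | k of N particles on the left) = k/N. The sketch below computes this quantity and optimizes the wall position; the proportionality constant linking it to the extracted work is not taken from the abstract and is left out.

```python
import numpy as np
from scipy.stats import binom
from scipy.optimize import minimize_scalar

def h2(p):
    """Binary entropy in nats, with 0*log(0) = 0 handled by clipping."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def mutual_information(a, n_particles):
    """I(s; n) for q = 2 partitions, wall at position a in (0, 1).
    s = 1 if a tagged particle is in the left partition and n is the number of
    particles found there; exchangeability gives P(s=1 | n=k) = k/N."""
    k = np.arange(n_particles + 1)
    pk = binom.pmf(k, n_particles, a)
    return h2(a) - np.sum(pk * h2(k / n_particles))

N = 10
res = minimize_scalar(lambda a: -mutual_information(a, N),
                      bounds=(0.01, 0.99), method="bounded")
print(f"N = {N}: optimal wall position ~ {res.x:.3f}, I ~ {-res.fun:.4f} nats")
# The abstract states that the average extracted work is proportional to this
# mutual information; the proportionality constant is not reproduced here.
```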

3.
PLoS One ; 15(10): e0239331, 2020.
Article in English | MEDLINE | ID: mdl-33104709

ABSTRACT

Clustering and community detection provide a concise way of extracting meaningful information from large datasets. An ever-growing plethora of data-clustering and community-detection algorithms has been proposed. In this paper, we address the question of ranking the performance of clustering algorithms for a given dataset. We show that, for hard clustering and community detection, Linsker's Infomax principle can be used to rank clustering algorithms. In brief, the algorithm that yields the highest value of the entropy of the partition, for a given number of clusters, is the best one. We show, on a wide range of datasets of various sizes and topological structures, that the ranking provided by the entropy of the partition over a variety of partitioning algorithms is strongly correlated with the overlap with a ground-truth partition. The code related to the project is available at https://github.com/Sandipan99/Ranking_cluster_algorithms.
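A minimal sketch of the ranking rule stated in the abstract: among hard partitions with the same number of clusters, prefer the one with the largest entropy of the cluster-size distribution. The algorithm names and label vectors below are hypothetical placeholders, not outputs from the paper's benchmark.

```python
import numpy as np
from collections import Counter

def partition_entropy(labels):
    """Shannon entropy (nats) of the cluster-size distribution of a hard partition."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

# Hypothetical outputs of three clustering algorithms on the same ten points,
# each returning the same number of clusters (3).
candidates = {
    "algo_A": [0, 0, 0, 0, 0, 0, 1, 1, 2, 2],
    "algo_B": [0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
    "algo_C": [0, 0, 0, 0, 0, 0, 0, 0, 1, 2],
}

for name in sorted(candidates, key=lambda k: partition_entropy(candidates[k]),
                   reverse=True):
    print(name, round(partition_entropy(candidates[name]), 3))
# Per the Infomax criterion described above, the top-ranked entry (the most
# even cluster sizes, here algo_B) would be selected for this number of clusters.
```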


Subjects
Algorithms, User-Computer Interface, Cluster Analysis, Factual Databases
4.
J Comput Neurosci ; 48(1): 85-102, 2020 02.
Article in English | MEDLINE | ID: mdl-31993923

ABSTRACT

Neuronal responses to complex stimuli and tasks can encompass a wide range of time scales. Understanding these responses requires measures that characterize how the information in these response patterns is represented across multiple temporal resolutions. In this paper we propose a metric - which we call multiscale relevance (MSR) - to capture the dynamical variability of the activity of single neurons across different time scales. The MSR is a non-parametric, fully featureless indicator in that it uses only the time stamps of the firing activity, without resorting to any a priori covariate or invoking any specific structure in the tuning curve for neural activity. When applied to neural data from the medial entorhinal cortex (mEC) and from the anterodorsal thalamic nucleus (ADn) and postsubiculum (PoS) of freely behaving rodents, we found that neurons with low MSR tend to have low mutual information and low firing sparsity across the correlates that are believed to be encoded by the recorded brain region. In addition, neurons with high MSR carry significant information on spatial navigation and allow spatial position or head direction to be decoded as efficiently as neurons whose firing activity has high mutual information with the covariate to be decoded, and significantly better than the set of neurons with high local variation in their interspike intervals. Given these results, we propose that the MSR can be used to rank and select neurons for their information content without the need to appeal to any a priori covariate.
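The sketch below computes a multiscale-relevance-style score from spike time stamps alone, following the common relevance/resolution construction (entropy of the bin-occupancy distribution versus entropy of the spike-count "frequency of frequencies") and integrating over time scales. The normalization, the set of bin widths, and the Poisson control train are assumptions for illustration and may differ from the paper's exact definition.

```python
import numpy as np

def relevance_resolution(spike_times, t_start, t_stop, dt):
    """Resolution H[s] and relevance H[k] of a spike train at bin width dt,
    both in nats.  H[s] is the entropy of the spike distribution over bins;
    H[k] is the entropy over spike-count classes (frequency of frequencies)."""
    edges = np.arange(t_start, t_stop + dt, dt)
    counts = np.histogram(spike_times, bins=edges)[0]
    M = counts.sum()
    if M == 0:
        return 0.0, 0.0
    p_bin = counts[counts > 0] / M
    H_s = float(-np.sum(p_bin * np.log(p_bin)))
    ks, mk = np.unique(counts[counts > 0], return_counts=True)
    p_k = ks * mk / M
    H_k = float(-np.sum(p_k * np.log(p_k)))
    return H_s, H_k

def multiscale_relevance(spike_times, t_start, t_stop, n_scales=50):
    """Area under the relevance-vs-resolution curve across bin widths
    (normalization and choice of time scales are illustrative)."""
    duration = t_stop - t_start
    dts = np.logspace(np.log10(duration / 5000), np.log10(duration / 2), n_scales)
    pts = sorted(relevance_resolution(spike_times, t_start, t_stop, dt) for dt in dts)
    hs = np.array([p[0] for p in pts])
    hk = np.array([p[1] for p in pts])
    return float(np.trapz(hk, hs))

rng = np.random.default_rng(1)
poisson_train = np.sort(rng.uniform(0.0, 600.0, size=3000))   # hypothetical ~5 Hz unit
print(f"MSR of a Poisson control train ~ {multiscale_relevance(poisson_train, 0.0, 600.0):.3f}")
```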


Subjects
Action Potentials/physiology, Electrophysiological Phenomena/physiology, Neurons/physiology, Algorithms, Animals, Anterior Thalamic Nuclei/physiology, Bayes Theorem, Brain/physiology, Entorhinal Cortex/physiology, Head, Information Theory, Mice, Rats, Rodents, Spatial Perception/physiology
5.
Neural Comput ; 31(8): 1592-1623, 2019 08.
Article in English | MEDLINE | ID: mdl-31260388

ABSTRACT

We investigate the complexity of logistic regression models, which is defined by counting the number of indistinguishable distributions that the model can represent (Balasubramanian, 1997). We find that the complexity of logistic models with binary inputs depends not only on the number of parameters but also on the distribution of inputs in a nontrivial way that standard treatments of complexity do not address. In particular, we observe that correlations among inputs induce effective dependencies among parameters, thus constraining the model and, consequently, reducing its complexity. We derive simple relations for the upper and lower bounds of the complexity. Furthermore, we show analytically that defining the model parameters on a finite support rather than the entire axis decreases the complexity in a manner that critically depends on the size of the domain. Based on our findings, we propose a novel model selection criterion that takes into account the entropy of the input distribution. We test our proposal on the problem of selecting the input variables of a logistic regression model in a Bayesian model selection framework. In our numerical tests, we find that while the reconstruction errors of standard model selection approaches (AIC, BIC, ℓ1 regularization) strongly depend on the sparsity of the ground truth, the reconstruction error of our method is always close to the minimum in all conditions of sparsity, data size, and strength of input correlations. Finally, we observe that when considering categorical instead of binary inputs, in a simple and mathematically tractable case, the contribution of the alphabet size to the complexity is very small compared to that of parameter space dimension. We further explore the issue by analyzing the data set of the "13 keys to the White House," a method for forecasting the outcomes of US presidential elections.
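One way to see how the input distribution enters the complexity is to evaluate the geometric complexity term of Balasubramanian (1997), log of the integral of sqrt(det J(theta)) over the parameter domain, for a logistic model with a single binary input on a finite parameter box. The sketch below does this by brute-force grid integration; the model, the box size, and the grid are illustrative assumptions rather than the paper's exact criterion.

```python
import numpy as np
from itertools import product

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def geometric_complexity(p_x1, bound=5.0, grid=200):
    """log of the integral of sqrt(det J(theta)) over a finite parameter box,
    for P(y=1 | x) = sigmoid(b + w*x) with one binary input (x=1 w.p. p_x1).
    The box [-bound, bound]^2 and the grid resolution are illustrative choices."""
    thetas = np.linspace(-bound, bound, grid)
    cell = (thetas[1] - thetas[0]) ** 2
    total = 0.0
    for b, w in product(thetas, thetas):
        fisher = np.zeros((2, 2))
        for x, px in ((0, 1 - p_x1), (1, p_x1)):
            v = np.array([1.0, float(x)])
            s = sigmoid(b + w * x)
            fisher += px * s * (1 - s) * np.outer(v, v)
        total += np.sqrt(max(np.linalg.det(fisher), 0.0)) * cell
    return np.log(total)

for p in (0.5, 0.1):
    print(f"P(x=1) = {p}: geometric complexity ~ {geometric_complexity(p):.3f} nats")
# The term changes with the input distribution alone, illustrating the claim
# that complexity is not fixed by the number of parameters.
```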

6.
Cell Rep ; 27(9): 2759-2771.e5, 2019 05 28.
Article in English | MEDLINE | ID: mdl-31141697

ABSTRACT

Loss of functional cardiomyocytes is a major determinant of heart failure after myocardial infarction. Previous high throughput screening studies have identified a few microRNAs (miRNAs) that can induce cardiomyocyte proliferation and stimulate cardiac regeneration in mice. Here, we show that all of the most effective of these miRNAs activate nuclear localization of the master transcriptional cofactor Yes-associated protein (YAP) and induce expression of YAP-responsive genes. In particular, miR-199a-3p directly targets two mRNAs coding for proteins impinging on the Hippo pathway, the upstream YAP inhibitory kinase TAOK1, and the E3 ubiquitin ligase ß-TrCP, which leads to YAP degradation. Several of the pro-proliferative miRNAs (including miR-199a-3p) also inhibit filamentous actin depolymerization by targeting Cofilin2, a process that by itself activates YAP nuclear translocation. Thus, activation of YAP and modulation of the actin cytoskeleton are major components of the pro-proliferative action of miR-199a-3p and other miRNAs that induce cardiomyocyte proliferation.


Subjects
Apoptosis Regulatory Proteins/metabolism, Biomarkers/metabolism, Cell Proliferation, MicroRNAs/genetics, Cardiac Myocytes/cytology, Cardiac Myocytes/metabolism, Actin Cytoskeleton, Animals, Newborn Animals, Apoptosis Regulatory Proteins/genetics, Cofilin 2/genetics, Cofilin 2/metabolism, Female, Male, Rats, YAP-Signaling Proteins
7.
Entropy (Basel) ; 20(10)2018 Sep 27.
Article in English | MEDLINE | ID: mdl-33265828

ABSTRACT

Models can be simple for different reasons: because they yield a simple and computationally efficient interpretation of a generic dataset (e.g., in terms of pairwise dependencies)-as in statistical learning-or because they capture the laws of a specific phenomenon-as, e.g., in physics-leading to non-trivial falsifiable predictions. In information theory, the simplicity of a model is quantified by the stochastic complexity, which measures the number of bits needed to encode its parameters. To understand what simple models look like, we study the stochastic complexity of spin models with interactions of arbitrary order. We show that bijections within the space of possible interactions preserve the stochastic complexity, which allows us to partition the space of all models into equivalence classes. We thus find that the simplicity of a model is not determined by the order of its interactions, but rather by their mutual arrangement. Models where statistical dependencies are localized on non-overlapping groups of few variables are simple, affording predictions on independencies that are easy to falsify. By contrast, fully connected pairwise models, which are often used in statistical learning, appear to be highly complex because of their extended set of interactions, and they are hard to falsify.

8.
Entropy (Basel) ; 20(10)2018 Oct 01.
Article in English | MEDLINE | ID: mdl-33265844

ABSTRACT

In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are taken as generative models of samples, they generate samples with broad empirical distributions and with a high value of the relevance, defined as the entropy of the empirical frequencies. These results are derived for different statistical models (the Dirichlet model, independent and pairwise-dependent spin models, and restricted Boltzmann machines). Second, MDL codes sit precisely at a second-order phase transition point where the symmetry between the sampled outcomes is spontaneously broken. The order parameter controlling the phase transition is the coding cost of the samples. The phase transition is a manifestation of the optimality of MDL codes, and it arises because codes that achieve a higher compression do not exist. These results suggest a clear interpretation of the widespread occurrence of statistical criticality as a characterization of samples that are maximally informative about the underlying generative process.

9.
Biophys J ; 113(1): 206-213, 2017 Jul 11.
Article in English | MEDLINE | ID: mdl-28700919

ABSTRACT

Competition to bind microRNAs induces an effective positive cross talk between their targets, which are therefore known as "competing endogenous RNAs" (ceRNAs). Although such an effect is known to play a significant role in specific situations, estimating its strength from data, and experimentally in physiological conditions, appears to be far from simple. Here, we show that the susceptibility of ceRNAs to different types of perturbations affecting their competitors (and hence their tendency to cross talk) can be encoded in quantities as intuitive and as simple to measure as correlation functions. This scenario is confirmed by extensive numerical simulations and validated by re-analyzing the cross-talk pattern of phosphatase and tensin homolog (PTEN) from The Cancer Genome Atlas breast cancer database. These results clarify the links between different quantities used to estimate the intensity of ceRNA cross talk and provide, to our knowledge, new keys to analyze transcriptional data sets and effectively probe ceRNA networks in silico.
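A minimal stochastic sketch of the correlation-based picture: one miRNA titrates two targets, and the steady-state Pearson correlation between the targets serves as a simple readout of their tendency to cross talk. The kinetic scheme and all rate constants are hypothetical illustrations, not parameters from the paper or from the TCGA analysis.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical rates for a minimal miRNA (mu) / two-target (m1, m2) titration
# model; values are chosen for illustration only.
b_mu, b_1, b_2 = 8.0, 4.0, 4.0      # transcription rates
d_mu, d_1, d_2 = 0.1, 0.1, 0.1      # spontaneous degradation rates
k_1, k_2 = 0.02, 0.02               # miRNA-target association (joint decay)

def gillespie(t_max=2000.0, burn_in=500.0, sample_dt=1.0):
    mu, m1, m2 = 0, 0, 0
    t, next_sample, samples = 0.0, burn_in, []
    while t < t_max:
        rates = np.array([b_mu, b_1, b_2, d_mu * mu, d_1 * m1, d_2 * m2,
                          k_1 * mu * m1, k_2 * mu * m2])
        total = rates.sum()
        t += rng.exponential(1.0 / total)
        while next_sample <= t and next_sample < t_max:   # record pre-reaction state
            samples.append((m1, m2))
            next_sample += sample_dt
        r = rng.choice(8, p=rates / total)
        if   r == 0: mu += 1
        elif r == 1: m1 += 1
        elif r == 2: m2 += 1
        elif r == 3: mu -= 1
        elif r == 4: m1 -= 1
        elif r == 5: m2 -= 1
        elif r == 6: mu -= 1; m1 -= 1
        else:        mu -= 1; m2 -= 1
    return np.array(samples)

traj = gillespie()
corr = np.corrcoef(traj[:, 0], traj[:, 1])[0, 1]
print(f"steady-state correlation between the two targets: {corr:.3f}")
# A positive correlation between targets that share no direct interaction is
# the signature of cross talk mediated by competition for the common miRNA.
```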


Subjects
Algorithms, Competitive Binding, MicroRNAs/metabolism, Biological Models, Molecular Models, Breast Neoplasms/metabolism, Computer Simulation, DNA-Binding Proteins/chemistry, DNA-Binding Proteins/metabolism, Genetic Databases, Gene Expression Profiling, Humans, Kinetics, MicroRNAs/chemistry, Nuclear Proteins/chemistry, Nuclear Proteins/metabolism, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/metabolism, Stochastic Processes, Tensins/chemistry, Tensins/metabolism, Genetic Transcription/physiology
10.
Sci Rep ; 7(1): 3096, 2017 06 08.
Article in English | MEDLINE | ID: mdl-28596593

ABSTRACT

Random number generation plays an essential role in technology, with important applications in areas ranging from cryptography to Monte Carlo methods and other probabilistic algorithms. All such applications require high-quality sources of random numbers, yet effective methods for assessing whether a source produces truly random sequences are still missing. Current methods either do not rely on a formal description of randomness (the NIST test suite) or are inapplicable in principle (the characterization derived from algorithmic information theory), since the latter would require testing all possible computer programs that could produce the sequence to be analysed. Here we present a rigorous method, based on Bayesian model selection, that overcomes these problems. We derive analytic expressions for a model's likelihood, which is then used to compute its posterior distribution. Our method proves to be more rigorous than the NIST suite and the Borel normality criterion, and its implementation is straightforward. We applied our method to an experimental device based on the process of spontaneous parametric downconversion to confirm that it behaves as a genuine quantum random number generator. Because our approach relies on Bayesian inference, our scheme transcends individual sequence analysis, leading to a characterization of the source itself.
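The flavor of the Bayesian approach can be illustrated by comparing the marginal likelihood of an i.i.d. Bernoulli model against a first-order Markov alternative on a candidate bit sequence. This two-model comparison is a simplified stand-in for the paper's model-selection scheme; the priors, the Markov alternative, and the test sequences below are assumptions for illustration.

```python
import numpy as np
from scipy.special import gammaln

def log_ml_iid(bits):
    """Log marginal likelihood of an i.i.d. Bernoulli model, uniform prior on theta."""
    n1 = int(np.sum(bits)); n0 = len(bits) - n1
    return gammaln(n1 + 1) + gammaln(n0 + 1) - gammaln(n0 + n1 + 2)

def log_ml_markov1(bits):
    """Log marginal likelihood of a first-order Markov model, uniform priors on
    both transition probabilities; the first bit is assigned probability 1/2."""
    counts = np.zeros((2, 2), dtype=int)
    for a, b in zip(bits[:-1], bits[1:]):
        counts[a, b] += 1
    lml = -np.log(2.0)
    for c in (0, 1):
        n0, n1 = counts[c]
        lml += gammaln(n1 + 1) + gammaln(n0 + 1) - gammaln(n0 + n1 + 2)
    return lml

rng = np.random.default_rng(3)
fair = rng.integers(0, 2, size=10000)                   # genuinely i.i.d. bits
sticky = [0]
for _ in range(9999):                                   # biased toward repeating
    sticky.append(sticky[-1] if rng.random() < 0.6 else 1 - sticky[-1])
sticky = np.array(sticky)

for name, seq in (("fair", fair), ("sticky", sticky)):
    log_bf = log_ml_iid(seq) - log_ml_markov1(seq)
    print(f"{name}: log Bayes factor (iid vs Markov) = {log_bf:.1f}")
# Positive values favour the memoryless model (consistent with randomness);
# strongly negative values expose serial structure, here with an explicit
# Bayesian interpretation rather than a battery of counting tests.
```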

11.
Mol Biosyst ; 12(7): 2147-58, 2016 06 21.
Article in English | MEDLINE | ID: mdl-26974515

ABSTRACT

Evolution in its course has found a variety of solutions to the same optimisation problem. The advent of high-throughput genomic sequencing has made available extensive data from which, in principle, one can infer the underlying structure on which biological functions rely. In this paper, we present a new method aimed at extracting the sites that encode structural and functional properties from a set of protein primary sequences, namely a multiple sequence alignment. The method, called critical variable selection, is based on the idea that subsets of relevant sites correspond to subsequences that occur with a particularly broad frequency distribution in the dataset. By applying this algorithm to in silico sequences, to the response regulator receiver domain and to the voltage sensor domain of ion channels, we show that this procedure recovers not only the information encoded in single-site statistics and pairwise correlations but also dependencies going beyond pairwise correlations. The method proposed here is complementary to statistical coupling analysis, in that the most relevant sites predicted by the two methods differ markedly. We find robust and consistent results for datasets as small as a few hundred sequences, revealing a hidden hierarchy of sites that is consistent with present knowledge of biologically relevant sites and evolutionary dynamics. This suggests that critical variable selection is capable of identifying a core of sites encoding functional and structural information in a multiple sequence alignment.
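The scoring idea behind critical variable selection - that relevant subsets of sites yield subsequences with a broad frequency distribution - can be sketched by computing the entropy of the "frequency of frequencies" of subsequences restricted to a chosen set of alignment columns. The toy alignment and the use of this single score (without the fixed-resolution maximization of the full procedure) are illustrative assumptions.

```python
import numpy as np
from collections import Counter

def relevance(msa, columns):
    """Relevance H[k] (nats) of the subsequences obtained by restricting each
    aligned sequence to `columns`: the entropy of the distribution over
    subsequence-frequency classes (the 'frequency of frequencies')."""
    subseqs = ["".join(seq[i] for i in columns) for seq in msa]
    M = len(subseqs)
    freq_of_freq = Counter(Counter(subseqs).values())          # k -> m_k
    pk = np.array([k * mk / M for k, mk in freq_of_freq.items()])
    return float(-(pk * np.log(pk)).sum())

# Tiny hypothetical alignment: columns 0-1 carry a hierarchy of motif
# frequencies (AK common, GR less so, AV and GK rare), while in column 2
# every symbol is equally frequent.
msa = ["AKV", "AKC", "AKL", "AKI", "GRV", "GRC", "AVL", "GKI"]

print("relevance of columns {0,1}:", round(relevance(msa, [0, 1]), 3))
print("relevance of column  {2}  :", round(relevance(msa, [2]), 3))
# Sites whose subsequences span a broad range of frequencies score high and
# are candidates for functionally relevant positions; a column in which all
# symbols are equally common scores zero.
```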


Subjects
Amino Acids/chemistry, Amino Acids/genetics, Codon, Genetic Variation, Proteins/chemistry, Proteins/genetics, Genetic Selection, Algorithms, Amino Acid Sequence, Amino Acid Substitution, Computational Biology/methods, Computer Simulation, Molecular Models, Statistical Models, Protein Conformation
12.
Article in English | MEDLINE | ID: mdl-26274227

ABSTRACT

System-level properties of metabolic networks may be the direct product of natural selection or arise as a by-product of selection on other properties. Here we study the effect of direct selective pressure for growth or viability in particular environments on two properties of metabolic networks: latent versatility to function in additional environments and carbon-usage efficiency. Using Markov chain Monte Carlo (MCMC) sampling based on flux balance analysis (FBA), we sample random viable metabolic networks from a known biochemical universe that differ in the number of directly constrained environments. We find that the latent versatility of sampled metabolic networks increases with the number of directly constrained environments and with the size of the networks. We then show that the average carbon wastage of sampled metabolic networks across the constrained environments decreases with the number of directly constrained environments and with the size of the networks. Our work expands the growing body of evidence about nonadaptive origins of key functional properties of biological networks.
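The inner building block of such a sampling procedure is an ordinary flux balance analysis linear program. The sketch below solves FBA on a tiny hypothetical stoichiometric matrix with scipy; the network, bounds, and objective are toy choices, and the MCMC walk over reaction content described in the abstract is not reproduced.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical): metabolites A and B; reactions
#   R1: -> A (uptake),  R2: A -> B,  R3: A -> (secretion),  R4: B -> biomass.
# Rows of S are metabolites, columns are reactions.
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # A
    [0.0,  1.0,  0.0, -1.0],   # B
])

bounds = [(0, 10.0), (0, None), (0, None), (0, None)]   # uptake capped at 10

# Flux balance analysis: maximize the biomass flux v4 subject to S v = 0.
c = np.array([0.0, 0.0, 0.0, -1.0])                     # linprog minimizes
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
print("optimal fluxes:", np.round(res.x, 3), "| biomass flux:", round(-res.fun, 3))
# In the paper an MCMC walk over which reactions are retained (while keeping
# such an LP feasible in the required environments) samples random viable
# networks; that outer loop is not shown here.
```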


Subjects
Carbon/metabolism, Metabolic Networks and Pathways/genetics, Metabolic Networks and Pathways/physiology, Biological Models, Phenotype, Escherichia coli/genetics, Escherichia coli/metabolism, Genotype, Markov Chains, Monte Carlo Method
13.
PLoS One ; 9(4): e94237, 2014.
Article in English | MEDLINE | ID: mdl-24728096

ABSTRACT

Using public data (Forbes Global 2000), we show that the asset sizes of the largest global firms follow a Pareto distribution in an intermediate range that is "interrupted" by a sharp cut-off in its upper tail, where it is totally dominated by financial firms. This flattening of the distribution contrasts with a large body of empirical literature which finds a Pareto distribution for firm sizes both across countries and over time. Pareto distributions are generally traced back to a mechanism of proportional random growth, based on a regime of constant returns to scale. This makes our finding of an "interrupted" Pareto distribution all the more puzzling, because we provide evidence that financial firms in our sample should operate in such a regime. We claim that the missing mass from the upper tail of the asset-size distribution is a consequence of shadow banking activity and that it provides an (upper) estimate of the size of the shadow banking system. This estimate, which we propose as a shadow banking index, compares well with estimates of the Financial Stability Board until 2009, but shows a sharper rise in shadow banking activity after 2010. Finally, we propose a proportional random growth model that reproduces the observed distribution, thereby providing a quantitative estimate of the intensity of shadow banking activity.
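A minimal sketch of the tail analysis: fit a Pareto exponent by maximum likelihood on the intermediate range and compare the number of firms the fitted law predicts above a large cut-off with the number actually observed, the shortfall being a rough proxy for the "missing mass". The synthetic asset sizes, the cut-off values, and this simple shortfall measure are illustrative assumptions, not the paper's shadow banking index.

```python
import numpy as np

def pareto_mle_alpha(sizes, x_min):
    """Maximum-likelihood Pareto exponent for the tail above x_min (Hill estimator)."""
    tail = np.asarray(sizes, dtype=float)
    tail = tail[tail >= x_min]
    return len(tail) / np.sum(np.log(tail / x_min)), len(tail)

def tail_shortfall(sizes, x_min, x_cut):
    """Firms observed above x_cut versus the number the fitted Pareto law predicts."""
    alpha, n_tail = pareto_mle_alpha(sizes, x_min)
    expected = n_tail * (x_cut / x_min) ** (-alpha)
    observed = int(np.sum(np.asarray(sizes) >= x_cut))
    return alpha, expected, observed

# Synthetic asset sizes: a Pareto body whose upper tail has been cut off,
# mimicking the "interrupted" distribution described above.
rng = np.random.default_rng(4)
assets = (rng.pareto(1.0, size=2000) + 1.0) * 1e9        # tail exponent ~ 1
assets = assets[assets < 3e11]                            # artificial truncation

alpha, expected, observed = tail_shortfall(assets, x_min=1e9, x_cut=2e11)
print(f"fitted alpha ~ {alpha:.2f}; above the cut: expected {expected:.1f}, observed {observed}")
# The shortfall of observed versus expected mass in the far tail is the kind of
# gap the abstract attributes to shadow banking activity.
```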

14.
J R Soc Interface ; 11(95): 20140043, 2014 Jun 06.
Article in English | MEDLINE | ID: mdl-24647905

ABSTRACT

Animals form groups for many reasons, but there are costs and benefits associated with group formation. One of the benefits is collective memory. In groups on the move, social interactions play a crucial role in the cohesion and the ability to make consensus decisions. When migrating from spawning to feeding areas, fish schools need to retain a collective memory of the destination site over thousands of kilometres, and changes in group formation or individual preference can produce sudden changes in migration pathways. We propose a modelling framework, based on stochastic adaptive networks, that can reproduce this collective behaviour. We assume that three factors control group formation and school migration behaviour: the intensity of social interaction, the relative number of informed individuals and the strength of preference that informed individuals have for a particular migration area. We treat these factors independently and relate the individuals' preferences to the experience and memory for certain migration sites. We demonstrate that removal of knowledgeable individuals or alteration of individual preference can produce rapid changes in group formation and collective behaviour. For example, intensive fishing targeting the migratory species and also their preferred prey can reduce both terms to a point at which migration to the destination sites is suddenly stopped. The conceptual approaches represented by our modelling framework may therefore be able to explain large-scale changes in fish migration and spatial distribution.


Subjects
Animal Migration/physiology, Memory, Biological Models, Social Behavior, Animals
16.
Phys Rev E Stat Nonlin Soft Matter Phys ; 85(2 Pt 1): 021116, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22463162

ABSTRACT

We define and study a class of resource allocation processes where gN agents, by repeatedly visiting N resources, try to converge to an optimal configuration where each resource is occupied by at most one agent. The process exhibits a phase transition, as the density g of agents grows, from an absorbing to an active phase. In the latter, even if the number of resources is in principle enough for all agents (g<1), the system never settles to a frozen configuration. We recast these processes in terms of zero-range interacting particles, studying analytically the mean field dynamics and investigating numerically the phase transition in finite dimensions. We find a good agreement with the critical exponents of the stochastic fixed-energy sandpile. The lack of coordination in the active phase also leads to a nontrivial faster-is-slower effect.
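A minimal simulation in the spirit of this abstract: gN agents on N resources, where every agent sharing a resource with others jumps to a uniformly random resource at each parallel step. The update rule and the densities probed are assumptions; they only illustrate the absorbing-versus-active phenomenology, not the paper's exact process or its critical point.

```python
import numpy as np

def late_time_activity(N=1000, g=0.6, steps=2000, seed=5):
    """Fraction of agents forced to move per step under a minimal parallel rule:
    every agent sharing a resource jumps to a uniformly random resource.
    Returns 0 if the absorbing (at most one agent per resource) state is reached."""
    rng = np.random.default_rng(seed)
    location = rng.integers(0, N, size=int(g * N))
    moved = []
    for _ in range(steps):
        counts = np.bincount(location, minlength=N)
        unhappy = counts[location] > 1                 # agents on crowded resources
        if not unhappy.any():
            return 0.0                                 # frozen configuration
        location[unhappy] = rng.integers(0, N, size=int(unhappy.sum()))
        moved.append(unhappy.mean())
    return float(np.mean(moved[-steps // 4:]))

for g in (0.3, 0.6, 0.9):
    print(f"g = {g}: late-time activity ~ {late_time_activity(g=g):.3f}")
# At low enough density the dynamics finds the frozen configuration, while at
# high density (still with g < 1) it can remain active, echoing the
# absorbing-to-active transition described above.
```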


Subjects
Crowding, Game Theory, Theoretical Models, Phase Transition, Resource Allocation, Animals, Computer Simulation, Humans
17.
Proc Natl Acad Sci U S A ; 109(12): 4395-400, 2012 Mar 20.
Article in English | MEDLINE | ID: mdl-22383559

ABSTRACT

The very notion of a social network implies that linked individuals interact repeatedly with each other. This allows them not only to learn successful strategies and adapt to them, but also to condition their own behavior on the behavior of others, in a strategic, forward-looking manner. The theory of repeated games shows that these circumstances are conducive to the emergence of collaboration in simple two-player games. We investigate the extension of this concept to the case where players are engaged in a local contribution game and show that rationality and credibility of threats identify a class of Nash equilibria--which we call "collaborative equilibria"--that have a precise interpretation in terms of subgraphs of the social network. For large network games, the number of such equilibria is exponentially large in the number of players. When incentives to defect are small, equilibria are supported by local structures, whereas when incentives exceed a threshold they acquire a nonlocal nature, requiring a "critical mass" of more than a given fraction of the players to collaborate. Therefore, when incentives are high, an individual deviation typically causes the collapse of collaboration across the whole system. At the same time, higher incentives to defect typically support equilibria with a higher density of collaborators. The resulting picture conforms with several results in sociology and in the experimental literature on game theory, such as the prevalence of collaboration in denser groups and in the structural hubs of sparse networks.


Subjects
Social Support, Algorithms, Communication, Cooperative Behavior, Game Theory, Humans, Psychological Models, Statistical Models, Theoretical Models
18.
PLoS One ; 6(5): e20207, 2011.
Article in English | MEDLINE | ID: mdl-21637714

ABSTRACT

Social learning is defined as the ability of a population to aggregate information, a process which must crucially depend on the mechanisms of social interaction. Consumers choosing which product to buy, or voters deciding which option to take on an important issue, typically weigh external signals against the information gathered from their contacts. Economic models typically predict that correct social learning occurs in large populations unless some individuals display unbounded influence. We challenge this conclusion by showing that an intuitive threshold process of individual adjustment does not always lead to such social learning. We find, specifically, that three generic regimes exist, separated by sharp discontinuous transitions, and only in one of them, where the threshold lies within a suitable intermediate range, does the population learn the correct information. In the other two, where the threshold is either too high or too low, the system either freezes or enters persistent flux, respectively. These regimes are observed across different social networks (both complex and regular), but limited interaction is found to promote correct learning by enlarging the parameter region where it occurs.


Subjects
Learning/physiology, Sensory Thresholds/physiology, Social Support, Computer Simulation, Biological Models, Computer-Assisted Numerical Analysis, Time Factors
19.
Proc Natl Acad Sci U S A ; 106(28): 11433-8, 2009 Jul 14.
Article in English | MEDLINE | ID: mdl-19571013

ABSTRACT

Networks describe a variety of interacting complex systems in social science, biology, and information technology. Usually the nodes of real networks are identified not only by their connections but also by some other characteristics. Examples of characteristics of nodes can be the age, gender, or nationality of a person in a social network, the abundance of proteins in the cell taking part in protein-interaction networks, or the geographical position of airports that are connected by direct flights. Integrating the information on the connections of each node with the information about its characteristics is crucial to discriminating between the essential and negligible characteristics of nodes for the structure of the network. In this paper we propose a general indicator Θ, based on entropy measures, to quantify the dependence of a network's structure on a given set of features. We apply this method to social networks of friendships in U.S. schools, to the protein-interaction network of Saccharomyces cerevisiae, and to the U.S. airport network, showing that the proposed measure provides information that complements other known measures.


Subjects
Theoretical Models, Protein Interaction Mapping/methods, Saccharomyces cerevisiae/metabolism, Social Support, Transportation, Entropy, Humans
20.
Phys Rev E Stat Nonlin Soft Matter Phys ; 79(1 Pt 2): 015101, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19257095

ABSTRACT

We define a minimal model of traffic flows in complex networks in order to study the trade-off between topological-based and traffic-based routing strategies. The resulting collective behavior is obtained analytically for an ensemble of uncorrelated networks and summarized in a rich phase diagram presenting second-order as well as first-order phase transitions between a free-flow phase and a congested phase. We find that traffic control improves global performance, enlarging the free-flow region in parameter space only in heterogeneous networks. Traffic control introduces nonlinear effects and, beyond a critical strength, may trigger the appearance of a congested phase in a discontinuous manner. The model also reproduces the crossover in the scaling of traffic fluctuations empirically observed on the Internet.
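A generic minimal congestion model illustrates the free-flow/congested distinction this abstract analyzes: packets are injected at rate p per node, routed along shortest paths, and each node forwards one packet per step. The network, the purely topological routing (no traffic-aware term), and the backlog-fraction diagnostic are simplifying assumptions, not the paper's model.

```python
import random
from collections import deque
import networkx as nx

def backlog_fraction(G, p, steps=2000, seed=6):
    """Minimal traffic model: each step every node creates a packet with
    probability p addressed to a uniformly random node; every node forwards at
    most one queued packet per step along a shortest path.  Returns the
    fraction of injected packets still in flight at the end, a crude proxy for
    the congestion order parameter."""
    rng = random.Random(seed)
    nodes = list(G.nodes)
    paths = dict(nx.all_pairs_shortest_path(G))
    queues = {v: deque() for v in nodes}
    created = delivered = 0
    for _ in range(steps):
        for v in nodes:                                   # injection
            if rng.random() < p:
                dest = rng.choice(nodes)
                if dest != v:
                    queues[v].append(dest)
                    created += 1
        moves = [(v, queues[v].popleft()) for v in nodes if queues[v]]
        for v, dest in moves:                             # one hop per node per step
            nxt = paths[v][dest][1]
            if nxt == dest:
                delivered += 1
            else:
                queues[nxt].append(dest)
    return (created - delivered) / max(created, 1)

G = nx.barabasi_albert_graph(200, 2, seed=1)              # heterogeneous network
for p in (0.005, 0.2):
    print(f"p = {p}: backlog fraction ~ {backlog_fraction(G, p):.3f}")
# A small backlog fraction indicates the free-flow phase; a fraction that stays
# of order one signals congestion.  A traffic-aware routing term, as in the
# abstract, shifts where this transition occurs.
```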
