Results 1 - 20 of 33
1.
Sci Rep ; 14(1): 13730, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38877083

ABSTRACT

The Friendship Paradox is a simple and powerful statement about node degrees in a graph. However, it only applies to undirected graphs with no edge weights, and the only node characteristic it concerns is degree. Since many social networks are more complex than that, it is useful to generalize this phenomenon, if possible, and a number of papers have proposed different generalizations. Here, we unify these generalizations in a common framework, retaining the focus on undirected graphs and allowing for weighted edges and for numeric node attributes other than degree to be considered, since this extension allows for a clean characterization and links to the original concepts most naturally. While the original Friendship Paradox and the Weighted Friendship Paradox hold for all graphs, considering non-degree attributes actually makes the extensions fail around 50% of the time, given random attribute assignment. We provide simple correlation-based rules to see whether an attribute-based version of the paradox holds. In addition to theory, our simulation and data results show how all the concepts can be applied to synthetic and real networks. Where applicable, we draw connections to prior work to make this an accessible and comprehensive paper that lets one understand the math behind the Friendship Paradox and its basic extensions.
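As a minimal sketch of the unweighted statement in the abstract above, the following compares the mean degree with the mean degree of a "friend" (an edge endpoint reached by following a random edge). The toy graph and the function names are hypothetical illustrations, not taken from the paper.

```python
from statistics import mean

# Hypothetical toy graph (not from the paper): a triangle a-c-d plus a pendant node b.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]

def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def friendship_paradox_means(edges):
    """Return (mean degree, mean degree of a friend) for an undirected graph."""
    adjacency = build_adjacency(edges)
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    mean_degree = mean(degree.values())
    # Each (node, friend) pair contributes the friend's degree once,
    # so high-degree nodes are counted once per friendship they take part in.
    friend_degrees = [degree[v] for u in adjacency for v in adjacency[u]]
    return mean_degree, mean(friend_degrees)

mean_degree, mean_friend_degree = friendship_paradox_means(EDGES)
print(mean_degree, mean_friend_degree)
```

On this toy graph the degrees are 3, 1, 2, 2, so the mean degree is 2 while the mean friend degree is 18/8 = 2.25, consistent with the paradox.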

2.
Perspect Psychol Sci ; : 17456916231212138, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38085919

ABSTRACT

More and more machine learning is applied to human behavior. Increasingly, these algorithms suffer from a hidden but serious problem. It arises because they often predict one thing while hoping for another. Take a recommender system: It predicts clicks but hopes to identify preferences. Or take an algorithm that automates a radiologist: It predicts in-the-moment diagnoses while hoping to identify their reflective judgments. Psychology shows us the gaps between the objectives of such prediction tasks and the goals we hope to achieve: People can click mindlessly; experts can get tired and make systematic errors. We argue such situations are ubiquitous and call them "inversion problems": The real goal requires understanding a mental state that is not directly measured in behavioral data but must instead be inverted from the behavior. Identifying and solving these problems require new tools that draw on both behavioral and computational science.

4.
Sci Rep ; 13(1): 7982, 2023 May 17.
Article in English | MEDLINE | ID: mdl-37198220
6.
PNAS Nexus ; 2(3): pgad003, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36926225

ABSTRACT

Contact tracing is a key tool for managing epidemic diseases like HIV, tuberculosis, COVID-19, and monkeypox. Manual investigations by human-contact tracers remain a dominant way in which this is carried out. This process is limited by the number of contact tracers available, who are often overburdened during an outbreak or epidemic. As a result, a crucial decision in any contact tracing strategy is, given a set of contacts, which person should a tracer trace next? In this work, we develop a formal model that articulates these questions and provides a framework for comparing contact tracing strategies. Through analyzing our model, we give provably optimal prioritization policies via a clean connection to a tool from operations research called a "branching bandit". Examining these policies gives qualitative insight into trade-offs in contact tracing applications.

7.
Sci Rep ; 13(1): 2074, 2023 Feb 06.
Article in English | MEDLINE | ID: mdl-36746993

ABSTRACT

The Friendship Paradox, the principle that "your friends have more friends than you do," is a combinatorial fact about degrees in a graph; but given that many web-based social activities are correlated with a user's degree, this fact has been taken more broadly to suggest the empirical principle that "your friends are also more active than you are." This Generalized Friendship Paradox, the notion that any attribute positively correlated with degree obeys the Friendship Paradox, has been established mathematically in a network-level version that essentially aggregates uniformly over all the edges of a network. Here we show, however, that the natural node-based version of the Generalized Friendship Paradox (which aggregates over nodes, not edges) may fail, even for degree-attribute correlations approaching 1. Whether this version holds depends not only on degree-attribute correlations, but also on the underlying network structure and thus can't be said to be a universal phenomenon. We establish both positive and negative results for this node-based version of the Generalized Friendship Paradox and consider its implications for social-network data.
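The distinction between the edge-based (network-level) and node-based aggregation described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the graph, the attribute values, and the function names `edge_based_gap` and `node_based_gap` are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical toy graph and attribute values (not from the paper).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]
ATTRIBUTE = {"a": 5.0, "b": 1.0, "c": 2.0, "d": 2.0}

def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def edge_based_gap(adjacency, x):
    """Mean attribute over all (node, friend) pairs minus the mean attribute."""
    friend_values = [x[v] for u in adjacency for v in adjacency[u]]
    return mean(friend_values) - mean(x.values())

def node_based_gap(adjacency, x):
    """Average over nodes of each node's mean friend attribute, minus the mean attribute."""
    per_node_means = [mean(x[v] for v in adjacency[u]) for u in adjacency]
    return mean(per_node_means) - mean(x.values())

adjacency = build_adjacency(EDGES)
edge_gap = edge_based_gap(adjacency, ATTRIBUTE)
node_gap = node_based_gap(adjacency, ATTRIBUTE)
print(edge_gap, node_gap)
```

A positive gap means the corresponding version of the paradox holds for that attribute on that graph. Both gaps happen to be positive here; the abstract's point is that the node-based gap can be negative on other network structures even when the attribute is strongly correlated with degree.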

8.
Sci Adv ; 9(1): eabq3200, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36608141

ABSTRACT

Homophily is the seemingly ubiquitous tendency for people to connect and interact with other individuals who are similar to them. This is a well-documented principle and is fundamental for how society organizes. Although many social interactions occur in groups, homophily has traditionally been measured using a graph model, which only accounts for pairwise interactions involving two individuals. Here, we develop a framework using hypergraphs to quantify homophily from group interactions. This reveals natural patterns of group homophily that appear with gender in scientific collaboration and political affiliation in legislative bill cosponsorship and also reveals distinctive gender distributions in group photographs, all of which cannot be fully captured by pairwise measures. At the same time, we show that seemingly natural ways to define group homophily are combinatorially impossible. This reveals important pitfalls to avoid when defining and interpreting notions of group homophily, as higher-order homophily patterns are governed by combinatorial constraints that are independent of human behavior but are easily overlooked.

9.
Sci Rep ; 12(1): 18507, 2022 Nov 02.
Article in English | MEDLINE | ID: mdl-36323714
10.
Sci Rep ; 11(1): 17230, 2021 Aug 20.
Article in English | MEDLINE | ID: mdl-34417527
11.
Nature ; 595(7866): 181-188, 2021 07.
Article in English | MEDLINE | ID: mdl-34194044

ABSTRACT

Computational social science is more than just large repositories of digital data and the computational methods needed to construct and analyse them. It also represents a convergence of different fields with different ways of thinking about and doing science. The goal of this Perspective is to provide some clarity around how these approaches differ from one another and to propose how they might be productively integrated. Towards this end we make two contributions. The first is a schema for thinking about research activities along two dimensions (the extent to which work is explanatory, focusing on identifying and estimating causal effects, and the degree of consideration given to testing predictions of outcomes) and how these two priorities can complement, rather than compete with, one another. Our second contribution is to advocate that computational social scientists devote more attention to combining prediction and explanation, which we call integrative modelling, and to outline some practical suggestions for realizing this goal.


Subject(s)
Computer Simulation; Data Science/methods; Forecasting/methods; Models, Theoretical; Social Sciences/methods; Goals; Humans
12.
Sci Rep ; 11(1): 13360, 2021 06 25.
Article in English | MEDLINE | ID: mdl-34172813

ABSTRACT

Homophily, the tendency of nodes to connect to others of the same type, is a central issue in the study of networks. Here we take a local view of homophily, defining notions of first-order homophily of a node (its individual tendency to link to similar others) and second-order homophily of a node (the aggregate first-order homophily of its neighbors). Through this view, we find a surprising result for homophily values that applies with only minimal assumptions on the graph topology. It can be phrased most simply as "in a graph of red and blue nodes, red friends of red nodes are on average more homophilous than red friends of blue nodes". This gap in averages defies simple intuitive explanations, applies to globally heterophilous and homophilous networks and is reminiscent of but structurally distinct from the Friendship Paradox. The existence of this gap suggests intrinsic biases in homophily measurements between groups, and hence is relevant to empirical studies of homophily in networks.
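One possible formalization of the two local measures named in this abstract is sketched below. It assumes, as an illustration only, that first-order homophily is the fraction of a node's neighbors sharing its color and that second-order homophily is the plain mean of its neighbors' first-order values; the paper's exact definitions may differ. The graph and all names are hypothetical.

```python
from statistics import mean

# Hypothetical toy graph of red and blue nodes (not from the paper).
COLORS = {"r1": "red", "r2": "red", "b1": "blue", "b2": "blue"}
EDGES = [("r1", "r2"), ("r1", "b1"), ("r2", "b2"), ("b1", "b2"), ("r1", "b2")]

def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def first_order_homophily(adjacency, colors):
    """Assumed definition: fraction of each node's neighbors with the same color."""
    return {
        node: sum(colors[nbr] == colors[node] for nbr in nbrs) / len(nbrs)
        for node, nbrs in adjacency.items()
    }

def second_order_homophily(adjacency, first_order):
    """Assumed definition: mean first-order homophily over each node's neighbors."""
    return {
        node: mean(first_order[nbr] for nbr in nbrs)
        for node, nbrs in adjacency.items()
    }

adjacency = build_adjacency(EDGES)
first_order = first_order_homophily(adjacency, COLORS)
second_order = second_order_homophily(adjacency, first_order)
print(first_order["r1"], second_order["r1"])
```

Here node r1 has neighbors r2, b1, and b2, so its first-order value is 1/3, and its second-order value is the mean of 1/2, 1/2, and 1/3, which is 4/9.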

13.
Proc Natl Acad Sci U S A ; 118(22)2021 06 01.
Article in English | MEDLINE | ID: mdl-34035166

ABSTRACT

As algorithms are increasingly applied to screen applicants for high-stakes decisions in employment, lending, and other domains, concerns have been raised about the effects of algorithmic monoculture, in which many decision-makers all rely on the same algorithm. This concern invokes analogies to agriculture, where a monocultural system runs the risk of severe harm from unexpected shocks. Here, we show that the dangers of algorithmic monoculture run much deeper, in that monocultural convergence on a single algorithm by a group of decision-making agents, even when the algorithm is more accurate for any one agent in isolation, can reduce the overall quality of the decisions being made by the full collection of agents. Unexpected shocks are therefore not needed to expose the risks of monoculture; it can hurt accuracy even under "normal" operations and even for algorithms that are more accurate when used by only a single decision-maker. Our results rely on minimal assumptions and involve the development of a probabilistic framework for analyzing systems that use multiple noisy estimates of a set of alternatives.


Subject(s)
Algorithms; Culture; Models, Theoretical; Social Welfare; Humans
14.
Proc Natl Acad Sci U S A ; 117(48): 30096-30100, 2020 12 01.
Article in English | MEDLINE | ID: mdl-32723823

ABSTRACT

Preventing discrimination requires that we have means of detecting it, and this can be enormously difficult when human beings are making the underlying decisions. As applied today, algorithms can increase the risk of discrimination. But as we argue here, algorithms by their nature require a far greater level of specificity than is usually possible with human decision making, and this specificity makes it possible to probe aspects of the decision in additional ways. With the right changes to legal and regulatory systems, algorithms can thus potentially make it easier to detect, and hence to help prevent, discrimination.

15.
Proc Natl Acad Sci U S A ; 115(48): E11221-E11230, 2018 11 27.
Article in English | MEDLINE | ID: mdl-30413619

ABSTRACT

Networks provide a powerful formalism for modeling complex systems by using a model of pairwise interactions. But much of the structure within these systems involves interactions that take place among more than two nodes at once (for example, communication within a group rather than person to person, collaboration among a team rather than a pair of coauthors, or biological interaction between a set of molecules rather than just two). Such higher-order interactions are ubiquitous, but their empirical study has received limited attention, and little is known about possible organizational principles of such structures. Here we study the temporal evolution of 19 datasets with explicit accounting for higher-order interactions. We show that there is a rich variety of structure in our datasets but datasets from the same system types have consistent patterns of higher-order structure. Furthermore, we find that tie strength and edge density are competing positive indicators of higher-order organization, and these trends are consistent across interactions involving differing numbers of nodes. To systematically further the study of theories for such higher-order structures, we propose higher-order link prediction as a benchmark problem to assess models and algorithms that predict higher-order structure. We find a fundamental difference from traditional pairwise link prediction, with a greater role for local rather than long-range information in predicting the appearance of new interactions.

16.
Q J Econ ; 133(1): 237-293, 2018 Feb 01.
Article in English | MEDLINE | ID: mdl-29755141

ABSTRACT

Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; and these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals. JEL Codes: C10 (Econometric and statistical methods and methodology), C55 (Large datasets: Modeling and analysis), K40 (Legal procedure, the legal system, and illegal behavior).

17.
KDD ; 2017: 275-284, 2017 Aug.
Article in English | MEDLINE | ID: mdl-29780658

ABSTRACT

Evaluating whether machines improve on human performance is one of the central questions of machine learning. However, there are many domains where the data is selectively labeled in the sense that the observed outcomes are themselves a consequence of the existing choices of the human decision-makers. For instance, in the context of judicial bail decisions, we observe the outcome of whether a defendant fails to return for their court appearance only if the human judge decides to release the defendant on bail. This selective labeling makes it harder to evaluate predictive models as the instances for which outcomes are observed do not represent a random sample of the population. Here we propose a novel framework for evaluating the performance of predictive models on selectively labeled data. We develop an approach called contraction which allows us to compare the performance of predictive models and human decision-makers without resorting to counterfactual inference. Our methodology harnesses the heterogeneity of human decision-makers and facilitates effective evaluation of predictive models even in the presence of unmeasured confounders (unobservables) which influence both human decisions and the resulting outcomes. Experimental results on real world datasets spanning diverse domains such as health care, insurance, and criminal justice demonstrate the utility of our evaluation metric in comparing human decisions and machine predictions.

18.
Proc Natl Acad Sci U S A ; 114(1): 33-38, 2017 01 03.
Article in English | MEDLINE | ID: mdl-27999183

ABSTRACT

Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the "seed set expansion problem": given a subset [Formula: see text] of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of "landing probabilities" of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work, we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter [Formula: see text] that depends on the block model parameters. This connection provides a formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance, despite being simple linear classification rules, and are even competitive with belief propagation.
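The "weighted sums of landing probabilities" view of personalized PageRank mentioned in this abstract can be sketched as below. This is a truncated, pure-Python illustration under stated assumptions, not the authors' method: the name `alpha` for the PageRank parameter, the geometric weights (1 - alpha) * alpha**k, the truncation at `num_steps`, and the toy graph are all choices made for the example.

```python
# Hypothetical toy graph: two triangles {a, b, c} and {d, e, f} joined by the edge c-d.
ADJACENCY = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

def landing_probabilities(adjacency, seeds, num_steps):
    """Return [r_0, ..., r_num_steps], where r_k[n] is the probability that a
    simple random walk started uniformly on the seed set is at node n after k steps."""
    current = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adjacency}
    history = [current]
    for _ in range(num_steps):
        nxt = {n: 0.0 for n in adjacency}
        for node, mass in current.items():
            share = mass / len(adjacency[node])
            for nbr in adjacency[node]:
                nxt[nbr] += share
        current = nxt
        history.append(current)
    return history

def personalized_pagerank(adjacency, seeds, alpha=0.5, num_steps=20):
    """Truncated personalized PageRank: sum over k of (1 - alpha) * alpha**k * r_k."""
    scores = {n: 0.0 for n in adjacency}
    for k, r_k in enumerate(landing_probabilities(adjacency, seeds, num_steps)):
        weight = (1.0 - alpha) * alpha**k
        for node, prob in r_k.items():
            scores[node] += weight * prob
    return scores

scores = personalized_pagerank(ADJACENCY, seeds={"a"}, alpha=0.5, num_steps=20)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Swapping the geometric weights for other weight sequences over the same landing probabilities gives other rankings of the same family, such as the heat kernel methods the abstract mentions. With the seed in the first triangle and `alpha=0.5`, node b scores higher than its mirror-image node e in the second triangle.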

19.
Am Econ Rev ; 105(5): 491-495, 2015 May.
Article in English | MEDLINE | ID: mdl-27199498
20.
Philos Trans A Math Phys Eng Sci ; 371(1987): 20120378, 2013 Mar 28.
Article in English | MEDLINE | ID: mdl-23419847

ABSTRACT

The growth of the Web has required us to think about the design of information systems in which large-scale computational and social feedback effects are simultaneously at work. At the same time, the data generated by Web-scale systems (recording the ways in which millions of participants create content, link information, form groups and communicate with one another) have made it possible to evaluate long-standing theories of social interaction, and to formulate new theories based on what we observe. These developments have created a new level of interaction between computing and the social sciences, enriching the perspectives of both of these disciplines. We discuss some of the observations, theories and conclusions that have grown from the study of Web-scale social interaction, focusing on issues including the mechanisms by which people join groups, the ways in which different groups are linked together in social networks and the interplay of positive and negative interactions in these networks.
