Results 1 - 18 of 18
1.
Front Psychol ; 14: 1276285, 2023.
Article in English | MEDLINE | ID: mdl-38314252

ABSTRACT

Diurnal variations in indicators of emotion have been reliably observed in Twitter content, but confirmation of their circadian nature has not been possible due to the many confounding factors present in the data. We report on correlations between those indicators in Twitter content obtained from 9 cities of Italy and 54 cities in the United Kingdom, sampled hourly at the time of the 2020 national lockdowns. This experimental setting aims at minimizing synchronization effects related to television, eating habits, or other cultural factors. This correlation supports a circadian origin for these diurnal variations, although it does not exclude the possibility that similar zeitgebers exist in both countries including during lockdowns.

2.
PLoS One ; 16(6): e0251559, 2021.
Article in English | MEDLINE | ID: mdl-34061875

ABSTRACT

In Western societies, the stereotype prevails that pink is for girls and blue is for boys. A third possible gendered colour is red. While liked by women, it represents power, stereotypically a masculine characteristic. Empirical studies confirmed such gendered connotations when testing colour-emotion associations or colour preferences in males and females. Furthermore, empirical studies demonstrated that pink is a positive colour, blue is mainly a positive colour, and red is both a positive and a negative colour. Here, we assessed if the same valence and gender connotations appear in widely available written texts (Wikipedia and newswire articles). Using a word embedding method (GloVe), we extracted gender and valence biases for blue, pink, and red, as well as for the remaining basic colour terms from a large English-language corpus containing six billion words. We found and confirmed that pink was biased towards femininity and positivity, and blue was biased towards positivity. We found no strong gender bias for blue, and no strong gender or valence biases for red. For the remaining colour terms, we only found that green, white, and brown were positively biased. Our finding on pink shows that writers of widely available English texts use this colour term to convey femininity. This gendered communication reinforces the notion that results from research studies find their analogue in real-world phenomena. Other findings were either consistent or inconsistent with results from research studies. We argue that widely available written texts have biases on their own, because they have been filtered according to context, time, and what is appropriate to be reported.
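
As a rough illustration of how a gender bias for a colour term might be read off word embeddings, the sketch below scores a word by the difference between its cosine similarity to a female pole and to a male pole. The three-dimensional vectors, the choice of "she" and "he" as poles, and the `gender_bias` helper are all invented for illustration; they are not GloVe vectors, and this is not necessarily the scoring used in the study.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy embeddings (hypothetical values, chosen only to make the example readable).
VECTORS = {
    "she": [0.9, 0.1, 0.0],
    "he": [0.1, 0.9, 0.0],
    "pink": [0.8, 0.2, 0.3],
    "blue": [0.5, 0.5, 0.4],
}

def gender_bias(word, vectors=VECTORS):
    """Positive values lean towards the female pole, negative towards the male pole."""
    w = vectors[word]
    return cosine(w, vectors["she"]) - cosine(w, vectors["he"])
```

With these toy values, `gender_bias("pink")` is positive and `gender_bias("blue")` is zero, which mirrors the direction of the reported findings only because the vectors were chosen that way.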


Subject(s)
Language , Sexism , Color , Female , Humans , Male , Young Adult
3.
Chronobiol Int ; 38(11): 1591-1610, 2021 11.
Article in English | MEDLINE | ID: mdl-34134583

ABSTRACT

Diurnal variation in psychometric indicators of emotion found in Twitter content has been known for many years. The degree to which this pattern depends upon different environmental zeitgebers has been difficult to determine. The nationwide lockdown in the United Kingdom in spring 2020 provided a unique government-mandated experiment to observe the temporal variation of psychometric indicators in the absence of certain specific social rhythms related to commuting and workplace social activities as well as many normal home-based social activities. We therefore analyzed the aggregated Twitter content of 54 UK cities in the 9 weeks of complete lockdown, comparing them with the 10 weeks that preceded them (as well as with the corresponding weeks of 2019). We observed that the key indicators of emotion retained their diurnal behavior. This suggests that even during lockdown there are still sufficient zeitgebers to maintain this diurnal variation in indicators of emotion.


Subject(s)
Social Media , Circadian Rhythm , Cities , Emotions , Humans , Seasons
4.
Br J Psychiatry ; 215(2): 481-484, 2019 08.
Article in English | MEDLINE | ID: mdl-30924435

ABSTRACT

The state of an individual's mental health depends on many factors. Determination of the importance of any particular factor within a population needs access to unbiased data. We used publicly available data-sets to investigate, at a population level, how surrogates of mental health covary with light exposure. We found strong seasonal patterns of antidepressant prescriptions, which show stronger correlations with day length than levels of solar energy. Levels of depression in a population can therefore be determined by proxy indicators such as web query logs. Furthermore, these proxies for depression correlate with day length rather than solar energy. Declaration of interest: None.


Subject(s)
Antidepressive Agents/therapeutic use , Drug Prescriptions/statistics & numerical data , Search Engine/statistics & numerical data , Seasonal Affective Disorder/drug therapy , Seasons , Sunlight , Humans , Mental Health , Search Engine/trends , United Kingdom
5.
PLoS One ; 13(6): e0197002, 2018.
Article in English | MEDLINE | ID: mdl-29924814

ABSTRACT

The psychological state of a person is characterised by cognitive and emotional variables which can be inferred by psychometric methods. Using the word lists from the Linguistic Inquiry and Word Count, designed to infer a range of psychological states from the word usage of a person, we studied temporal changes in the average expression of psychological traits in the general population. We sampled the contents of Twitter in the United Kingdom at hourly intervals for a period of four years, revealing a strong diurnal rhythm in most of the psychometric variables, and finding that two independent factors can explain 85% of the variance across their 24-h profiles. The first has peak expression time starting at 5am/6am; it correlates with measures of analytical thinking, with the language of drive (e.g., power and achievement), and personal concerns. It is anticorrelated with the language of negative affect and social concerns. The second factor has peak expression time starting at 3am/4am; it correlates with the language of existential concerns, and anticorrelates with expression of positive emotions. Overall, we see strong evidence that our language changes dramatically between night and day, reflecting changes in our concerns and underlying cognitive and emotional processes. These shifts occur at times associated with major changes in neural activity and hormonal levels.
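
The word-list approach described above can be sketched, under simplifying assumptions, as counting how often tokens from a category list occur in the text posted in each hour. The `NEGATIVE_AFFECT` set below is a made-up stand-in rather than the actual Linguistic Inquiry and Word Count lexicon, and `hourly_rate` is a hypothetical helper that uses plain whitespace tokenisation.

```python
from collections import defaultdict

# Hypothetical stand-in for one category word list; not the real LIWC lexicon.
NEGATIVE_AFFECT = {"sad", "angry", "afraid"}

def hourly_rate(posts, word_list):
    """Return {hour: fraction of tokens that belong to word_list}.

    posts is an iterable of (hour, text) pairs, with hour in 0..23.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for hour, text in posts:
        tokens = text.lower().split()
        totals[hour] += len(tokens)
        hits[hour] += sum(1 for token in tokens if token in word_list)
    # Skip hours with no tokens to avoid dividing by zero.
    return {hour: hits[hour] / totals[hour] for hour in totals if totals[hour] > 0}
```

Computing one such 24-value profile per category gives the kind of table on which a factor analysis of diurnal profiles could then be run.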


Subject(s)
Circadian Rhythm , Cognition , Emotions , Models, Psychological , Social Media , Female , Humans , Male , Psychometrics , United Kingdom
6.
Minds Mach (Dordr) ; 28(4): 735-774, 2018.
Article in English | MEDLINE | ID: mdl-30930542

ABSTRACT

Interactions between an intelligent software agent (ISA) and a human user are ubiquitous in everyday situations such as access to information, entertainment, and purchases. In such interactions, the ISA mediates the user's access to the content, or controls some other aspect of the user experience, and is not designed to be neutral about outcomes of user choices. Like human users, ISAs are driven by goals, make autonomous decisions, and can learn from experience. Using ideas from bounded rationality (and deploying concepts from artificial intelligence, behavioural economics, control theory, and game theory), we frame these interactions as instances of an ISA whose reward depends on actions performed by the user. Such agents benefit by steering the user's behaviour towards outcomes that maximise the ISA's utility, which may or may not be aligned with that of the user. Video games, news recommendation aggregation engines, and fitness trackers can all be instances of this general case. Our analysis facilitates distinguishing various subcases of interaction (i.e. deception, coercion, trading, and nudging), as well as second-order effects that might include the possibility for adaptive interfaces to induce behavioural addiction, and/or change in user belief. We present these types of interaction within a conceptual framework, and review current examples of persuasive technologies and the issues that arise from their use. We argue that the nature of the feedback commonly used by learning agents to update their models and subsequent decisions could steer the behaviour of human users away from what benefits them, and in a direction that can undermine autonomy and cause further disparity between actions and goals as exemplified by addictive and compulsive behaviour. We discuss some of the ethical, social and legal implications of this technology and argue that it can sometimes exploit and reinforce weaknesses in human beings.

7.
Brain Neurosci Adv ; 1: 2398212817744501, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-29270466

ABSTRACT

BACKGROUND: Circadian regulation of sleep, cognition, and metabolic state is driven by a central clock, which is in turn entrained by environmental signals. Understanding the circadian regulation of mood, which is vital for coping with day-to-day needs, requires large datasets and has classically utilised subjective reporting. METHODS: In this study, we use a massive dataset of over 800 million Twitter messages collected over 4 years in the United Kingdom. We extract robust signals of the changes that happened during the course of the day in the collective expression of emotions and fatigue. We use methods of statistical analysis and Fourier analysis to identify periodic structures, extrema, change-points, and compare the stability of these events across seasons and weekends. RESULTS: We reveal strong, but different, circadian patterns for positive and negative moods. The cycles of fatigue and anger appear remarkably stable across seasons and weekend/weekday boundaries. Positive mood and sadness interact more in response to these changing conditions. Anger and, to a lower extent, fatigue show a pattern that inversely mirrors the known circadian variation of plasma cortisol concentrations. Most quantities show a strong inflexion in the morning. CONCLUSION: Since circadian rhythm and sleep disorders have been reported across the whole spectrum of mood disorders, we suggest that analysis of social media could provide a valuable resource to the understanding of mental disorder.

8.
Proc Natl Acad Sci U S A ; 114(4): E457-E465, 2017 01 24.
Article in English | MEDLINE | ID: mdl-28069962

ABSTRACT

Previous studies have shown that it is possible to detect macroscopic patterns of cultural change over periods of centuries by analyzing large textual time series, specifically digitized books. This method promises to empower scholars with a quantitative and data-driven tool to study culture and society, but its power has been limited by the use of data from books and simple analytics based essentially on word counts. This study addresses these problems by assembling a vast corpus of regional newspapers from the United Kingdom, incorporating very fine-grained geographical and temporal information that is not available for books. The corpus spans 150 years and is formed by millions of articles, representing 14% of all British regional outlets of the period. Simple content analysis of this corpus allowed us to detect specific events, like wars, epidemics, coronations, or conclaves, with high accuracy, whereas the use of more refined techniques from artificial intelligence enabled us to move beyond counting words by detecting references to named entities. These techniques allowed us to observe both a systematic underrepresentation and a steady increase of women in the news during the 20th century and the change of geographic focus for various concepts. We also estimate the dates when electricity overtook steam and trains overtook horses as a means of transportation, both around the year 1900, along with observing other cultural transitions. We believe that these data-driven approaches can complement the traditional method of close reading in detecting trends of continuity and change in historical corpora.

9.
PLoS One ; 11(11): e0165736, 2016.
Article in English | MEDLINE | ID: mdl-27824911

ABSTRACT

We address the problem of observing periodic changes in the behaviour of a large population, by analysing the daily contents of newspapers published in the United States and United Kingdom from 1836 to 1922. This is done by analysing the daily time series of the relative frequency of the 25K most frequent words for each country, resulting in the study of 50K time series for 31,755 days. Behaviours that are found to be strongly periodic include seasonal activities, such as hunting and harvesting. A strong connection with natural cycles is found, with a pronounced presence of fruits, vegetables, flowers and game. Periodicities dictated by religious or civil calendars are also detected and show a different wave-form than those provoked by weather. States that can be revealed include the presence of infectious disease, with clear annual peaks for fever, pneumonia and diarrhoea. Overall, 2% of the words are found to be strongly periodic, and the period most frequently found is 365 days. Comparisons between UK and US, and between modern and historical news, reveal how the fundamental cycles of life are shaped by the seasons, but also how this effect has been reduced in modern times.


Subject(s)
Behavior , Newspapers as Topic/statistics & numerical data , Periodicity , Fourier Analysis , History, 19th Century , History, 20th Century , Humans , Language , Newspapers as Topic/history , Seasons , United Kingdom , United States , Weather
10.
PLoS One ; 11(2): e0148434, 2016.
Article in English | MEDLINE | ID: mdl-26840432

ABSTRACT

Feminist news media researchers have long contended that masculine news values shape journalists' quotidian decisions about what is newsworthy. As a result, it is argued, topics and issues traditionally regarded as primarily of interest and relevance to women are routinely marginalised in the news, while men's views and voices are given privileged space. When women do show up in the news, it is often as "eye candy," thus reinforcing women's value as sources of visual pleasure rather than residing in the content of their views. To date, evidence to support such claims has tended to be based on small-scale, manual analyses of news content. In this article, we report on findings from our large-scale, data-driven study of gender representation in online English language news media. We analysed both words and images so as to give a broader picture of how gender is represented in online news. The corpus of news content examined consists of 2,353,652 articles collected over a period of six months from more than 950 different news outlets. From this initial dataset, we extracted 2,171,239 references to named persons and 1,376,824 images resolving the gender of names and faces using automated computational methods. We found that males were represented more often than females in both images and text, but in proportions that changed across topics, news outlets and mode. Moreover, the proportion of females was consistently higher in images than in text, for virtually all topics and news outlets; women were more likely to be represented visually than they were mentioned as a news actor or source. Our large-scale, data-driven analysis offers important empirical evidence of macroscopic patterns in news content concerning the way men and women are represented.


Subject(s)
Feminism , Internet , Mass Media , Female , Humans , Male , Newspapers as Topic
11.
PLoS One ; 5(12): e14243, 2010 Dec 08.
Article in English | MEDLINE | ID: mdl-21170383

ABSTRACT

BACKGROUND: A trend towards automation of scientific research has recently resulted in what has been termed "data-driven inquiry" in various disciplines, including physics and biology. The automation of many tasks has been identified as a possible future also for the humanities and the social sciences, particularly in those disciplines concerned with the analysis of text, due to the recent availability of millions of books and news articles in digital format. In the social sciences, the analysis of news media is done largely by hand and in a hypothesis-driven fashion: the scholar needs to formulate a very specific assumption about the patterns that might be in the data, and then set out to verify if they are present or not. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we report what we think is the first large scale content-analysis of cross-linguistic text in the social sciences, by using various artificial intelligence techniques. We analyse 1.3 M news articles in 22 languages detecting a clear structure in the choice of stories covered by the various outlets. This is significantly affected by objective national, geographic, economic and cultural relations among outlets and countries, e.g., outlets from countries sharing strong economic ties are more likely to cover the same stories. We also show that the deviation from average content is significantly correlated with membership to the eurozone, as well as with the year of accession to the EU. CONCLUSIONS/SIGNIFICANCE: While independently making a multitude of small editorial decisions, the leading media of the 27 EU countries, over a period of six months, shaped the contents of the EU mediasphere in a way that reflects its deep geographic, economic and cultural relations. Detecting these subtle signals in a statistically rigorous way would be out of the reach of traditional methods. This analysis demonstrates the power of the available methods for significant automation of media content analysis.


Subject(s)
Culture , Data Collection , Mass Media , Automation , Books , European Union , Humans , Research , Social Sciences
12.
Neural Netw ; 23(4): 466-70, 2010 May.
Article in English | MEDLINE | ID: mdl-20211540

ABSTRACT

Statistical approaches to Artificial Intelligence are behind most success stories of the field in the past decade. The idea of generating non-trivial behaviour by analysing vast amounts of data has enabled recommendation systems, search engines, spam filters, optical character recognition, machine translation and speech recognition, among other things. As we celebrate the spectacular achievements of this line of research, we need to assess its full potential and its limitations. What are the next steps to take towards machine intelligence?


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Algorithms , Congresses as Topic , Humans , Pattern Recognition, Automated , User-Computer Interface
13.
PLoS One ; 1: e85, 2006 Dec 20.
Article in English | MEDLINE | ID: mdl-17183716

ABSTRACT

Gene families are groups of homologous genes that are likely to have highly similar functions. Differences in family size due to lineage-specific gene duplication and gene loss may provide clues to the evolutionary forces that have shaped mammalian genomes. Here we analyze the gene families contained within the whole genomes of human, chimpanzee, mouse, rat, and dog. In total we find that more than half of the 9,990 families present in the mammalian common ancestor have either expanded or contracted along at least one lineage. Additionally, we find that a large number of families are completely lost from one or more mammalian genomes, and a similar number of gene families have arisen subsequent to the mammalian common ancestor. Along the lineage leading to modern humans we infer the gain of 689 genes and the loss of 86 genes since the split from chimpanzees, including changes likely driven by adaptive natural selection. Our results imply that humans and chimpanzees differ by at least 6% (1,418 of 22,000 genes) in their complement of genes, which stands in stark contrast to the oft-cited 1.5% difference between orthologous nucleotide sequences. This genomic "revolving door" of gene gain and loss represents a large number of genetic differences separating humans from our closest relatives.


Subject(s)
Biological Evolution , Mammals/genetics , Multigene Family , Animals , Dogs , Humans , Mice , Pan troglodytes/genetics , Phylogeny , Primates/genetics , Rats , Rodentia/genetics , Selection, Genetic
14.
Bioinformatics ; 22(10): 1269-71, 2006 May 15.
Article in English | MEDLINE | ID: mdl-16543274

ABSTRACT

SUMMARY: We present CAFE (Computational Analysis of gene Family Evolution), a tool for the statistical analysis of the evolution of the size of gene families. It uses a stochastic birth and death process to model the evolution of gene family sizes over a phylogeny. For a specified phylogenetic tree, and given the gene family sizes in the extant species, CAFE can estimate the global birth and death rate of gene families, infer the most likely gene family size at all internal nodes, identify gene families that have accelerated rates of gain and loss (quantified by a p-value) and identify which branches cause the p-value to be small for significant families. AVAILABILITY: Software is available from http://www.bio.indiana.edu/~hahnlab/Software.html


Subject(s)
Algorithms , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Evolution, Molecular , Multigene Family/genetics , Software , Genetic Variation/genetics , Phylogeny , User-Computer Interface
15.
Genome Res ; 15(8): 1153-60, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16077014

ABSTRACT

Comparison of whole genomes has revealed that changes in the size of gene families among organisms are quite common. However, there are as yet no models of gene family evolution that make it possible to estimate ancestral states or to infer upon which lineages gene families have contracted or expanded. In addition, large differences in family size have generally been attributed to the effects of natural selection, without a strong statistical basis for these conclusions. Here we use a model of stochastic birth and death for gene family evolution and show that it can be efficiently applied to multispecies genome comparisons. This model takes into account the lengths of branches on phylogenetic trees, as well as duplication and deletion rates, and hence provides expectations for divergence in gene family size among lineages. The model offers both the opportunity to identify large-scale patterns in genome evolution and the ability to make stronger inferences regarding the role of natural selection in gene family expansion or contraction. We apply our method to data from the genomes of five yeast species to show its applicability.
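
To make the idea of a stochastic birth and death process concrete, the sketch below simulates the size of one gene family along a single branch, with each gene duplicating at rate `birth` and being lost at rate `death` per unit of branch length. This is a generic event-by-event simulation written for illustration; the function name and parameters are assumptions, and it is not the likelihood computation used in the paper.

```python
import random

def simulate_family_size(n0, birth, death, t_end, rng):
    """Simulate gene family size after branch length t_end, starting from n0 genes.

    birth and death are per-gene rates; birth + death must be positive.
    rng is a random.Random instance, passed in so runs are reproducible.
    """
    size, t = n0, 0.0
    while size > 0:  # a family of size 0 is extinct and cannot change again
        total_rate = size * (birth + death)
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            break
        if rng.random() < birth / (birth + death):
            size += 1  # duplication
        else:
            size -= 1  # loss
    return size
```

Repeating the simulation many times, for example with `rng = random.Random(0)`, gives an empirical distribution of family sizes at the end of the branch. When the birth and death rates are equal, the mean size stays close to `n0` while the spread grows with branch length.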


Subject(s)
Evolution, Molecular , Genomics/methods , Models, Genetic , Multigene Family/genetics , Likelihood Functions , Phylogeny , Saccharomyces/genetics , Stochastic Processes
16.
Neuroinformatics ; 3(2): 115-31, 2005.
Article in English | MEDLINE | ID: mdl-15988041

ABSTRACT

Generating informational thesauri that classify, cross-reference, and retrieve diverse and highly detailed neuroscientific information requires identifying related neuroanatomical terms and acronyms within and between species (Gorin et al., 2001). Manual construction of such informational thesauri is laborious, and we describe implementing and evaluating a neuroanatomical term and acronym reconciliation (NTAR) system to assist domain experts with this task. NTAR is composed of two modules. The neuroanatomical term extraction (NTE) module employs a hidden Markov model (HMM) in conjunction with lexical rules to extract neuroanatomical terms (NT) and acronyms (NA) from textual material. The output of the NTE is formatted into collections of term- or acronym-indexed documents composed of sentences and word phrases extracted from textual material. The second information retrieval (IR) module utilizes a vector space model (VSM) and includes a novel, automated relevance feedback algorithm. The IR module retrieves statistically related neuroanatomical terms and acronyms in response to queried neuroanatomical terms and acronyms. Neuroanatomical terms and acronyms retrieval obtained from term-based inquiries were compared with (1) term retrieval obtained by including automated relevance feedback and with (2) term retrieval using "document-to-document" comparisons (context-based VSM). The retrieval of synonymous and similar primate and macaque thalamic terms and acronyms in response to a query list of human thalamic terminology by these three IR approaches was compared against a previously published, manually constructed concordance table of homologous cross-species terms and acronyms. Term-based VSM with automated relevance feedback retrieved 70% and 80% of these primate and macaque terms and acronyms, respectively, listed in the concordance table. Automated feedback algorithm correctly identified 87% of the macaque terms and acronyms that were independently selected by a domain expert as being appropriate for manual relevance feedback. Context-based VSM correctly retrieved 97% and 98% of the primate and macaque terms and acronyms listed in the term homology table. These results indicate that the NTAR system could assist neuroscientists with thesauri creation for closely related, highly detailed neuroanatomical domains.
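
The "document-to-document" comparison in a vector space model can be sketched as follows: each term is represented by the bag of words found in its context document, and candidate terms are ranked by cosine similarity to the query term's context. The context strings and the `rank_terms` helper are invented for illustration and use raw term counts; the weighting and relevance feedback used by NTAR are not reproduced here.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_terms(query, contexts):
    """Rank all other terms by similarity of their context text to the query's."""
    query_vec = Counter(contexts[query].lower().split())
    scored = [
        (cosine(query_vec, Counter(text.lower().split())), term)
        for term, text in contexts.items()
        if term != query
    ]
    return [term for _, term in sorted(scored, reverse=True)]

# Toy term-indexed context documents (hypothetical).
CONTEXTS = {
    "pulvinar": "posterior thalamic nucleus visual relay",
    "Pul": "posterior thalamic nucleus visual",
    "hippocampus": "medial temporal lobe memory",
}
```

With these toy contexts, `rank_terms("pulvinar", CONTEXTS)` places `"Pul"` ahead of `"hippocampus"`, because the two thalamic entries share most of their context words.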


Subject(s)
Information Systems , Neuroanatomy/methods , Software , Terminology as Topic , Vocabulary, Controlled , Algorithms , Animals , Databases, Factual , Humans , Information Systems/instrumentation
17.
Pac Symp Biocomput ; : 483-94, 2005.
Article in English | MEDLINE | ID: mdl-15759653

ABSTRACT

We present a method for inference of transcriptional modules from heterogeneous data sources. It allows identifying the responsible set of regulators in combination with their corresponding DNA recognition sites (motifs) and target genes. Our approach distinguishes itself from previous work in literature because it fully exploits the knowledge of three independently acquired data sources: ChIP-chip data; motif information as obtained by phylogenetic shadowing; and gene expression profiles obtained using microarray experiments. Moreover, these three data sources are dealt with in a new and fully integrated manner. By avoiding approaches that take the different data sources into account sequentially or iteratively, the transparency of the method and the interpretability of the results are ensured. Using our method on biological data demonstrated the biological relevance of the inference.


Subject(s)
Oligonucleotide Array Sequence Analysis , Saccharomyces cerevisiae/genetics , Transcription, Genetic , Algorithms , Cell Cycle/genetics , Fungal Proteins/genetics , Models, Genetic , Reproducibility of Results , Ribosomes/genetics , Saccharomyces cerevisiae/cytology
18.
Bioinformatics ; 20(16): 2626-35, 2004 Nov 01.
Article in English | MEDLINE | ID: mdl-15130933

ABSTRACT

MOTIVATION: During the past decade, the new focus on genomics has highlighted a particular challenge: to integrate the different views of the genome that are provided by various types of experimental data. RESULTS: This paper describes a computational framework for integrating and drawing inferences from a collection of genome-wide measurements. Each dataset is represented via a kernel function, which defines generalized similarity relationships between pairs of entities, such as genes or proteins. The kernel representation is both flexible and efficient, and can be applied to many different types of data. Furthermore, kernel functions derived from different types of data can be combined in a straightforward fashion. Recent advances in the theory of kernel methods have provided efficient algorithms to perform such combinations in a way that minimizes a statistical loss function. These methods exploit semidefinite programming techniques to reduce the problem of finding optimizing kernel combinations to a convex optimization problem. Computational experiments performed using yeast genome-wide datasets, including amino acid sequences, hydropathy profiles, gene expression data and known protein-protein interactions, demonstrate the utility of this approach. A statistical learning algorithm trained from all of these data to recognize particular classes of proteins--membrane proteins and ribosomal proteins--performs significantly better than the same algorithm trained on any single type of data. AVAILABILITY: Supplementary data at http://noble.gs.washington.edu/proj/sdp-svm
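
The step of combining kernels from different data types can be illustrated with a weighted sum of kernel matrices computed over the same set of entities. In the sketch below the weights are fixed by hand, whereas the abstract describes learning them through semidefinite programming, which is not shown. The helper names and the toy feature values are assumptions made for the example.

```python
def linear_kernel(features):
    """Kernel matrix of pairwise dot products between feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices defined over the same entities.

    With non-negative weights, a sum of valid kernels is itself a valid kernel.
    """
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Two toy "views" of the same two proteins (hypothetical feature values).
expression_view = [[1.0, 0.0], [0.0, 1.0]]
sequence_view = [[1.0], [1.0]]

combined = combine_kernels(
    [linear_kernel(expression_view), linear_kernel(sequence_view)],
    [0.5, 0.5],
)
```

The combined matrix can then be passed to any kernel-based learner in place of a single-source kernel.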


Subject(s)
Algorithms , Chromosome Mapping/methods , Databases, Protein , Gene Expression Profiling/methods , Models, Genetic , Proteins/genetics , Sequence Analysis, Protein/methods , Artificial Intelligence , Databases, Genetic , Fungal Proteins/chemistry , Fungal Proteins/genetics , Genomics/methods , Information Storage and Retrieval/methods , Membrane Proteins/genetics , Membrane Proteins/metabolism , Models, Statistical , Pattern Recognition, Automated , Proteins/analysis , Proteins/chemistry , Proteins/classification , Ribosomal Proteins/chemistry , Ribosomal Proteins/genetics , Sequence Alignment , Sequence Homology, Amino Acid , Systems Integration