3.
PLoS One ; 17(3): e0264713, 2022.
Article in English | MEDLINE | ID: covidwho-1745319

ABSTRACT

In most big cities, public transport is an enclosed and crowded space and is therefore considered one of the most important triggers of COVID-19 spread. Most existing research on the mobility of people and COVID-19 spread investigates highly frequented paths by analyzing data collected from mobile devices, mainly geo-positioning records. In contrast, this paper tackles the problem by studying mass mobility. The relations between daily mobility on public transport (subway or metro) in three big cities and mortality due to COVID-19 are investigated. Data collected for these purposes come from official sources, such as the web pages of the cities' local governments. To provide a systematic framework, we applied the IBM Foundational Methodology for Data Science to the epidemiological domain of this paper. Our analysis consists of moving averages with a seven-day window, so as to avoid bias due to weekly patterns. Among the main findings of this work are: a) New York City and Madrid show similar distributions of the studied variables, which resemble a Gaussian bell curve, in contrast to Mexico City, and b) non-pharmaceutical interventions do not bring immediate results; reductions in the number of deaths due to COVID-19 are observed only after a certain number of days. This paper yields partial evidence for assessing the effectiveness of public policies in mitigating the COVID-19 pandemic.
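
The seven-day moving window mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `moving_average`, the trailing-window convention, and the sample counts are assumptions made for illustration only.

```python
from collections import deque


def moving_average(values, window=7):
    """Trailing moving average; yields one value per full window.

    Hypothetical helper for illustration, not taken from the paper.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)   # holds the most recent `window` values
    running = 0.0                # sum of the values currently in `buf`
    out = []
    for v in values:
        if len(buf) == window:
            running -= buf[0]    # value about to be evicted by append()
        buf.append(v)
        running += v
        if len(buf) == window:
            out.append(running / window)
    return out


# Made-up daily ridership counts with a weekend dip (two identical weeks).
daily = [10, 10, 10, 10, 10, 3, 3] * 2
smoothed = moving_average(daily, window=7)
```

Because every seven-day window in this made-up series contains exactly one full weekly cycle, the smoothed series is flat, which is the sense in which a seven-day window removes weekly patterns.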


Subject(s)
COVID-19/mortality , Transportation , Adult , COVID-19/epidemiology , Cities/epidemiology , Cities/statistics & numerical data , Data Science/methods , Humans , Mexico/epidemiology , New York City/epidemiology , Spain/epidemiology , Transportation/methods , Transportation/statistics & numerical data
4.
Int J Environ Res Public Health ; 19(6), 2022 03 16.
Article in English | MEDLINE | ID: covidwho-1742469

ABSTRACT

Data science is an interdisciplinary field that applies numerous techniques, such as machine learning (ML), neural networks (NN) and artificial intelligence (AI), to create value, based on extracting knowledge and insights from available 'big' data [...].


Subject(s)
COVID-19 , Data Science , Artificial Intelligence , COVID-19/epidemiology , Delivery of Health Care , Health Facilities , Humans
5.
Inj Prev ; 28(1): 74-80, 2022 02.
Article in English | MEDLINE | ID: covidwho-1642894

ABSTRACT

OBJECTIVE: The purpose of this research is to identify how data science is applied in suicide prevention literature, describe the current landscape of this literature and highlight areas where data science may be useful for future injury prevention research. DESIGN: We conducted a literature review of injury prevention and data science in April 2020 and January 2021 in three databases. METHODS: For the included 99 articles, we extracted the following: (1) author(s) and year; (2) title; (3) study approach; (4) reason for applying data science method; (5) data science method type; (6) study description; (7) data source and (8) focus on a disproportionately affected population. RESULTS: Results showed the literature on data science and suicide more than doubled from 2019 to 2020, with articles using individual-level approaches more prevalent than those using population-level approaches. Most population-level articles applied data science methods to describe outcomes (n=10), while most individual-level articles identified risk factors (n=27). Machine learning was the most common data science method applied in the studies (n=48). A wide array of data sources was used for suicide research, with most articles (n=45) using social media and web-based behaviour data. Eleven studies demonstrated the value of applying data science to suicide prevention literature for disproportionately affected groups. CONCLUSION: Data science techniques proved to be effective tools in describing suicidal thoughts or behaviour, identifying individual risk factors and predicting outcomes. Future research should focus on identifying how data science can be applied in other injury-related topics.


Subject(s)
Data Science , Suicide , Health Services Research , Humans , Risk Factors , Suicidal Ideation , Suicide/prevention & control
6.
BMC Bioinformatics ; 22(1): 607, 2021 Dec 20.
Article in English | MEDLINE | ID: covidwho-1633689

ABSTRACT

BACKGROUND: Biomolecular interactions that modulate biological processes occur mainly in cavities throughout the surface of biomolecular structures. In the data science era, structural biology has benefited from the increasing availability of biostructural data due to advances in structural determination and computational methods. In this scenario, data-intensive cavity analysis demands efficient scripting routines built on easily manipulated data structures. To fulfill this need, we developed pyKVFinder, a Python package to detect and characterize cavities in biomolecular structures for data science and automated pipelines. RESULTS: pyKVFinder efficiently detects cavities in biomolecular structures and computes their volume, area, depth and hydropathy, storing these cavity properties in NumPy arrays. Benefiting from the interoperability and data structures of the Python ecosystem, pyKVFinder can be integrated with third-party scientific packages and libraries for mathematical calculations, machine learning and 3D visualization in automated workflows. As proof of pyKVFinder's capabilities, we successfully identified and compared the ADRP substrate-binding site of SARS-CoV-2 and a set of homologous proteins with pyKVFinder, showing its integrability with data science packages such as matplotlib, NGL Viewer, SciPy and Jupyter notebook. CONCLUSIONS: We introduce efficient, highly versatile and easily integrable software for detecting and characterizing biomolecular cavities in data science applications and automated protocols. pyKVFinder facilitates biostructural data analysis with scripting routines in the Python ecosystem and can serve as a building block for data science and drug design applications.


Subject(s)
COVID-19 , Data Science , Data Analysis , Ecosystem , Humans , SARS-CoV-2
7.
JCO Clin Cancer Inform ; 5: 881-896, 2021 08.
Article in English | MEDLINE | ID: covidwho-1551280

ABSTRACT

Cancer Informatics for Cancer Centers (CI4CC) is a grassroots, nonprofit 501c3 organization intended to provide a focused national forum for engagement of senior cancer informatics leaders, primarily aimed at academic cancer centers anywhere in the world but with a special emphasis on the 70 National Cancer Institute-funded cancer centers. This consortium has regularly held topic-focused biannual face-to-face symposiums. These meetings are a place to review cancer informatics and data science priorities and initiatives, providing a forum for discussion of the strategic and pragmatic issues that we faced at our respective institutions and cancer centers. Here, we provide meeting highlights from the latest CI4CC Symposium, which was delayed from its original April 2020 schedule because of the COVID-19 pandemic and held virtually over three days (September 24, October 1, and October 8) in the fall of 2020. In addition to the content presented, we found that holding this event virtually once a week for 6 hours was a great way to keep the kind of deep engagement that a face-to-face meeting engenders. This is the second such publication of CI4CC Symposium highlights, the first covering the meeting that took place in Napa, California, from October 14-16, 2019. We conclude with some thoughts about using data science to learn from every child with cancer, focusing on emerging activities of the National Cancer Institute's Childhood Cancer Data Initiative.


Subject(s)
COVID-19 , Medical Informatics , Neoplasms , Adolescent , Child , Data Science , Humans , Neoplasms/epidemiology , Neoplasms/therapy , Pandemics , SARS-CoV-2 , Young Adult
8.
Drug Discov Today ; 26(11): 2515-2526, 2021 11.
Article in English | MEDLINE | ID: covidwho-1540581

ABSTRACT

Over the past few decades, the number of health and 'omics-related data' generated and stored has grown exponentially. Patient information can be collected in real time and explored using various artificial intelligence (AI) tools in clinical trials; mobile devices can also be used to improve aspects of both the diagnosis and treatment of diseases. In addition, AI can be used in the development of new drugs or for drug repurposing, in faster diagnosis and more efficient treatment for various diseases, as well as to identify data-driven hypotheses for scientists. In this review, we discuss how AI is starting to revolutionize the life sciences sector.


Subject(s)
Artificial Intelligence , Biological Science Disciplines , Biotechnology , Clinical Trials as Topic , Data Science , Drug Design , Drug Development , Electronic Health Records , Humans , Mobile Applications , Natural Language Processing , Pharmacology , Publishing
9.
Philos Trans A Math Phys Eng Sci ; 380(2214): 20210127, 2022 Jan 10.
Article in English | MEDLINE | ID: covidwho-1528263

ABSTRACT

During the COVID-19 pandemic, more than ever, data science has become a powerful weapon in combating an infectious disease epidemic and arguably any future infectious disease epidemic. Computer scientists, data scientists, physicists and mathematicians have joined public health professionals and virologists to confront the largest pandemic in the century by capitalizing on the large-scale 'big data' generated and harnessed for combating the COVID-19 pandemic. In this paper, we review the newly born data science approaches to confronting COVID-19, including the estimation of epidemiological parameters, digital contact tracing, diagnosis, policy-making, resource allocation, risk assessment, mental health surveillance, social media analytics, drug repurposing and drug development. We compare the new approaches with conventional epidemiological studies, discuss lessons we learned from the COVID-19 pandemic, and highlight opportunities and challenges of data science approaches to confronting future infectious disease epidemics. This article is part of the theme issue 'Data science approaches to infectious disease surveillance'.


Subject(s)
COVID-19 , Pandemics , Contact Tracing , Data Science , Humans , Pandemics/prevention & control , SARS-CoV-2
10.
Philos Trans A Math Phys Eng Sci ; 380(2214): 20210115, 2022 Jan 10.
Article in English | MEDLINE | ID: covidwho-1528254

ABSTRACT

Novel data science approaches are needed to confront large-scale infectious disease epidemics such as COVID-19, human immunodeficiency viruses, African swine flu and Ebola. Human beings are now equipped with richer data and more advanced data analytics methodologies, many of which have become available only in the last decade. The theme issue Data Science Approaches to Infectious Diseases Surveillance reports the latest interdisciplinary research on developing novel data science methodologies to capitalize on the rich 'big data' of human behaviours to confront infectious diseases, with a particular focus on combating the ongoing COVID-19 pandemic. Compared to conventional public health research, articles in this issue present innovative data science approaches that were not possible without the growing human behaviour data and the recent advances in information and communications technology. This issue has 12 research papers and one review paper from a strong lineup of contributors from multiple disciplines, including data science, computer science, computational social sciences, applied maths, statistics, physics and public health. This introductory article provides a brief overview of the issue and discusses the future of this emerging field. This article is part of the theme issue 'Data science approaches to infectious disease surveillance'.


Subject(s)
COVID-19 , Communicable Diseases , Communicable Diseases/epidemiology , Data Science , Humans , Pandemics , SARS-CoV-2
11.
Sci Data ; 8(1): 297, 2021 11 22.
Article in English | MEDLINE | ID: covidwho-1528020

ABSTRACT

The Covid Symptom Study, a smartphone-based surveillance study on COVID-19 symptoms in the population, is an exemplar of big data citizen science. As of May 23rd, 2021, over 5 million participants have collectively logged over 360 million self-assessment reports since its introduction in March 2020. The success of the Covid Symptom Study creates significant technical challenges around effective data curation. The primary issue is scale. The size of the dataset means that it can no longer be readily processed using standard Python-based data analytics software such as Pandas on commodity hardware. Alternative technologies exist but carry a higher technical complexity and are less accessible to many researchers. We present ExeTera, a Python-based open source software package designed to provide Pandas-like data analytics on datasets that approach terabyte scales. We present its design and capabilities, and show how it is a critical component of a data curation pipeline that enables reproducible research across an international research group for the Covid Symptom Study.


Subject(s)
COVID-19/epidemiology , Citizen Science , Data Curation , Big Data , Data Science , Datasets as Topic , Epidemiological Monitoring , Humans , Mobile Applications , Smartphone , Software
12.
Per Med ; 18(6): 573-582, 2021 09.
Article in English | MEDLINE | ID: covidwho-1456228

ABSTRACT

Advancing frontiers of clinical research, we discuss the need for intelligent health systems to support a deeper investigation of COVID-19. We hypothesize that the convergence of healthcare data and staggering developments in artificial intelligence has the potential to elevate the recovery process with diagnostic and predictive analysis to identify major causes of mortality, modifiable risk factors and actionable information that supports the early detection and prevention of COVID-19. However, current constraints include the recruitment of COVID-19 patients for research; translational integration of electronic health records and diversified public datasets; and the development of artificial intelligence systems for data-intensive computational modeling to assist clinical decision making. We propose a novel nexus of machine learning algorithms to examine COVID-19 data granularity from population studies to subgroup stratification and ensure the best modeling strategies within the data continuum.


Subject(s)
COVID-19/therapy , Precision Medicine/methods , Algorithms , Artificial Intelligence/trends , Data Analysis , Data Science/trends , Delivery of Health Care , Electronic Health Records , Humans , Machine Learning , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity
13.
Nat Commun ; 12(1): 5757, 2021 10 01.
Article in English | MEDLINE | ID: covidwho-1447304

ABSTRACT

The large amount of biomedical data derived from wearable sensors, electronic health records, and molecular profiling (e.g., genomics data) is rapidly transforming our healthcare systems. The increasing scale and scope of biomedical data not only generates enormous opportunities for improving health outcomes but also raises new challenges ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, we developed the Personal Health Dashboard (PHD), which utilizes state-of-the-art security and scalability technologies to provide an end-to-end solution for big biomedical data analytics. The PHD platform is an open-source software framework that can be easily configured and deployed to any big data health project to store, organize, and process complex biomedical data sets, support real-time data analysis at both the individual level and the cohort level, and ensure participant privacy at every step. In addition to presenting the system, we illustrate the use of the PHD framework for large-scale applications in emerging multi-omics disease studies, such as collection and visualization of diverse data types (wearable, clinical, omics) at a personal level, investigation of insulin resistance, and an infrastructure for the detection of presymptomatic COVID-19.


Subject(s)
Data Science/methods , Medical Records Systems, Computerized , Big Data , Computer Security , Data Analysis , Health Information Interoperability , Humans , Information Storage and Retrieval , Software
14.
PLoS Biol ; 19(9): e3001398, 2021 09.
Article in English | MEDLINE | ID: covidwho-1440978

ABSTRACT

Hypothesis generation in observational, biomedical data science often starts with computing an association or identifying the statistical relationship between a dependent and an independent variable. However, the outcome of this process depends fundamentally on modeling strategy, with differing strategies generating what can be called "vibration of effects" (VoE). VoE is defined by variation in associations that often lead to contradictory results. Here, we present a computational tool capable of modeling VoE in biomedical data by fitting millions of different models and comparing their output. We execute a VoE analysis on a series of widely reported associations (e.g., carrot intake associated with eyesight) with an extended additional focus on lifestyle exposures (e.g., physical activity) and components of the Framingham Risk Score for cardiovascular health (e.g., blood pressure). We leveraged our tool for potential confounder identification, investigating what adjusting variables are responsible for conflicting models. We propose modeling VoE as a critical step in navigating discovery in observational data, discerning robust associations, and cataloging adjusting variables that impact model output.
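
To make the idea of "vibration of effects" concrete, the following Python sketch refits an ordinary least-squares model for every subset of adjusting variables and records the exposure coefficient each time. It is a toy illustration under stated assumptions, not the authors' tool: the function names `ols` and `vibration_of_effects`, the use of plain normal equations, and the six-row data set are all hypothetical, and the sketch assumes the design matrix is not collinear.

```python
from itertools import combinations


def ols(X, y):
    """Least-squares coefficients via normal equations and Gauss-Jordan
    elimination. Assumes X'X is invertible (no collinear columns)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def vibration_of_effects(exposure, outcome, adjusters):
    """Map each subset of adjuster names to the fitted exposure coefficient."""
    names = sorted(adjusters)
    results = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            # Columns: intercept, exposure, then the chosen adjusters.
            X = [
                [1.0, exposure[i]] + [adjusters[n][i] for n in subset]
                for i in range(len(outcome))
            ]
            results[subset] = ols(X, outcome)[1]
    return results


# Made-up data: the outcome depends only on the confounder z.
z = [0, 1, 2, 3, 4, 5]
x = [0, 2, 1, 4, 3, 6]
y = [2 * v for v in z]
effects = vibration_of_effects(x, y, {"z": z})
```

In this made-up example the unadjusted model attributes a clearly positive effect to `x`, while adjusting for `z` drives the coefficient to approximately zero. The spread between those two estimates is the kind of variation a VoE analysis catalogues across many model specifications.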


Subject(s)
Data Science/methods , Models, Statistical , Observational Studies as Topic/statistics & numerical data , Epidemiologic Methods , Humans
15.
Health Policy Plan ; 37(1): 100-111, 2022 Jan 13.
Article in English | MEDLINE | ID: covidwho-1345730

ABSTRACT

We used big data analytics for exploring the relationship between government response policies, human mobility trends and numbers of coronavirus disease 2019 (COVID-19) cases comparatively in Poland, Turkey and South Korea. We collected daily mobility data of retail and recreation, grocery and pharmacy, parks, transit stations, workplaces, and residential areas. For quantifying the actions taken by governments and making a fair comparison between these countries, we used stringency index values measured with the 'Oxford COVID-19 government response tracker'. For the Turkey case, we also developed a model by implementing the multilayer perceptron algorithm for predicting numbers of cases based on the mobility data. We finally created scenarios based on the descriptive statistics of the mobility data of these countries and generated predictions on the numbers of cases by using the developed model. Based on the descriptive analysis, we pointed out that while Poland and Turkey had relatively close values and distributions on the study variables, South Korea had more stable data compared to Poland and Turkey. We mainly showed that while the stringency index of the current day was associated with mobility data of the same day, the current day's mobility was associated with the numbers of cases 1 month later. By obtaining 89.3% prediction accuracy, we also concluded that the use of mobility data and implementation of big data analytics techniques may enable decision-making in managing uncertain environments created by outbreak situations. We finally proposed implications for policymakers for deciding on the targeted levels of mobility to maintain numbers of cases in a manageable range based on the results of created scenarios.
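
The delayed association reported here (mobility on a given day versus case numbers about a month later) can be illustrated with a lagged Pearson correlation. The Python sketch below is a deliberately simpler stand-in and is not the multilayer perceptron model the study describes. The function names `pearson` and `lagged_correlation` and the short series are assumptions for illustration; real series would be daily and the lag would be on the order of 30 days.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def lagged_correlation(mobility, cases, lag):
    """Correlate mobility on day t with cases on day t + lag."""
    if len(mobility) != len(cases):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(mobility) - 1:
        raise ValueError("lag must leave at least two paired observations")
    return pearson(mobility[: len(mobility) - lag], cases[lag:])


# Made-up series in which cases echo mobility two steps later.
mobility = [1, 3, 2, 5, 4, 6, 8, 7]
cases = [0, 0] + [2 * m for m in mobility[:-2]]
same_day = lagged_correlation(mobility, cases, 0)
two_steps_later = lagged_correlation(mobility, cases, 2)
```

Scanning `lag` over a range of values and picking the strongest correlation is one simple way to estimate the delay between a change in mobility and its apparent effect on case counts.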


Subject(s)
COVID-19 , Data Science , Government , Humans , Poland/epidemiology , Policy , SARS-CoV-2 , Turkey
16.
Brief Bioinform ; 22(2): 855-872, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1343655

ABSTRACT

MOTIVATION: The outbreak of novel severe acute respiratory syndrome coronavirus (SARS-CoV-2, also known as COVID-19) in Wuhan has attracted worldwide attention. SARS-CoV-2 causes severe inflammation, which can be fatal. Consequently, there has been a massive and rapid growth in research aimed at throwing light on the mechanisms of infection and the progression of the disease. In this regard, data science is playing a pivotal role in in silico analysis to gain insights into SARS-CoV-2 and the outbreak of COVID-19 in order to forecast, diagnose and come up with a drug to tackle the virus. The availability of large multiomics, radiological, bio-molecular and medical datasets requires the development of novel exploratory and predictive models, or the customisation of existing ones in order to fit the current problem. The high number of approaches generates the need for surveys to guide data scientists and medical practitioners in selecting the right tools to manage their clinical data. RESULTS: Focusing on data science methodologies, we conduct a detailed study on the state-of-the-art of works tackling the current pandemic scenario. We consider various current COVID-19 data analytic domains such as phylogenetic analysis, SARS-CoV-2 genome identification, protein structure prediction, host-viral protein interactomics, clinical imaging, epidemiological research and drug discovery. We highlight data types and instances, their generation pipelines and the data science models currently in use. The current study should give a detailed sketch of the road map towards handling COVID-19-like situations by leveraging data science experts in choosing the right tools. We also summarise our review focusing on prime challenges and possible future research directions. CONTACT: hguzzi@unicz.it, sroy01@cus.ac.in.


Subject(s)
Antiviral Agents/therapeutic use , COVID-19/drug therapy , Data Science , Drug Repositioning , COVID-19/pathology , COVID-19/virology , Humans , SARS-CoV-2/isolation & purification
18.
Comput Methods Programs Biomed ; 205: 106083, 2021 Jun.
Article in English | MEDLINE | ID: covidwho-1261871

ABSTRACT

BACKGROUND: After two months of implementing a partial lockdown, the Indonesian government announced the "New Normal" policy to prevent a further economic crash in the country. This policy received much criticism, as Indonesia was still experiencing a fluctuating number of infected cases. Understanding public perception through effective risk communication can assist the government in relaying an appropriate message to improve people's compliance and to avoid further disease spread. OBJECTIVE: This study observed how risk communication using social media platforms like Twitter could be adopted to measure public attention on the COVID-19-related "New Normal" issue. METHOD: From May 21 to June 18, 2020, we archived all tweets related to COVID-19 containing the keywords "#NewNormal" and "New Normal" using the Drone Emprit Academy (DEA) engine. The DEA search API collected all requested tweets and described the cumulative tweets for trend analysis, word segmentation, and word frequency. We further analyzed the public perception using sentiment analysis and identified the predominant tweets using emotion analysis. RESULT: We collected 284,216 tweets from 137,057 active users. From the trend analysis, we observed three stages of the changing trend of the public's attention on the "New Normal". Results from the sentiment analysis indicate that more than half of the population (52%) had a "positive" sentiment towards the "New Normal" issues while only 41% of them had a "negative" perception. Our study also demonstrated that the public's sentiment trend gradually shifted from "negative" to "positive" due to the influence of both the government actions and the spread of the disease. A more detailed emotion analysis showed that the majority of the public emotions (77.6%) fell under "trust", "anticipation", and "joy". Meanwhile, people were also surprised (8.62%) that the Indonesian government progressed to the "New Normal" concept despite a fluctuating number of cases. CONCLUSION: Our findings offer an opportunity for the government to use Twitter in the process of quick decision-making and policy evaluation during uncertain times in response to the COVID-19 pandemic.


Subject(s)
COVID-19 , Social Media , Attention , Communicable Disease Control , Communication , Data Science , Disease Outbreaks , Humans , Indonesia/epidemiology , Pandemics , SARS-CoV-2
19.
PLoS One ; 16(5): e0252147, 2021.
Article in English | MEDLINE | ID: covidwho-1238775

ABSTRACT

BACKGROUND: The WHO announced the epidemic of SARS-CoV-2 as a public health emergency of international concern on 30th January 2020. To date, it has spread to more than 200 countries and has been declared a global pandemic. For appropriate preparedness, containment, and mitigation response, the stakeholders and policymakers require prior guidance on the propagation of SARS-CoV-2. METHODOLOGY: This study aims to provide such guidance by forecasting the cumulative COVID-19 cases up to 4 weeks ahead for 187 countries, using four data-driven methodologies: autoregressive integrated moving average (ARIMA), exponential smoothing model (ETS), and random walk forecasts (RWF) with and without drift. For these forecasts, we evaluate the accuracy and systematic errors using the Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE), respectively. FINDINGS: The results show that the ARIMA and ETS methods outperform the other two forecasting methods. Additionally, using these forecasts, we generate heat maps to provide a pictorial representation of the countries at risk of having an increase in cases in the coming 4 weeks of February 2021. CONCLUSION: Due to limited data availability during the ongoing pandemic, less data-hungry short-term forecasting models, like ARIMA and ETS, can help in anticipating future outbreaks of SARS-CoV-2.
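
Of the four methods named in this abstract, the two random walk forecasts and the two error measures are simple enough to sketch in a few lines of Python. The sketch below is illustrative only and is not the authors' code: the function names and the case counts are hypothetical, the drift is estimated as the average historical step, and ARIMA and ETS are omitted because they need a dedicated statistics library.

```python
def random_walk_forecast(series, horizon, drift=False):
    """Forecast `horizon` steps ahead from the last observation.

    Without drift every forecast equals the last value. With drift the
    average historical step, (last - first) / (n - 1), is added per step.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    last = series[-1]
    step = (series[-1] - series[0]) / (len(series) - 1) if drift else 0.0
    return [last + step * h for h in range(1, horizon + 1)]


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Assumes no zero actuals."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Made-up cumulative case counts: five observed weeks, four held-out weeks.
history = [100, 110, 120, 130, 140]
held_out = [150, 160, 170, 180]

naive = random_walk_forecast(history, horizon=4)
with_drift = random_walk_forecast(history, horizon=4, drift=True)

naive_mae = mae(held_out, naive)
drift_mae = mae(held_out, with_drift)
```

On this made-up, perfectly linear series the drift variant reproduces the held-out values exactly, while the driftless forecast falls further behind each week. Cumulative counts never decrease, which is why a driftless random walk tends to underestimate them.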


Subject(s)
COVID-19/epidemiology , Data Science/methods , Models, Statistical , Data Science/standards , Humans , Practice Guidelines as Topic , Software/standards