Results 1 - 20 of 60
1.
J Cheminform ; 16(1): 19, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38378618

ABSTRACT

The rapid increase of publicly available chemical structures and associated experimental data presents a valuable opportunity to build robust QSAR models for applications in different fields. However, the common concern is the quality of both the chemical structure information and associated experimental data. This is especially true when those data are collected from multiple sources as chemical substance mappings can contain many duplicate structures and molecular inconsistencies. Such issues can impact the resulting molecular descriptors and their mappings to experimental data and, subsequently, the quality of the derived models in terms of accuracy, repeatability, and reliability. Herein we describe the development of an automated workflow to standardize chemical structures according to a set of standard rules and generate two and/or three-dimensional "QSAR-ready" forms prior to the calculation of molecular descriptors. The workflow was designed in the KNIME workflow environment and consists of three high-level steps. First, a structure encoding is read, and then the resulting in-memory representation is cross-referenced with any existing identifiers for consistency. Finally, the structure is standardized using a series of operations including desalting, stripping of stereochemistry (for two-dimensional structures), standardization of tautomers and nitro groups, valence correction, neutralization when possible, and then removal of duplicates. This workflow was initially developed to support collaborative modeling QSAR projects to ensure consistency of the results from the different participants. It was then updated and generalized for other modeling applications. This included modification of the "QSAR-ready" workflow to generate "MS-ready structures" to support the generation of substance mappings and searches for software applications related to non-targeted analysis mass spectrometry. 
Both QSAR and MS-ready workflows are freely available in KNIME, via standalone versions on GitHub, and as docker container resources for the scientific community. Scientific contribution: This work pioneers an automated workflow in KNIME, systematically standardizing chemical structures to ensure their readiness for QSAR modeling and broader scientific applications. By addressing data quality concerns through desalting, stereochemistry stripping, and normalization, it optimizes molecular descriptors' accuracy and reliability. The freely available resources in KNIME, GitHub, and docker containers democratize access, benefiting collaborative research and advancing diverse modeling endeavors in chemistry and mass spectrometry.
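
The standardization steps listed in the abstract above (desalting, stereochemistry stripping, duplicate removal) can be illustrated with a toy, string-level sketch. This is not the authors' KNIME workflow: the function names `qsar_ready` and `deduplicate` are hypothetical, the "largest fragment" rule is a crude stand-in for real desalting, and the sketch omits tautomer and nitro-group standardization, valence correction, neutralization, and canonicalization, all of which need a cheminformatics toolkit.

```python
import re


def qsar_ready(smiles: str) -> str:
    """Toy string-level standardization of a SMILES string (assumption: valid input)."""
    # Crude desalting: keep the longest dot-separated fragment.
    # A real workflow would compare fragments by atom count, not string length.
    fragment = max(smiles.split("."), key=len)
    # Strip stereochemistry marks: tetrahedral (@, @@) and double-bond (/ and \).
    return re.sub(r"[@/\\]", "", fragment)


def deduplicate(smiles_list):
    """Return the distinct standardized forms, in first-seen order."""
    seen = {}
    for smi in smiles_list:
        seen.setdefault(qsar_ready(smi), smi)
    return list(seen)


# Hypothetical inputs: an alanine hydrochloride salt, its enantiomer,
# and the E/Z isomers of 1,2-difluoroethene.
structures = [
    "C[C@H](N)C(=O)O.Cl",
    "C[C@@H](N)C(=O)O",
    "F/C=C/F",
    "F/C=C\\F",
]
print(deduplicate(structures))
```

Because the comparison is purely textual, two different SMILES spellings of the same molecule would not be merged here; that is the main reason a production workflow canonicalizes structures before removing duplicates.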

2.
Stud Health Technol Inform ; 310: 579-583, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269875

ABSTRACT

The reliable identification of skin and soft tissue infections (SSTIs) from electronic health records is important for a number of applications, including quality improvement, clinical guideline construction, and epidemiological analysis. However, in the United States, SSTI subtypes (e.g., whether an infection is purulent or non-purulent) are not captured reliably in structured clinical data. With this work, we trained and evaluated a rule-based clinical natural language processing system using 6,576 manually annotated clinical notes derived from the United States Veterans Health Administration (VA) with the goal of automatically extracting and classifying SSTI subtypes from clinical notes. The trained system achieved performance metrics in the range 0.39 to 0.80 for mention-level classification and 0.49 to 0.98 for document-level classification.


Subject(s)
Soft Tissue Infections , United States , Humans , Soft Tissue Infections/diagnosis , Skin , Benchmarking , Electronic Health Records , Natural Language Processing
3.
Stud Health Technol Inform ; 310: 659-663, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269891

ABSTRACT

Electronic Nicotine Delivery Systems (ENDS) use has increased substantially in the United States since 2010. To date, there is limited evidence regarding the nature and extent of ENDS documentation in the clinical note. In this work we investigate the effectiveness of different approaches to identify a patient's documented ENDS use. We report on the development and validation of a natural language processing system to identify patients with explicit documentation of ENDS using a large national cohort of patients at the United States Department of Veterans Affairs.
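
One simple approach to finding explicit ENDS documentation in a note is keyword matching, sketched below. The abstract does not describe the system's actual rules, so the pattern `ENDS_PATTERN`, the function `find_ends_mentions`, and the example note are all hypothetical. The sketch only locates mentions; it does not handle negation ("denies vaping") or determine use status.

```python
import re

# Hypothetical keyword list; a validated system would use a curated lexicon.
ENDS_PATTERN = re.compile(
    r"\b(e-?cig(?:arette)?s?|vap(?:e|es|ing)|juul)\b",
    re.IGNORECASE,
)


def find_ends_mentions(note: str):
    """Return (character offset, matched text) for each ENDS keyword in a note."""
    return [(m.start(), m.group(0)) for m in ENDS_PATTERN.finditer(note)]


# Invented example text, not drawn from any clinical record.
note = "Pt reports vaping daily; quit e-cigarettes in 2019. Denies tobacco."
for offset, text in find_ends_mentions(note):
    print(offset, text)
```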


Subject(s)
Electronic Nicotine Delivery Systems , Vaping , United States , Humans , Natural Language Processing , Documentation , United States Department of Veterans Affairs
4.
JMIR Form Res ; 7: e49325, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37676723

ABSTRACT

BACKGROUND: In most countries, men are more likely to die by suicide than women. Adherence to dominant masculine norms, such as being self-reliant, is linked to suicide in men in Western cultures. We created a suicide prevention media campaign, "Boys Do Cry," designed to challenge the "self-reliance" norm and encourage help-seeking in men. A music video was at the core of the campaign, which was an adapted version of the "Boys Don't Cry" song from "The Cure." There is evidence that suicide prevention media campaigns can encourage help-seeking for mental health difficulties. OBJECTIVE: We aimed to explore the reach, engagement, and themes of discussion prompted by the Boys Do Cry campaign on Twitter. METHODS: We used Twitter analytics data to investigate the reach and engagement of the Boys Do Cry campaign, including analyzing the characteristics of tweets posted by the campaign's hosts. Throughout the campaign and immediately after, we also used Twitter data derived from the Twitter Application Programming Interface to analyze the tweeting patterns of users related to the campaign. In addition, we qualitatively analyzed the content of Boys Do Cry-related tweets during the campaign period. RESULTS: During the campaign, Twitter users saw the tweets posted by the hosts of the campaign a total of 140,650 times and engaged with its content a total of 4477 times. The 10 highest-performing tweets by the campaign hosts involved either a video or an image. Among the 10 highest-performing tweets, the first was one that included the campaign's core video; the second was a screenshot of the tweet posted by Robert Smith, the lead singer of The Cure, sharing the Boys Do Cry campaign's video and tagging the campaign's hosts. 
In addition, the pattern of Twitter activity for the campaign-related tweets was considerably higher during the campaign than in the immediate postcampaign period, with half of the activity occurring during the first week of the campaign when Robert Smith promoted the campaign. Some of the key topics of discussions prompted by the Boys Do Cry campaign on Twitter involved users supporting the campaign; referencing the original song, band, or lead singer; reiterating the campaign's messages; and having emotional responses to the campaign. CONCLUSIONS: This study demonstrates that a brief media campaign such as Boys Do Cry can achieve good reach and engagement and can prompt discussions on Twitter about masculinity and suicide. Such discussions may lead to greater awareness about the importance of seeking help and providing support to those with mental health difficulties. However, this study suggests that longer, more intensive campaigns may be needed in order to amplify and sustain these results.

5.
Nucleic Acids Res ; 51(W1): W78-W82, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37194699

ABSTRACT

Access to computationally based visualization tools to navigate chemical space has become more important due to the increasing size and diversity of publicly accessible databases, associated compendiums of high-throughput screening (HTS) results, and other descriptor and effects data. However, application of these techniques requires advanced programming skills that are beyond the capabilities of many stakeholders. Here we report the development of the second version of the ChemMaps.com webserver (https://sandbox.ntp.niehs.nih.gov/chemmaps/) focused on environmental chemical space. The chemical space of ChemMaps.com v2.0, released in 2022, now includes approximately one million environmental chemicals from the EPA Distributed Structure-Searchable Toxicity (DSSTox) inventory. ChemMaps.com v2.0 incorporates mapping of HTS assay data from the U.S. federal Tox21 research collaboration program, which includes results from around 2000 assays tested on up to 10 000 chemicals. As a case example, we showcased chemical space navigation for Perfluorooctanoic Acid (PFOA), part of the Per- and polyfluoroalkyl substances (PFAS) chemical family, which are of significant concern for their potential effects on human health and the environment.


Subject(s)
Databases, Chemical , High-Throughput Screening Assays , Software , Environment
6.
JCO Clin Cancer Inform ; 7: e2200131, 2023 01.
Article in English | MEDLINE | ID: mdl-36753686

ABSTRACT

PURPOSE: Histopathologic features are critical for studying risk factors of colorectal polyps, but remain deeply embedded within unstructured pathology reports, requiring costly and time-consuming manual abstraction for research. In this study, we developed and evaluated a natural language processing (NLP) pipeline to automatically extract histopathologic features of colorectal polyps from pathology reports, with an emphasis on individual polyp size. These data were then linked with structured electronic health record (EHR) data, creating an analysis-ready epidemiologic data set. METHODS: We obtained 24,584 pathology reports from colonoscopies performed at the University of Utah's Gastroenterology Clinic. Two investigators annotated 350 reports to determine inter-rater agreement, develop an annotation scheme, and create a reference standard for performance evaluation. The pipeline was then developed, and performance was compared against the reference for extracting polyp location, histology, size, shape, dysplasia, and the number of polyps. Finally, the pipeline was applied to 24,225 unseen reports and NLP-extracted data were linked with structured EHR data. RESULTS: Across all features, our pipeline achieved a precision of 98.9%, a recall of 98.0%, and an F1-score of 98.4%. In patients with polyps, the pipeline correctly extracted 95.6% of sizes, 97.2% of polyp locations, 97.8% of histology, 98.3% of shapes, and 98.3% of dysplasia levels. When applied to unseen data, the pipeline classified 12,889 patients as having polyps, 4,907 patients without polyps, and extracted the features of 28,387 polyps. Tubular adenomas were the most common subtype (55.9%), 8.1% of polyps were advanced adenomas, and the mean polyp size was 0.57 (±0.4) cm. CONCLUSION: Our pipeline extracted histopathologic features of colorectal polyps from colonoscopy pathology reports, most notably individual polyp sizes, with considerable accuracy. 
This study demonstrates the utility of NLP for extracting polyp features and linking these data with EHR data to create an epidemiologic data set to study colorectal polyp risk factors and outcomes.
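
As a consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The short sketch below (the function name `f1_score` is ours, not the authors') applies that formula to the overall precision and recall quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision (98.9%) and recall (98.0%) as reported in the abstract.
f1 = f1_score(0.989, 0.980)
print(f"{f1:.3f}")  # prints 0.984
```

The result agrees with the reported F1-score of 98.4%.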


Subject(s)
Adenoma , Colonic Polyps , Colorectal Neoplasms , Humans , Colonic Polyps/diagnosis , Colonic Polyps/epidemiology , Colonic Polyps/pathology , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/epidemiology , Colorectal Neoplasms/pathology , Natural Language Processing , Adenoma/diagnosis , Adenoma/epidemiology , Adenoma/pathology , Epidemiologic Studies , Hyperplasia
7.
J Med Internet Res ; 25: e36667, 2023 02 27.
Article in English | MEDLINE | ID: mdl-36848191

ABSTRACT

BACKGROUND: The use and acceptance of medicinal cannabis are on the rise across the globe. To support the interests of public health, evidence relating to its use, effects, and safety is required to match this community demand. Web-based user-generated data are often used by researchers and public health organizations for the investigation of consumer perceptions, market forces, population behaviors, and pharmacoepidemiology. OBJECTIVE: In this review, we aimed to summarize the findings of studies that have used user-generated text as a data source to study medicinal cannabis or the use of cannabis as medicine. Our objectives were to categorize the insights provided by social media research on cannabis as medicine and describe the role of social media for consumers using medicinal cannabis. METHODS: The inclusion criteria for this review were primary research studies and reviews that reported on the analysis of web-based user-generated content on cannabis as medicine. The MEDLINE, Scopus, Web of Science, and Embase databases were searched from January 1974 to April 2022. RESULTS: We examined 42 studies published in English and found that consumers value their ability to exchange experiences on the web and tend to rely on web-based information sources. Cannabis discussions have portrayed the substance as a safe and natural medicine to help with many health conditions including cancer, sleep disorders, chronic pain, opioid use disorders, headaches, asthma, bowel disease, anxiety, depression, and posttraumatic stress disorder. These discussions provide a rich resource for researchers to investigate medicinal cannabis-related consumer sentiment and experiences, including the opportunity to monitor cannabis effects and adverse events, provided that the anecdotal and often biased nature of the information is properly accounted for.
CONCLUSIONS: The extensive web-based presence of the cannabis industry coupled with the conversational nature of social media discourse results in rich but potentially biased information that is often not well-supported by scientific evidence. This review summarizes what social media is saying about the medicinal use of cannabis and discusses the challenges faced by health governance agencies and professionals to make use of web-based resources to both learn from medicinal cannabis users and provide factual, timely, and reliable evidence-based health information to consumers.


Subject(s)
Cannabis , Medical Marijuana , Social Media , Humans , Medical Marijuana/therapeutic use , Public Opinion , Public Health
8.
PLoS One ; 18(1): e0269143, 2023.
Article in English | MEDLINE | ID: mdl-36662832

ABSTRACT

The use of cannabis for medicinal purposes has increased globally over the past decade since patient access to medicinal cannabis has been legislated across jurisdictions in Europe, the United Kingdom, the United States, Canada, and Australia. Yet, evidence relating to the effect of medical cannabis on the management of symptoms for a suite of conditions is only just emerging. Although there is considerable engagement from many stakeholders to add to the evidence base through randomized controlled trials, many gaps in the literature remain. Data from real-world and patient-reported sources can provide opportunities to address this evidence deficit. This real-world data can be captured from a variety of sources, such as routinely collected health care and health services records, including but not limited to patient-generated data from medical, administrative, and claims data, as well as patient-reported data from surveys, wearable trackers, patient registries, and social media. In this systematic scoping review, we seek to understand the utility of online user-generated text for studying the use of cannabis as a medicine. We aimed to systematically search published literature to examine the extent, range, and nature of research that utilises user-generated content to examine cannabis as a medicine. The objective of this methodological review is to synthesise primary research that uses social media discourse and internet search engine queries to answer the following questions: (i) In what way is online user-generated text used as a data source in the investigation of cannabis as a medicine? (ii) What are the aims, data sources, methods, and research themes of studies using online user-generated text to discuss the medicinal use of cannabis? We conducted a manual search of primary research studies which used online user-generated text as a data source using the MEDLINE, Embase, Web of Science, and Scopus databases in October 2022.
Editorials, letters, commentaries, surveys, protocols, and book chapters were excluded from the review. Forty-two studies were included in this review: twenty-two used manually labelled data, four used existing meta-data (Google Trends/geo-location data), two used data that was manually coded using crowdsourcing services, two used automated coding supplied by a social media analytics company, and fifteen used computational methods for annotating data. Our review reflects a growing interest in the use of user-generated content for public health surveillance. It also demonstrates the need for the development of a systematic approach for evaluating the quality of social media studies and highlights the utility of automatic processing and computational methods (machine learning technologies) for large social media datasets. This systematic scoping review has shown that user-generated content as a data source for studying cannabis as a medicine provides another means to understand how cannabis is perceived and used in the community. As such, it provides another potential 'tool' with which to engage in pharmacovigilance of not only cannabis as a medicine but also other novel therapeutics as they enter the market.


Subject(s)
Cannabis , Medicine , Social Media , Humans , Delivery of Health Care , United Kingdom
9.
J Med Internet Res ; 24(11): e35974, 2022 11 16.
Article in English | MEDLINE | ID: mdl-36383417

ABSTRACT

BACKGROUND: Medicinal cannabis is increasingly being used for a variety of physical and mental health conditions. Social media and web-based health platforms provide valuable, real-time, and cost-effective surveillance resources for gleaning insights regarding individuals who use cannabis for medicinal purposes. This is particularly important considering that the evidence for the optimal use of medicinal cannabis is still emerging. Despite the web-based marketing of medicinal cannabis to consumers, currently, there is no robust regulatory framework to measure clinical health benefits or individual experiences of adverse events. In a previous study, we conducted a systematic scoping review of studies that contained themes of the medicinal use of cannabis and used data from social media and search engine results. This study analyzed the methodological approaches and limitations of these studies. OBJECTIVE: We aimed to examine research approaches and study methodologies that use web-based user-generated text to study the use of cannabis as a medicine. METHODS: We searched MEDLINE, Scopus, Web of Science, and Embase databases for primary studies in the English language from January 1974 to April 2022. Studies were included if they aimed to understand web-based user-generated text related to health conditions where cannabis is used as a medicine or where health was mentioned in general cannabis-related conversations. RESULTS: We included 42 articles in this review. In these articles, Twitter was used 3 times more than other computer-generated sources, including Reddit, web-based forums, GoFundMe, YouTube, and Google Trends. Analytical methods included sentiment assessment, thematic analysis (manual and automatic), social network analysis, and geographic analysis. CONCLUSIONS: This study is the first to review techniques used by research on consumer-generated text for understanding cannabis as a medicine. 
It is increasingly evident that consumer-generated data offer opportunities for a greater understanding of individual behavior and population health outcomes. However, research using these data has some limitations that include difficulties in establishing sample representativeness and a lack of methodological best practices. To address these limitations, deidentified annotated data sources should be made publicly available, researchers should determine the origins of posts (organizations, bots, power users, or ordinary individuals), and powerful analytical techniques should be used.


Subject(s)
Cannabis , Medical Marijuana , Medicine , Mental Disorders , Social Media , Humans , Medical Marijuana/therapeutic use
10.
JMIR Infodemiology ; 2(2): e36941, 2022.
Article in English | MEDLINE | ID: mdl-36196144

ABSTRACT

Background: Since COVID-19 was declared a pandemic by the World Health Organization on March 11, 2020, the disease has had an unprecedented impact worldwide. Social media such as Reddit can serve as a resource for enhancing situational awareness, particularly regarding monitoring public attitudes and behavior during the crisis. Insights gained can then be utilized to better understand public attitudes and behaviors during the COVID-19 crisis, and to support communication and health-promotion messaging. Objective: The aim of this study was to compare public attitudes toward the 2020-2021 COVID-19 pandemic across four predominantly English-speaking countries (the United States, the United Kingdom, Canada, and Australia) using data derived from the social media platform Reddit. Methods: We utilized a topic modeling natural language processing method (more specifically latent Dirichlet allocation). Topic modeling is a popular unsupervised learning technique that can be used to automatically infer topics (ie, semantically related categories) from a large corpus of text. We derived our data from six country-specific, COVID-19-related subreddits (r/CoronavirusAustralia, r/CoronavirusDownunder, r/CoronavirusCanada, r/CanadaCoronavirus, r/CoronavirusUK, and r/coronavirusus). We used topic modeling methods to investigate and compare topics of concern for each country. Results: Our consolidated Reddit data set consisted of 84,229 initiating posts and 1,094,853 associated comments collected between February and November 2020 for the United States, the United Kingdom, Canada, and Australia. The volume of posting in COVID-19-related subreddits declined consistently across all four countries during the study period (February 2020 to November 2020). During lockdown events, the volume of posts peaked. The UK and Australian subreddits contained much more evidence-based policy discussion than the US or Canadian subreddits. 
Conclusions: This study provides evidence to support the contention that there are key differences between salient topics discussed across the four countries on the Reddit platform. Further, our approach indicates that Reddit data have the potential to provide insights not readily apparent in survey-based approaches.
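
For readers unfamiliar with latent Dirichlet allocation, the following is a minimal sketch of one common way to fit it, collapsed Gibbs sampling, on a tiny invented corpus. The abstract does not say which inference method, library, or hyperparameters the authors used; the function `lda_gibbs`, the values of `alpha` and `beta`, and the toy documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Report the three most frequent words in each topic.
    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -topic_word[t][v])[:3]]
        for t in range(n_topics)
    ]
    return doc_topic, top_words


# Invented toy corpus; real input would be tokenized Reddit posts and comments.
docs = [
    ["lockdown", "mask", "lockdown", "school"],
    ["vaccine", "trial", "vaccine", "dose"],
    ["mask", "lockdown", "school", "mask"],
    ["dose", "vaccine", "trial", "dose"],
]
doc_topic, top_words = lda_gibbs(docs, n_topics=2)
for t, words in enumerate(top_words):
    print(f"topic {t}: {words}")
```

Topic labels are arbitrary and the sampler is stochastic, so the words assigned to each topic can vary with the seed; comparing countries, as in the study, would mean fitting or inspecting topics per subreddit group.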

12.
BMC Pediatr ; 22(1): 167, 2022 03 31.
Article in English | MEDLINE | ID: mdl-35361157

ABSTRACT

BACKGROUND & OBJECTIVES: This study aims to explore and elucidate parents' experience of newborn screening [NBS], with the overarching goal of identifying desiderata for the development of informatics-based educational and health management resources. METHODS: We conducted four focus groups and four one-on-one qualitative interviews with a total of 35 participants between March and September 2020. Participants were grouped into three types: parents who had received true positive newborn screening results; parents who had received false positive results; and soon-to-be parents who had no direct experience of the screening process. Interview data were subjected to analysis using an inductive, constant comparison approach. RESULTS: Results are divided into five sections: (1) experiences related to the process of receiving NBS results and prior knowledge of the NBS program; (2) approaches to the management of a child's medical data; (3) sources of additional informational and emotional support; (4) barriers faced by parents navigating the health system; and (5) recommendations and suggestions for new parents experiencing the NBS process. CONCLUSION: Our analysis revealed a wide range of experiences of, and attitudes towards the newborn screening program and the wider newborn screening system. While parents' view of the screening process was - on the whole - positive, some participants reported experiencing substantial frustration, particularly related to how results are initially communicated and difficulties in accessing reliable, timely information. This frustration with current information management and education resources indicates a role for informatics-based approaches in addressing parents' information needs.


Subject(s)
Neonatal Screening , Parents , Child , Focus Groups , Humans , Infant, Newborn , Neonatal Screening/psychology , Pain , Parents/psychology , Qualitative Research
13.
J Sch Nurs ; 38(1): 74-83, 2022 Feb.
Article in English | MEDLINE | ID: mdl-33944636

ABSTRACT

School nurses are the most accessible health care providers for many young people including adolescents and young adults. Early identification of depression results in improved outcomes, but little information is available comprehensively describing depressive symptoms specific to this population. The aim of this study was to develop a taxonomy of depressive symptoms that were manifested and described by young people based on a scoping review and content analysis. Twenty-five journal articles that included narrative descriptions of depressive symptoms in young people were included. A total of 60 depressive symptoms were identified and categorized into five dimensions: behavioral (n = 8), cognitive (n = 14), emotional (n = 15), interpersonal (n = 13), and somatic (n = 10). This comprehensive depression symptom taxonomy can help school nurses to identify young people who may experience depression and will support future research to better screen for depression.


Subject(s)
Depression , Adolescent , Humans , Young Adult
14.
Drug Alcohol Depend Rep ; 3: 100061, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36845987

ABSTRACT

Background: Stigma associated with substance use can have severe negative consequences for physical and mental health and serve as a barrier to treatment. Yet, research on stigma processes and stigma reduction interventions is limited. Aim: We use a social media dataset to examine: 1) the nature of stigma-related experience related to substance use; and 2) salient affective and temporal factors in the use of three substances: alcohol, cannabis, and opioids. Methods: We harvested several years of data pertaining to three substances - alcohol, cannabis, and opioids - from Reddit, a popular social networking platform. For Part I, we selected posts based on stigma-related keywords, performed content analysis, and rendered word clouds to examine the nature of stigma associated with these substances. In Part II, we employed natural language processing in conjunction with hierarchical clustering and visualization to explore temporal and affective factors. Results: In Part I, internalized stigma was most commonly exhibited. Anticipated and enacted stigma were less common in posts relating to cannabis compared to the other two substances. Work, home, and school were important contexts in which stigma was observed. Part II showed that temporal markers were prominent; post authors shared stories of substance use journeys, and timelines of their experience with quitting and withdrawals. Shame, sadness, anxiety, and fear were common, with shame being more prominent in alcohol-related posts. Conclusion: Our findings highlight the importance of contextual factors in substance use recovery and stigma reduction and offer directions for future interventions.

15.
Drug Alcohol Depend ; 228: 109016, 2021 11 01.
Article in English | MEDLINE | ID: mdl-34560332

ABSTRACT

INTRODUCTION: The relationship between cannabis, tobacco, and vaping devices is both rapidly changing and poorly understood, with consumers rapidly shifting between use of all three product types. Given this dynamic and evolving landscape, there is an urgent need to monitor and better understand co-use, dual-use, and transition patterns between these products. This study describes work that utilizes social media - in this case, Reddit - in conjunction with automated Natural Language Processing (NLP) methods to better understand cannabis, tobacco, and vaping device product usage patterns. METHODS: We collected Reddit data from the period 2013-2018, sourced from eight popular, high-volume Reddit communities (subreddits) related to the three product categories. We then manually annotated (coded) a set of 2640 Reddit posts and trained a machine learning-based NLP algorithm to automatically identify and disambiguate between cannabis or tobacco mentions (both smoking and vaping) in Reddit posts. This classifier was then applied to all data derived from the eight subreddits, 767,788 posts in total. RESULTS: The NLP algorithm achieved an overall moderate performance (overall F-score of 0.77). When applied to our large corpus of Reddit posts, we discovered that over 10% of posts in the smoking cessation subreddit r/stopsmoking were classified as referring to vaping nicotine, and that only 2% of posts from the subreddits r/electronic_cigarette and r/vaping were classified as referring to smoking (tobacco) cessation. CONCLUSIONS: This study presents the results of applying an NLP algorithm designed to identify and distinguish between cannabis and tobacco mentions (both smoking and vaping) in Reddit posts, hence contributing to our currently limited understanding of co-use, dual-use, and transition patterns between these products.
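
The classification step described above (train on manually annotated posts, then label the remaining corpus) can be sketched with a small multinomial Naive Bayes classifier. The abstract does not name the algorithm the authors trained, so Naive Bayes is a stand-in chosen for brevity; the class `NaiveBayes`, the labels, and the four training posts are invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for w in text.lower().split():
                if w in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][w]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training posts standing in for the manually annotated set.
train_texts = [
    "rolled a joint with some fresh bud",
    "this indica strain hits hard",
    "smoked a pack of cigarettes today",
    "quit cigarettes and nicotine cravings are rough",
]
train_labels = ["cannabis", "cannabis", "tobacco", "tobacco"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("new strain of bud"))
print(model.predict("cigarettes and nicotine"))
```

In practice the annotated posts would be split into training and held-out sets so that an F-score like the one reported can be estimated on unseen data.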


Subject(s)
Cannabis , Electronic Nicotine Delivery Systems , Social Media , Tobacco Products , Vaping , Humans , Natural Language Processing , Prevalence , Nicotiana
16.
AMIA Annu Symp Proc ; 2021: 343-351, 2021.
Article in English | MEDLINE | ID: mdl-35308940

ABSTRACT

Use of Electronic Nicotine Delivery Systems (ENDS, colloquially known as "electronic cigarettes") has increased substantially in the United States in the decade since 2010. However, currently relatively little is known regarding the documentation of ENDS use in clinical notes. With this study, we describe the development of an annotation scheme (and associated annotated corpus) consisting of 4,351 ENDS mentions derived from Department of Veterans Affairs clinical notes during the period 2010-2020. Analysis of our corpus provides important insights into ENDS documentation practices at the VA, in addition to providing a resource for the future development and validation of Natural Language Processing algorithms capable of reliably identifying ENDS-use status.


Subject(s)
Electronic Nicotine Delivery Systems , Vaping , Veterans , Documentation , Humans , Natural Language Processing , United States
17.
Front Public Health ; 9: 738513, 2021.
Article in English | MEDLINE | ID: mdl-35071153

ABSTRACT

Background: Perceptions of tobacco, cannabis, and electronic nicotine delivery systems (ENDS) are continually evolving in the United States. Exploring these characteristics through user generated text sources may provide novel insights into product use behavior that are challenging to identify using survey-based methods. The objective of this study was to compare the topics frequently discussed among Reddit members in cannabis, tobacco, and ENDS-specific subreddits. Methods: We collected 643,070 posts on the social media site Reddit between January 2013 and December 2018. We developed and validated an annotation scheme, achieving a high level of agreement among annotators. We then manually coded a subset of 2,630 posts for their content with relation to experiences and use of the three products of interest, and further developed word cloud representations of the words contained in these posts. Finally, we applied Latent Dirichlet Allocation (LDA) topic modeling to the 643,070 posts to identify emerging themes related to cannabis, tobacco, and ENDS products being discussed on Reddit. Results: Our manual annotation process yielded 2,148 (81.6%) posts that contained a mention(s) of either cannabis, tobacco, or ENDS with 1,537 (71.5%) of these posts mentioning cannabis, 421 (19.5%) mentioning ENDS, and 264 (12.2%) mentioning tobacco. In cannabis-specific subreddits, personal experiences with cannabis, cannabis legislation, health effects of cannabis use, methods and forms of cannabis, and the cultivation of cannabis were commonly discussed topics. The discussion in tobacco-specific subreddits often focused on the discussion of brands and types of combustible tobacco, as well as smoking cessation experiences and advice. In ENDS-specific subreddits, topics often included ENDS accessories and parts, flavors and nicotine solutions, procurement of ENDS, and the use of ENDS for smoking cessation. 
Conclusion: Our findings highlight the posting and participation patterns of Reddit members in cannabis, tobacco, and ENDS-specific subreddits and provide novel insights into aspects of personal use regarding these products. These findings complement epidemiologic study designs and highlight the potential of using specific subreddits to explore personal experiences with cannabis, ENDS, and tobacco products.
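
The word cloud step described in the Methods rests on per-term counts over the annotated posts. The following is a minimal sketch of such counting, not the authors' pipeline: the stopword list, the tokenization rule, and the example posts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; the study's actual preprocessing is not described.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "i", "my", "for", "is", "it", "in", "on"}


def word_frequencies(posts):
    """Count lowercase word tokens across all posts, skipping stopwords."""
    counts = Counter()
    for post in posts:
        # Assumed tokenization: runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


# Invented example posts, for illustration only (not data from the study).
posts = [
    "Switched to a vape to quit smoking",
    "Which vape juice flavors do you like?",
    "Quit smoking a month ago",
]
freqs = word_frequencies(posts)
top_terms = freqs.most_common(3)  # the terms a word cloud would draw largest
```

A word cloud renderer would then scale each term by its count; the rendering itself is outside this sketch.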


Subject(s)
Cannabis , Tobacco Products , Vaping , Humans , Natural Language Processing , Nicotiana , United States
18.
JMIR Public Health Surveill ; 6(3): e19975, 2020 09 02.
Article in English | MEDLINE | ID: mdl-32876579

ABSTRACT

BACKGROUND: Increases in electronic nicotine delivery system (ENDS) use among high school students from 2017 to 2019 appear to be associated with the increasing popularity of the ENDS device JUUL. OBJECTIVE: We employed a content analysis approach in conjunction with natural language processing methods using Twitter data to understand salient themes regarding JUUL use on Twitter, sentiment towards JUUL, and underage JUUL use. METHODS: Between July 2018 and August 2019, 11,556 unique tweets containing a JUUL-related keyword were collected. We manually annotated 4000 tweets for JUUL-related themes of use and sentiment. We used 3 machine learning algorithms to classify positive and negative JUUL sentiments as well as underage JUUL mentions. RESULTS: Of the annotated tweets, 78.80% (3152/4000) contained a specific mention of JUUL. Only 1.43% (45/3152) of tweets mentioned using JUUL as a method of smoking cessation, and only 6.85% (216/3152) of tweets mentioned the potential health effects of JUUL use. Of the machine learning methods used, the random forest classifier was the best-performing algorithm across all 3 classification tasks (ie, positive sentiment, negative sentiment, and underage JUUL mentions). CONCLUSIONS: Our findings suggest that the vast majority of Twitter users are not using JUUL to aid in smoking cessation, nor do they mention the potential health benefits or detriments of JUUL use. Using machine learning algorithms to identify tweets containing underage JUUL mentions can support the timely surveillance of JUUL habits and opinions, further assisting youth-targeted public health intervention strategies.
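
The proportions reported in the Results can be recomputed directly from the counts given in the abstract. The sketch below only redoes that arithmetic; the dictionary labels and the helper name `percent` are illustrative choices, not terms from the study.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to 2 decimals."""
    return round(100 * numerator / denominator, 2)


# (count, denominator, percentage as reported in the abstract)
reported = {
    "specific JUUL mention": (3152, 4000, 78.80),
    "smoking cessation": (45, 3152, 1.43),
    "health effects": (216, 3152, 6.85),
}

# Recompute each percentage from its count and denominator.
recomputed = {label: percent(n, d) for label, (n, d, _) in reported.items()}
```

Note that the second and third percentages use the 3152 tweets with a specific JUUL mention as the denominator, not the 4000 annotated tweets.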


Subject(s)
Adolescent Behavior/psychology , Electronic Nicotine Delivery Systems/standards , Social Media/instrumentation , Adolescent , Electronic Nicotine Delivery Systems/statistics & numerical data , Female , Humans , Machine Learning/statistics & numerical data , Male , Natural Language Processing , Social Media/statistics & numerical data
19.
Nucleic Acids Res ; 48(W1): W586-W590, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32421835

ABSTRACT

High-throughput screening (HTS) research programs for drug development or chemical hazard assessment are designed to screen thousands of molecules across hundreds of biological targets or pathways. Most HTS platforms use fluorescence and luminescence technologies, which represent more than 70% of the assays in the US Tox21 research consortium. These technologies are subject to interfering signals, largely explained by chemicals interacting with the light spectrum. This phenomenon results in up to 5-10% false positive results, depending on the chemical library used. Here, we present the InterPred webserver (version 1.0), a platform to predict such interference chemicals based on the first large-scale chemical screening effort to directly characterize chemical-assay interference, using assays in the Tox21 portfolio specifically designed to measure autofluorescence and luciferase inhibition. InterPred combines 17 quantitative structure-activity relationship (QSAR) models built using optimized machine learning techniques and allows users to predict the probability that a new chemical will interfere with different combinations of cellular and technology conditions. InterPred models have been applied to the entire Distributed Structure-Searchable Toxicity (DSSTox) Database (∼800,000 chemicals). The InterPred webserver is available at https://sandbox.ntp.niehs.nih.gov/interferences/.


Subject(s)
High-Throughput Screening Assays , Software , Artifacts , Fluorescence , Internet , Machine Learning , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship , Workflow
20.
Int J Med Inform ; 139: 104122, 2020 07.
Article in English | MEDLINE | ID: mdl-32339929

ABSTRACT

BACKGROUND: In ambulatory care settings, physicians largely rely on clinical guidelines and guideline-based clinical decision support (CDS) systems to make decisions on hypertension treatment. However, current clinical evidence, which is the knowledge base of clinical guidelines, is insufficient to support definitive optimal treatment. OBJECTIVE: The goal of this study is to test the feasibility of using deep learning predictive models to identify optimal hypertension treatment pathways for individual patients, based on empirical data available from an electronic health record database. MATERIALS AND METHODS: This study used data on 245,499 unique patients who were initially diagnosed with essential hypertension and received anti-hypertensive treatment from January 1, 2001 to December 31, 2010 in ambulatory care settings. We used recurrent neural networks (RNN), including long short-term memory (LSTM) and bi-directional LSTM, to create risk-adapted models to predict the probability of reaching the BP control targets associated with different BP treatment regimens. The data were split into training, validation, and test sets at a ratio of 6:2:2. The samples for each set were independently and randomly drawn from individual years in the corresponding proportions. RESULTS: The LSTM models achieved high accuracy when predicting the individual probability of reaching BP goals on different treatments: for systolic BP (<140 mmHg), diastolic BP (<90 mmHg), and both systolic BP and diastolic BP (<140/90 mmHg), F1-scores were 0.928, 0.960, and 0.913, respectively. CONCLUSIONS: The results demonstrated the potential of using predictive models to select optimal hypertension treatment pathways. Along with clinical guidelines and guideline-based CDS systems, the LSTM models could be used as a powerful decision-support tool to form risk-adapted, personalized strategies for hypertension treatment plans, especially for difficult-to-treat patients.
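
One plausible reading of the 6:2:2 split drawn from individual years is a per-year stratified shuffle, sketched below. This is an interpretation under stated assumptions, not the authors' code: the function name `split_by_year`, the `(patient_id, year)` record layout, the fixed seed, and the synthetic records are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_year(records, ratios=(6, 2, 2), seed=0):
    """Split (patient_id, year) records into train/val/test within each year.

    Each year's records are shuffled and divided according to `ratios`,
    so every set keeps the same year mix as the full data.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = sum(ratios)
    by_year = defaultdict(list)
    for record in records:
        by_year[record[1]].append(record)

    train, val, test = [], [], []
    for year in sorted(by_year):
        group = by_year[year][:]
        rng.shuffle(group)
        n_train = len(group) * ratios[0] // total
        n_val = len(group) * ratios[1] // total
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Synthetic records for illustration: 10 hypothetical patients per year, 2001-2010.
records = [(f"p{year}-{i}", year) for year in range(2001, 2011) for i in range(10)]
train, val, test = split_by_year(records)
```

With 10 records per year this yields 6, 2, and 2 records per year in the three sets; when a year's count is not divisible by 10, integer division leaves the remainder in the test set, which is a design choice of this sketch.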


Subject(s)
Antihypertensive Agents/therapeutic use , Blood Pressure/drug effects , Hypertension/diagnosis , Neural Networks, Computer , Patient Care Planning/standards , Practice Guidelines as Topic/standards , Blood Pressure Determination , Databases, Factual , Electronic Health Records , Feasibility Studies , Humans , Hypertension/drug therapy , Monitoring, Physiologic