Results 1 - 20 of 82
1.
JMIR Pediatr Parent ; 7: e56919, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38809591

ABSTRACT

BACKGROUND: Social media have shown the potential to support type 1 diabetes self-management by providing informational, emotional, and peer-to-peer support. However, the perceptions of young people and health care professionals (HCPs) toward the use of social media for type 1 diabetes self-management have not been systematically reviewed. OBJECTIVE: The aim of this study is to explore and summarize the experiences and views of young people with type 1 diabetes and their HCPs on using social media for self-management across qualitative findings. METHODS: We searched MEDLINE, Embase, PsycINFO, and CINAHL from 2012 to 2023 using Medical Subject Heading terms and text words related to type 1 diabetes and social media. We screened and selected the studies according to the inclusion and exclusion criteria. We quality appraised and characterized the included studies and conducted a thematic synthesis. RESULTS: We included 11 studies in our synthesis. A total of 9 of them were qualitative and 2 were mixed methods studies. Ten focused on young people with type 1 diabetes and 1 on HCPs. All used content analysis and were of moderate to high quality. Thirteen descriptive themes were yielded by our thematic synthesis, contributing to five analytic themes: (1) differences in how young people interact with social media, (2) characteristics of social media platforms that influence their use and uptake for type 1 diabetes self-management, (3) social media as a source of information, (4) impact on young people's coping and emotional well-being, and (5) impact on support from and relationships with HCPs and services. CONCLUSIONS: The synthesis suggests that we should consider leveraging social media's peer support capabilities to augment the traditional services for young people with type 1 diabetes. However, the patients may have privacy concerns about HCPs' involvement in their online activities.
This warrants an update of existing guidelines to help young people use social media safely for self-managing their diabetes.

2.
Ann Rheum Dis ; 83(8): 1082-1091, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38575324

ABSTRACT

INTRODUCTION: At the beginning of the COVID-19 pandemic, the UK's Scientific Committee issued extreme social distancing measures, termed 'shielding', aimed at a subpopulation deemed extremely clinically vulnerable to infection. National guidance for risk stratification was based on patients' age, comorbidities and immunosuppressive therapies, including biologics that are not captured in primary care records. This process required considerable clinician time to manually review outpatient letters. Our aim was to develop and evaluate an automated shielding algorithm by text-mining outpatient letter diagnoses and medications, reducing the need for future manual review. METHODS: Rheumatology outpatient letters from a large UK foundation trust were retrieved. Free-text diagnoses were processed using Intelligent Medical Objects software (Concept Tagger), which used interface terminology for each condition mapped to Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) codes. We developed the Medication Concept Recognition tool (Named Entity Recognition) to retrieve medications' type, dose, duration and status (active/past) at the time of the letter. Age, diagnosis and medication variables were then combined to calculate a shielding score based on the most recent letter. The algorithm's performance was evaluated using clinical review as the gold standard. The time taken to deploy the developed algorithm on a larger patient subset was measured. RESULTS: In total, 5942 free-text diagnoses were extracted and mapped to SNOMED-CT, with 13 665 free-text medications (n=803 patients). The automated algorithm demonstrated a sensitivity of 80% (95% CI: 75%, 85%) and specificity of 92% (95% CI: 90%, 94%). Positive likelihood ratio was 10 (95% CI: 8, 14), negative likelihood ratio was 0.21 (95% CI: 0.16, 0.28) and F1 score was 0.81. Evaluation of mismatches revealed that the algorithm performed correctly against the gold standard in most cases.
The developed algorithm was then deployed on records from an additional 15 865 patients, which took 18 hours for data extraction and 1 hour to deploy. DISCUSSION: An automated algorithm for risk stratification has several advantages including reducing clinician time for manual review to allow more time for direct care, improving efficiency and increasing transparency in individual patient communication. It has the potential to be adapted for future public health initiatives that require prompt automated review of hospital outpatient letters.


Subjects
Algorithms, COVID-19, Data Mining, Humans, COVID-19/prevention & control, United Kingdom, Data Mining/methods, SARS-CoV-2, Rheumatic Diseases/drug therapy, Middle Aged, Male, Rheumatology/methods, Female, Aged, Risk Assessment/methods, Pandemics, Adult
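
The likelihood ratios quoted in the abstract above follow directly from sensitivity and specificity. The sketch below is only an illustration of those standard formulas, fed with the rounded percentages from the abstract; the function name is hypothetical and nothing here reproduces the study's own code.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-) for sensitivity and specificity given as fractions."""
    # LR+ : how much more likely a positive result is in true cases than in non-cases.
    lr_pos = sensitivity / (1.0 - specificity)
    # LR- : how much more likely a negative result is in true cases than in non-cases.
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded values taken from the abstract: sensitivity 80%, specificity 92%.
lr_pos, lr_neg = likelihood_ratios(0.80, 0.92)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

With these rounded inputs the positive ratio comes out at about 10 and the negative ratio at about 0.22, close to the reported 0.21; the small gap is presumably because the authors worked from unrounded counts.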
3.
Artif Intell Med ; 151: 102845, 2024 May.
Article in English | MEDLINE | ID: mdl-38555848

ABSTRACT

BACKGROUND: Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e. the process of removing PHI, is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process. OBJECTIVES: Our study aims to provide systematic evidence on how the de-identification of clinical free text written in English has evolved in the last thirteen years, and to report on the performances and limitations of the current state-of-the-art systems for the English language. In addition, we aim to identify challenges and potential research opportunities in this field. METHODS: A systematic search in PubMed, Web of Science, and the DBLP was conducted for studies published between January 2010 and February 2023. Titles and abstracts were examined to identify the relevant studies. Selected studies were then analysed in-depth, and information was collected on de-identification methodologies, data sources, and measured performance. RESULTS: A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. The 2014 i2b2/UTHealth corpus is the most frequently used (36 studies), followed by the 2006 i2b2 (18 studies) and 2016 CEGS N-GRID (10 studies) corpora. CONCLUSION: Earlier de-identification approaches aimed at English were mainly rule and machine learning hybrids with extensive feature engineering and post-processing, while more recent performance improvements are due to feature-inferring recurrent neural networks.
Current leading performance is achieved using attention-based neural models. Recent studies report state-of-the-art F1-scores (over 98 %) when evaluated in the manner usually adopted by the clinical natural language processing community. However, their performance needs to be more thoroughly assessed with different measures to judge their reliability to safely de-identify data in a real-world setting. Without additional manually labeled training data, state-of-the-art systems fail to generalise well across a wide range of clinical sub-domains.


Subjects
Electronic Health Records, Natural Language Processing, Humans, Machine Learning
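
As a rough illustration of the rule-only end of the spectrum described in the abstract above, a de-identifier can be as simple as a set of regular expressions that replace matches with a category tag. The patterns, labels and sample note below are all hypothetical and far narrower than any system covered by the review.

```python
import re

# Hypothetical patterns for three PHI categories; real systems cover many more
# (names, addresses, record numbers) and usually add machine-learned components.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def deidentify(text: str) -> str:
    """Replace every span matched by a PHI pattern with its category tag."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Invented example note, not taken from any corpus.
note = "Seen on 03/11/2021; call 555-123-4567 or mail jane.doe@example.org."
print(deidentify(note))
```

A sketch like this also hints at why the review asks for more thorough evaluation: anything the patterns do not anticipate (a date written in words, a name) passes through untouched.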
4.
Front Digit Health ; 6: 1211564, 2024.
Article in English | MEDLINE | ID: mdl-38468693

ABSTRACT

Clinical text and documents contain very rich information and knowledge in healthcare, and their processing using state-of-the-art language technology becomes very important for building intelligent systems for supporting healthcare and social good. This processing includes creating language understanding models and translating resources into other natural languages to share domain-specific cross-lingual knowledge. In this work, we conduct investigations on clinical text machine translation by examining multilingual neural network models using deep learning such as Transformer-based structures. Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs). The experimental results on three sub-tasks including (1) clinical case (CC), (2) clinical terminology (CT), and (3) ontological concept (OC) show that our models achieved top-level performances in the ClinSpEn-2022 shared task on English-Spanish clinical domain data. Furthermore, our expert-based human evaluations demonstrate that the small-sized pre-trained language model (PLM) outperformed the other two extra-large language models by a large margin in the clinical domain fine-tuning, a finding that had not previously been reported in the field. Finally, the transfer learning method works well in our experimental setting using the WMT21fb model to accommodate a new language space, Spanish, that was not seen at the pre-training stage within WMT21fb itself, which deserves more exploration for clinical knowledge transformation, e.g. by extending it to more languages. These research findings can shed some light on domain-specific machine translation development, especially in clinical and healthcare fields. Further research projects can be carried out based on our work to improve healthcare text analytics and knowledge transformation.
Our data is openly available for research purposes at: https://github.com/HECTA-UoM/ClinicalNMT.

5.
BMC Med Res Methodol ; 24(1): 68, 2024 Mar 17.
Article in English | MEDLINE | ID: mdl-38494501

ABSTRACT

BACKGROUND: The challenging nature of studies with incarcerated populations and other offender groups can impede the conduct of research, particularly that involving complex study designs such as randomised controlled trials and clinical interventions. Providing an overview of study designs employed in this area can offer insights into this issue and how research quality may impact on health and justice outcomes. METHODS: We used a rule-based approach to extract study designs from a sample of 34,481 PubMed abstracts related to epidemiological criminology published between 1963 and 2023. The results were compared against an accepted hierarchy of scientific evidence. RESULTS: We evaluated our method in a random sample of 100 PubMed abstracts. An F1-Score of 92.2% was returned. Of 34,481 study abstracts, almost 40.0% (13,671) had an extracted study design. The most common study design was observational (37.3%; 5101) while experimental research in the form of trials (randomised, non-randomised) was present in 16.9% (2319). Mapped against the current hierarchy of scientific evidence, 13.7% (1874) of extracted study designs could not be categorised. Among the remaining studies, most were observational (17.2%; 2343) followed by systematic reviews (10.5%; 1432) with randomised controlled trials accounting for 8.7% (1196) of studies and meta-analyses for 1.4% (190) of studies. CONCLUSIONS: It is possible to extract epidemiological study designs from a large-scale PubMed sample computationally. However, the number of trials, systematic reviews, and meta-analyses is relatively small - just 1 in 5 articles. Despite an increase over time in the total number of articles, study design details in the abstracts were missing. Epidemiological criminology still lacks the experimental evidence needed to address the health needs of the marginalized and isolated population that is prisoners and offenders.


Subjects
Criminals, Prisoners, Humans, Data Mining, Research Design
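
To make the idea of rule-based study design extraction more concrete, the sketch below applies an ordered list of regular expressions to an abstract and scores the result with F1. The rules, labels and ordering are assumptions made for illustration; the abstract above does not describe the authors' actual rule set.

```python
import re
from typing import Optional

# Illustrative rules only, checked in order so that designs higher in the
# evidence hierarchy win when an abstract mentions several.
DESIGN_RULES = [
    ("meta-analysis", re.compile(r"\bmeta-?analys[ie]s\b", re.I)),
    ("systematic review", re.compile(r"\bsystematic review\b", re.I)),
    ("randomised controlled trial",
     re.compile(r"\brandomi[sz]ed (?:controlled? )?trial\b", re.I)),
    ("cohort study", re.compile(r"\bcohort\b", re.I)),
    ("cross-sectional study", re.compile(r"\bcross-sectional\b", re.I)),
]


def extract_design(abstract: str) -> Optional[str]:
    """Return the first matching study design label, or None if no rule fires."""
    for label, pattern in DESIGN_RULES:
        if pattern.search(abstract):
            return label
    return None


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


print(extract_design("We conducted a randomized controlled trial in two prisons."))
print(extract_design("An editorial on prison health."))
```

Returning None when no rule fires mirrors the situation reported in the abstract, where a large share of abstracts had no extractable design.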
6.
Front Vet Sci ; 11: 1352239, 2024.
Article in English | MEDLINE | ID: mdl-38322169

ABSTRACT

The development of natural language processing techniques for deriving useful information from unstructured clinical narratives is a fast-paced and rapidly evolving area of machine learning research. Large volumes of veterinary clinical narratives now exist curated by projects such as the Small Animal Veterinary Surveillance Network (SAVSNET) and VetCompass, and the application of such techniques to these datasets is already improving (and will continue to improve) our understanding of disease and disease patterns within veterinary medicine. In part one of this two-part article series, we discuss the importance of understanding the lexical structure of clinical records and discuss the use of basic tools for filtering records based on key words and more complex rule-based pattern matching approaches. We discuss the strengths and weaknesses of these approaches, highlighting the on-going potential value in using these "traditional" approaches but ultimately recognizing that these approaches constrain how effectively information retrieval can be automated. This sets the scene for the introduction of machine-learning methodologies and the plethora of opportunities for automation of information extraction these present, which is discussed in part two of the series.
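
The difference between plain keyword filtering and rule-based pattern matching, as discussed in the abstract above, can be shown with a small sketch. The keyword list, the negation cues and the 30-character look-back are all assumptions chosen for illustration and are not taken from the article.

```python
import re

# Keyword filter: any form of "vomit" or the term "emesis" (hypothetical list).
KEYWORD = re.compile(r"\b(vomit\w*|emesis)\b", re.I)

# Rule layered on top: a negation cue shortly before the keyword, with no
# sentence boundary in between, cancels the mention.
NEGATION = re.compile(r"\b(no|not|without|denies)\b[^.;]{0,30}$", re.I)


def mentions_vomiting(narrative: str) -> bool:
    """True if the narrative contains at least one non-negated keyword mention."""
    for match in KEYWORD.finditer(narrative):
        preceding = narrative[:match.start()]
        if not NEGATION.search(preceding):
            return True
    return False


print(mentions_vomiting("Owner reports no vomiting or diarrhoea."))
print(mentions_vomiting("No diarrhoea. Vomited twice overnight."))
```

A bare keyword filter would flag both narratives; the negation rule drops the first. Hand-written rules of this kind need constant extension, which is the constraint the abstract points to.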

7.
JMIR Form Res ; 7: e49721, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37738080

ABSTRACT

BACKGROUND: The emerging field of epidemiological criminology studies the intersection between public health and justice systems. To increase the value of and reduce waste in research activities in this area, it is important to perform transparent research priority setting considering the needs of research beneficiaries and end users along with a systematic assessment of the existing research activities to address gaps and harness opportunities. OBJECTIVE: In this study, we aimed to examine published research outputs in epidemiological criminology to assess gaps between published outputs and current research priorities identified by prison stakeholders. METHODS: A rule-based method was applied to 23,904 PubMed epidemiological criminology abstracts to extract the study determinants and outcomes (ie, "themes"). These were mapped against the research priorities identified by Australian prison stakeholders to assess the differences from research outputs. The income level of the affiliation country of the first authors was also identified to compare the ranking of research priorities in countries categorized by income levels. RESULTS: On an evaluation set of 100 abstracts, the identification of themes returned an F1-score of 90%, indicating reliable performance. More than 53.3% (11,927/22,361) of the articles had at least 1 extracted theme; the most common was substance use (1533/11,814, 12.97%), followed by HIV (1493/11,814, 12.64%). The infectious disease category (2949/11,814, 24.96%) was the most common research priority category, followed by mental health (2840/11,814, 24.04%) and alcohol and other drug use (2433/11,814, 20.59%). A comparison between the extracted themes and the stakeholder priorities showed an alignment for mental health, infectious diseases, and alcohol and other drug use. Although behavior- and juvenile-related themes were common, they did not feature as prison priorities. 
Most studies were conducted in high-income countries (10,083/11,814, 85.35%), while countries with the lowest income status focused half of their research on infectious diseases (47/91, 52%). CONCLUSIONS: The identification of research themes from PubMed epidemiological criminology research abstracts is possible through the application of a rule-based text mining method. The frequency of the investigated themes may reflect historical developments concerning disease prevalence, treatment advances, and the social understanding of illness and incarcerated populations. The differences between income status groups are likely to be explained by local health priorities and immediate health risks. Notable gaps between stakeholder research priorities and research outputs concerned themes that were more focused on social factors and systems and may reflect publication bias or self-publication selection, highlighting the need for further research on prison health services and the social determinants of health. Different jurisdictions, countries, and regions should undertake similar systematic and transparent research priority-setting processes.

8.
JMIR Med Inform ; 11: e45534, 2023 May 03.
Article in English | MEDLINE | ID: mdl-37133927

ABSTRACT

BACKGROUND: Information stored within electronic health records is often recorded as unstructured text. Special computerized natural language processing (NLP) tools are needed to process this text; however, complex governance arrangements make such data in the National Health Service hard to access, and therefore, it is difficult to use for research in improving NLP methods. The creation of a donated databank of clinical free text could provide an important opportunity for researchers to develop NLP methods and tools and may circumvent delays in accessing the data needed to train the models. However, to date, there has been little or no engagement with stakeholders on the acceptability and design considerations of establishing a free-text databank for this purpose. OBJECTIVE: This study aimed to ascertain stakeholder views around the creation of a consented, donated databank of clinical free text to help create, train, and evaluate NLP for clinical research and to inform the potential next steps for adopting a partner-led approach to establish a national, funded databank of free text for use by the research community. METHODS: Web-based in-depth focus group interviews were conducted with 4 stakeholder groups (patients and members of the public, clinicians, information governance leads and research ethics members, and NLP researchers). RESULTS: All stakeholder groups were strongly in favor of the databank and saw great value in creating an environment where NLP tools can be tested and trained to improve their accuracy. Participants highlighted a range of complex issues for consideration as the databank is developed, including communicating the intended purpose, the approach to access and safeguarding the data, who should have access, and how to fund the databank. 
Participants recommended that a small-scale, gradual approach be adopted to start to gather donations and encouraged further engagement with stakeholders to develop a road map and set of standards for the databank. CONCLUSIONS: These findings provide a clear mandate to begin developing the databank and a framework for stakeholder expectations, which we would aim to meet with the databank delivery.

9.
Pharmacoepidemiol Drug Saf ; 32(6): 651-660, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36718594

ABSTRACT

PURPOSE: Routinely collected prescription data provides drug exposure information for pharmacoepidemiology, informing start/stop dates and dosage. Prescribing information includes structured data and unstructured free-text instructions, which can include inherent variability, such as "one to two tablets up to four times a day". Preparing drug exposure data from raw prescriptions to a research ready dataset is rarely fully reported, yet assumptions have considerable implications for pharmacoepidemiology. This may have bigger consequences for "pro re nata" (PRN) drugs. Our aim was, using a worked example of opioids and fracture risk, to examine the impact of incorporating narrative prescribing instructions and subsequent drug preparation assumptions on adverse event rates. METHODS: R-packages for extracting free-text medication prescription instructions in a structured form (doseminer) and an algorithm for transparently processing drug exposure information (drugprepr) were developed. Clinical Practice Research Datalink GOLD was used to define a cohort of adult new opioid users without prior cancer. A retrospective cohort study was performed using data between January 1, 2017 and July 31, 2018. We tested the impact of varying drug preparation assumptions by estimating the risk of opioids on fracture risk using Cox proportional hazards models. RESULTS: During the study window, 60 394 patients were identified with 190 754 opioid prescriptions. Free-text prescribing instruction variability, where there was flexibility in the number of tablets to be administered, was present in 42% prescriptions. Variations in the decisions made during preparing raw data for analysis led to marked differences impacting the event number (n = 303-415) and person years of drug exposure (5619-9832). The distribution of hazard ratios as a function of the decisions ranged from 2.71 (95% CI: 2.31, 3.18) to 3.24 (2.76, 3.82). 
CONCLUSIONS: Assumptions made during the drug preparation process, especially for those with variability in prescription instructions, can impact results of subsequent risk estimates. The developed R packages can improve transparency related to drug preparation assumptions, in line with best practice advocated by international pharmacoepidemiology guidelines.


Subjects
Opioid Analgesics, Pharmacoepidemiology, Adult, Humans, Opioid Analgesics/therapeutic use, Retrospective Studies, Drug Prescriptions, Algorithms
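
The example instruction quoted in the abstract above, "one to two tablets up to four times a day", shows why preparation assumptions matter: it only defines a range of possible daily doses. The Python sketch below is an illustration of that point and is not the doseminer or drugprepr R code; the pattern handles just this one phrasing, and treating "up to" as a minimum frequency of once a day is an assumption.

```python
import re

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}
NUM = r"(\d+|" + "|".join(NUMBER_WORDS) + r")"

# Matches e.g. "one to two tablets up to four times a day" or "2 tablets 3 times a day".
INSTRUCTION = re.compile(
    rf"\b{NUM}(?:\s+to\s+{NUM})?\s+tablets?\s+(up\s+to\s+)?{NUM}\s+times\s+a\s+day",
    re.IGNORECASE,
)


def to_int(token: str) -> int:
    """Convert a digit string or a small number word to an integer."""
    token = token.lower()
    return NUMBER_WORDS[token] if token in NUMBER_WORDS else int(token)


def parse_instruction(text: str):
    """Return dose and frequency ranges, or None if the pattern does not match."""
    match = INSTRUCTION.search(text)
    if match is None:
        return None
    dose_min = to_int(match.group(1))
    dose_max = to_int(match.group(2)) if match.group(2) else dose_min
    freq_max = to_int(match.group(4))
    # Assumption: "up to N times" is read as between 1 and N times a day.
    freq_min = 1 if match.group(3) else freq_max
    return {
        "dose_min": dose_min, "dose_max": dose_max,
        "freq_min": freq_min, "freq_max": freq_max,
        "daily_min": dose_min * freq_min, "daily_max": dose_max * freq_max,
    }


print(parse_instruction("Take one to two tablets up to four times a day"))
```

For the quoted instruction the daily dose could be anywhere from 1 to 8 tablets, so choosing the minimum, maximum or a midpoint changes the estimated exposure considerably, which is the kind of decision the abstract argues should be reported.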
10.
Interact J Med Res ; 11(2): e42891, 2022 Dec 05.
Article in English | MEDLINE | ID: mdl-36469411

ABSTRACT

BACKGROUND: Epidemiological criminology refers to health issues affecting incarcerated and nonincarcerated offender populations, a group recognized as being challenging to conduct research with. Notwithstanding this, an urgent need exists for new knowledge and interventions to improve health, justice, and social outcomes for this marginalized population. OBJECTIVE: To better understand research outputs in the field of epidemiological criminology, we examined the lead author's affiliation by analyzing peer-reviewed published outputs to determine countries and organizations (eg, universities, governmental and nongovernmental organizations) responsible for peer-reviewed publications. METHODS: We used a semiautomated approach to examine the first-author affiliations of 23,904 PubMed epidemiological studies related to incarcerated and offender populations published in English between 1946 and 2021. We also mapped research outputs to the World Justice Project Rule of Law Index to better understand whether there was a relationship between research outputs and the overall standard of a country's justice system. RESULTS: Nordic countries (Sweden, Norway, Finland, and Denmark) had the highest research outputs proportional to their incarcerated population, followed by Australia. University-affiliated first authors comprised 73.3% of published articles, with the Karolinska Institute (Sweden) being the most published, followed by the University of New South Wales (Australia). Government-affiliated first authors were on 8.9% of published outputs, and prison-affiliated groups were on 1%. Countries with the lowest research outputs also had the lowest scores on the Rule of Law Index. CONCLUSIONS: This study provides important information on who is publishing research in the epidemiological criminology field.
This has implications for promoting research diversity, independence, funding equity, and partnerships between universities and government departments that control access to incarcerated and offending populations.

11.
JMIR Form Res ; 6(10): e39373, 2022 Oct 20.
Article in English | MEDLINE | ID: mdl-36264613

ABSTRACT

BACKGROUND: To better understand domestic violence, data sources from multiple sectors such as police, justice, health, and welfare are needed. Linking police data to data collections from other agencies could provide unique insights and promote an all-of-government response to domestic violence. The New South Wales Police Force attends domestic violence events and records information in the form of both structured data and a free-text narrative, with the latter shown to be a rich source of information on the mental health status of persons of interest (POIs) and victims, abuse types, and sustained injuries. OBJECTIVE: This study aims to examine the concordance (ie, matching) between mental illness mentions extracted from the police's event narratives and mental health diagnoses from hospital and emergency department records. METHODS: We applied a rule-based text mining method on 416,441 domestic violence police event narratives between December 2005 and January 2016 to identify mental illness mentions for POIs and victims. Using different window periods (1, 3, 6, and 12 months) before and after a domestic violence event, we linked the extracted mental illness mentions of victims and POIs to clinical records from the Emergency Department Data Collection and the Admitted Patient Data Collection in New South Wales, Australia using a unique identifier for each individual in the same cohort. RESULTS: Using a 2-year window period (ie, 12 months before and after the domestic violence event), less than 1% (3020/416,441, 0.73%) of events had a mental illness mention and also a corresponding hospital record. About 16% of domestic violence events for both POIs (382/2395, 15.95%) and victims (101/631, 16.01%) had an agreement between hospital records and police narrative mentions of mental illness. A total of 51,025/416,441 (12.25%) events for POIs and 14,802/416,441 (3.55%) events for victims had mental illness mentions in their narratives but no hospital record. 
Only 841 events for POIs and 919 events for victims had a documented hospital record within 48 hours of the domestic violence event. CONCLUSIONS: Our findings suggest that current surveillance systems used to report on domestic violence may be enhanced by accessing rich information (ie, mental illness) contained in police text narratives, made available for both POIs and victims through the application of text mining. Additional insights can be gained by linkage to other health and welfare data collections.
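
The window-based linkage described in the abstract above can be pictured as a simple date comparison: a police event counts as matched if the same individual has a hospital record within a set number of months before or after it. The sketch below is a hypothetical illustration; the function name, the sample dates and the 30-day approximation of a month are assumptions, and the study's actual linkage procedure is not described at this level of detail.

```python
from datetime import date, timedelta


def has_record_in_window(event_date: date, record_dates: list, months: int) -> bool:
    """True if any record falls within `months` before or after the event."""
    # Assumption: a month is approximated as 30 days.
    window = timedelta(days=30 * months)
    return any(abs(record - event_date) <= window for record in record_dates)


# Invented dates for one individual.
event = date(2010, 6, 15)
records = [date(2010, 8, 1), date(2011, 9, 1)]

for months in (1, 3, 6, 12):
    print(months, has_record_in_window(event, records, months))
```

In this invented case the nearest record is 47 days from the event, so the 1-month window misses it while the 3-, 6- and 12-month windows capture it, which illustrates how the choice of window changes the measured concordance.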

12.
Healthc Anal (N Y) ; 2: None, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36605918

ABSTRACT

Electronic nicotine delivery systems (ENDS) (also known as 'e-cigarettes') can support smoking cessation, although the long-term health impacts are not yet known. In 2019, a cluster of lung injury cases in the USA emerged that were ostensibly associated with ENDS use. Subsequent investigations revealed a link with vitamin E acetate, an additive used in some ENDS liquid products containing tetrahydrocannabinol (THC). This became known as the EVALI (E-cigarette or Vaping product use Associated Lung Injury) outbreak. While few cases were reported in the UK, the EVALI outbreak intensified attention on ENDS in general worldwide. We aimed to describe and explore public commentary and discussion on Twitter immediately before, during and following the peak of the EVALI outbreak using text mining techniques. Specifically, topic modelling, operationalised using Latent Dirichlet Allocation (LDA) models, was used to discern discussion topics in 189,658 tweets about ENDS (collected April-December 2019). Individual tweets and Twitter users were assigned to their dominant topics and countries respectively to enable international comparisons. A 10-topic LDA model fit the data best. We organised the ten topics into three broad themes for the purposes of reporting: informal vaping discussion; vaping policy discussion and EVALI news; and vaping commerce. Following EVALI, there were signs that informal vaping discussion topics decreased while discussion topics about vaping policy and the relative health risks and benefits of ENDS increased, not limited to THC products. Though subsequently attributed to THC products, the EVALI outbreak disrupted online public discourses about ENDS generally, amplifying health and policy commentary. There was a relatively stronger presence of commercially oriented tweets among UK Twitter users compared to USA users.

13.
PLoS One ; 16(12): e0260402, 2021.
Article in English | MEDLINE | ID: mdl-34882714

ABSTRACT

A key goal of disease surveillance is to identify outbreaks of known or novel diseases in a timely manner. Such an outbreak occurred in the UK associated with acute vomiting in dogs between December 2019 and March 2020. We tracked this outbreak using the clinical free text component of anonymised electronic health records (EHRs) collected from a sentinel network of participating veterinary practices. We sourced the free text (narrative) component of each EHR supplemented with one of 10 practitioner-derived main presenting complaints (MPCs), with the 'gastroenteric' MPC identifying cases involved in the disease outbreak. Such clinician-derived annotation systems can suffer from poor compliance requiring retrospective, often manual, coding, thereby limiting real-time usability, especially where an outbreak of a novel disease might not present clinically as a currently recognised syndrome or MPC. Here, we investigate the use of an unsupervised method of EHR annotation using latent Dirichlet allocation topic-modelling to identify topics inherent within the clinical narrative component of EHRs. The model comprised 30 topics which were used to annotate EHRs spanning the natural disease outbreak and investigate whether any given topic might mirror the outbreak time-course. Narratives were annotated using the Gensim Library LdaModel module for the topic best representing the text within them. Counts for narratives labelled with one of the topics significantly matched the disease outbreak based on the practitioner-derived 'gastroenteric' MPC (Spearman correlation 0.978); no other topics showed a similar time course. Using artificially injected outbreaks, it was possible to see other topics that would match other MPCs including respiratory disease. The underlying topics were readily evaluated using simple word-cloud representations and using a freely available package (LDAVis) providing rapid insight into the clinical basis of each topic. 
This work clearly shows that unsupervised record annotation using topic modelling linked to simple text visualisations can provide an easily interrogable method to identify and characterise outbreaks and other anomalies of known and previously un-characterised diseases based on changes in clinical narratives.


Subjects
Disease Outbreaks/veterinary, Dog Diseases/epidemiology, Gastroenteritis/veterinary, Animals, Data Curation, Dogs, Electronic Health Records, Gastroenteritis/epidemiology, Population Surveillance, United Kingdom/epidemiology, Unsupervised Machine Learning
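
The abstract above compares the time course of one topic's narrative counts with the practitioner-derived 'gastroenteric' counts using a Spearman correlation. The sketch below shows how such a rank correlation can be computed in plain Python; the weekly counts are invented for illustration and the helper names are hypothetical.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented weekly counts: narratives labelled with one topic vs. gastroenteric MPC.
topic_counts = [3, 5, 9, 20, 14, 6]
mpc_counts = [2, 6, 8, 25, 15, 5]
print(round(spearman(topic_counts, mpc_counts), 3))
```

Because the measure depends only on ranks, a topic whose counts rise and fall in step with the MPC counts scores close to 1 even if the absolute numbers differ.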
16.
J Biomed Inform ; 123: 103915, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34600144

ABSTRACT

Temporal relation extraction between health-related events is a widely studied task in clinical Natural Language Processing (NLP). The current state-of-the-art methods mostly rely on engineered features (i.e., rule-based modelling) and sequence modelling, which often encodes a source sentence into a single fixed-length context. An obvious disadvantage of this fixed-length context design is its incapability to model longer sentences, as important temporal information in the clinical text may appear at different positions. To address this issue, we propose an Attention-based Bidirectional Long Short-Term Memory (Att-BiLSTM) model to enable learning the important semantic information in long source text segments and to better determine which parts of the text are most important. We experimented with two embeddings and compared the performances to traditional state-of-the-art methods that require elaborate linguistic pre-processing and hand-engineered features. The experimental results on the i2b2 2012 temporal relation test corpus show that the proposed method achieves a significant improvement with an F-score of 0.811, which is at least 10% better than state-of-the-art in the field. We show that the model can be remarkably effective at classifying temporal relations when provided with word embeddings trained on corpora in a general domain. Finally, we perform an error analysis to gain insight into the common errors made by the model.


Subjects
Memory, Short-Term , Patient Discharge , Humans , Language , Natural Language Processing , Semantics
17.
BMC Bioinformatics ; 22(Suppl 10): 387, 2021 Jul 29.
Article in English | MEDLINE | ID: mdl-34325669

ABSTRACT

BACKGROUND: Stroke has an acute onset and a high mortality rate, making it one of the most fatal diseases worldwide. Its underlying biology and treatments have been widely studied in both "Western" biomedicine and Traditional Chinese Medicine (TCM). However, these two approaches are often studied and reported in isolation, both in the literature and associated databases. RESULTS: To aid research in finding effective prevention methods and treatments, we integrated knowledge from the literature and a number of databases (e.g. CID, TCMID, ETCM). We employed a suite of biomedical text mining (i.e. named-entity) approaches to identify mentions of genes, diseases, drugs, chemicals, symptoms, Chinese herbs and patent medicines, etc. in a large set of stroke papers from both biomedical and TCM domains. Then, using a combination of a rule-based approach with a pre-trained BioBERT model, we extracted and classified links and relationships among stroke-related entities as expressed in the literature. We constructed StrokeKG, a knowledge graph that includes almost 46 k nodes of nine types and 157 k links of 30 types, connecting diseases, genes, symptoms, drugs, pathways, herbs, chemicals, ingredients and patent medicines. CONCLUSIONS: StrokeKG can provide practical and reliable stroke-related knowledge to help with stroke-related research, such as exploring new directions for stroke research and ideas for drug repurposing and discovery. We make StrokeKG freely available at http://114.115.208.144:7474/browser/ (Please click "Connect" directly) and the source structured data for stroke at https://github.com/yangxi1016/Stroke.


Subjects
Drugs, Chinese Herbal , Stroke , Data Mining , Drugs, Chinese Herbal/therapeutic use , Humans , Medicine, Chinese Traditional , Pattern Recognition, Automated , Publications , Stroke/drug therapy , Stroke/genetics
18.
JMIR Med Inform ; 9(5): e24678, 2021 May 05.
Article in English | MEDLINE | ID: mdl-33949962

ABSTRACT

BACKGROUND: Drug prescriptions are often recorded in free-text clinical narratives; making this information available in a structured form is important to support many health-related tasks. Although several natural language processing (NLP) methods have been proposed to extract such information, many challenges remain. OBJECTIVE: This study evaluates the feasibility of using NLP and deep learning approaches for extracting and linking drug names and associated attributes identified in clinical free-text notes and presents an extensive error analysis of different methods. This study was initiated with participation in the 2018 National NLP Clinical Challenges (n2c2) shared task on adverse drug events and medication extraction. METHODS: The proposed system (DrugEx) consists of a named entity recognizer (NER) to identify drugs and associated attributes and a relation extraction (RE) method to identify the relations between them. For NER, we explored deep learning-based approaches (ie, bidirectional long short-term memory with conditional random fields [BiLSTM-CRFs]) with various embeddings (ie, word embedding, character embedding [CE], and semantic-feature embedding) to investigate how different embeddings influence the performance. A rule-based method was implemented for RE and compared with a context-aware long short-term memory (LSTM) model. The methods were trained and evaluated using the 2018 n2c2 shared task data. RESULTS: The experiments showed that the best model (BiLSTM-CRFs with pretrained word embeddings [PWE] and CE) achieved lenient micro F-scores of 0.921 for NER, 0.927 for RE, and 0.855 for the end-to-end system. NER, which relies on the pretrained word and semantic embeddings, performed better on most individual entity types, but NER with PWE and CE had the highest classification efficiency among the proposed approaches. Extracting relations using the rule-based method achieved higher accuracy than the context-aware LSTM for most relations. Interestingly, the LSTM model performed notably better in the reason-drug relations, the most challenging relation type. CONCLUSIONS: The proposed end-to-end system achieved encouraging results and demonstrated the feasibility of using deep learning methods to extract medication information from free-text data.

19.
Drug Saf ; 44(5): 553-564, 2021 05.
Article in English | MEDLINE | ID: mdl-33582973

ABSTRACT

INTRODUCTION: Information on suspected adverse drug reactions (ADRs) voluntarily submitted by patients can be a valuable source of information for improving drug safety; however, public awareness of reporting mechanisms remains low. Whilst methods to automatically detect ADR mentions from social media posts using text mining techniques have been proposed to improve reporting rates, it is unclear how acceptable these would be to social media users. OBJECTIVE: The objective of this study was to explore public opinion about using automated methods to detect and report mentions of ADRs on social media to enhance pharmacovigilance efforts. METHODS: Users of the online health discussion forum HealthUnlocked participated in an online survey (N = 1359) about experiences with ADRs, knowledge of pharmacovigilance methods, and opinions about using automated data mining methods to detect and report ADRs. To further explore responses, five qualitative focus groups were conducted with 20 social media users with long-term health conditions. RESULTS: Participant responses indicated a low awareness of pharmacovigilance methods and ADR reporting. They showed a strong willingness to share health-related social media data about ADRs with researchers and regulators, but were cautious about automated text mining methods of detecting and reporting ADRs. CONCLUSIONS: Social media users value public-facing pharmacovigilance schemes, even if they do not understand the current framework of pharmacovigilance within the UK. Ongoing engagement with users is essential to understand views, share knowledge and respect users' privacy expectations to optimise future ADR reporting from online health communities.


Subjects
Drug-Related Side Effects and Adverse Reactions , Social Media , Adverse Drug Reaction Reporting Systems , Drug-Related Side Effects and Adverse Reactions/epidemiology , Humans , Pharmacovigilance , Surveys and Questionnaires
20.
J Med Internet Res ; 23(2): e16348, 2021 02 16.
Article in English | MEDLINE | ID: mdl-33591280

ABSTRACT

BACKGROUND: Social media provides the potential to engage a wide audience about scientific research, including the public. However, little empirical research exists to guide health scientists regarding what works and how to optimize impact. We examined the social media campaign #datasaveslives established in 2014 to highlight positive examples of the use and reuse of health data in research. OBJECTIVE: This study aims to examine how the #datasaveslives hashtag was used on social media, how often, and by whom; thus, we aim to provide insights into the impact of a major social media campaign in the UK health informatics research community and further afield. METHODS: We analyzed all publicly available posts (tweets) that included the hashtag #datasaveslives (N=13,895) on the microblogging platform Twitter between September 1, 2016, and August 31, 2017. Using a combination of qualitative and quantitative analyses, we determined the frequency and purpose of tweets. Social network analysis was used to analyze and visualize tweet sharing (retweet) networks among hashtag users. RESULTS: Overall, we found 4175 original posts and 9720 retweets featuring #datasaveslives by 3649 unique Twitter users. In total, 66.01% (2756/4175) of the original posts were retweeted at least once. Higher frequencies of tweets were observed during the weeks of prominent policy publications, popular conferences, and public engagement events. Cluster analysis based on retweet relationships revealed an interconnected series of groups of #datasaveslives users in academia, health services and policy, and charities and patient networks. Thematic analysis of tweets showed that #datasaveslives was used for a broader range of purposes than indexing information, including event reporting, encouraging participation and action, and showing personal support for data sharing. CONCLUSIONS: This study shows that a hashtag-based social media campaign was effective in encouraging a wide audience of stakeholders to disseminate positive examples of health research. Furthermore, the findings suggest that the campaign supported community building and bridging practices within and between the interdisciplinary sectors related to the field of health data science and encouraged individuals to demonstrate personal support for sharing health data.


Subjects
Biomedical Research/methods , Information Dissemination/methods , Social Media/standards , Social Network Analysis , Humans