Results 1 - 20 of 1,671
1.
Data Brief ; 55: 110545, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38952954

ABSTRACT

This dataset comprises a collection of soybean market news obtained through web scraping from a Brazilian website. The news articles gathered span January 2015 to June 2023 and have undergone a labeling process to categorize them as relevant or non-relevant. The labeling process was conducted under the guidance of an agricultural economics expert, who collaborated with a group of nine individuals, and ten parameters were provided to assist participants in the labeling process. The dataset comprises approximately 11,000 news articles and serves as a valuable resource for researchers interested in exploring trends in the soybean market. Importantly, this dataset can be utilized for tasks such as classification and natural language processing. It provides insights into labeled soybean market news and supports open science initiatives, facilitating further analysis within the research community.

2.
Heliyon ; 10(12): e32093, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38948047

ABSTRACT

Chinese agricultural named entity recognition (NER) has been studied with supervised learning for many years. However, given the scarcity of public datasets in the agricultural domain, exploring this task in the few-shot scenario is more practical for real-world demands. In this paper, we propose a novel model named GlyReShot, which integrates knowledge of Chinese character glyphs into few-shot NER models. Although the use of glyphs has proven successful in supervised models, two challenges persist in the few-shot setting: how to obtain glyph representations and when to integrate them into the few-shot model. GlyReShot handles both challenges by introducing a lightweight module for obtaining glyph representations and a training-free label refinement strategy. Specifically, the glyph representations are generated from descriptive sentences produced by filling a predefined template. As most steps come before training, this module aligns well with the few-shot setting. Furthermore, by computing confidence values for draft predictions, the refinement strategy uses the glyph information only when confidence is relatively low, mitigating the influence of noise. Finally, we annotate a new agricultural NER dataset, and the experimental results demonstrate the effectiveness of GlyReShot for few-shot Chinese agricultural NER.
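The confidence-gated refinement described above can be sketched in a few lines. This is a toy illustration of the idea, not the authors' implementation; the labels, confidence scores, and threshold below are hypothetical.

```python
# Toy sketch (not the paper's code) of a training-free, confidence-gated
# label refinement: glyph-informed predictions are consulted only where
# the draft model's confidence is low, limiting the influence of glyph noise.

def refine_labels(draft, glyph, threshold=0.5):
    """draft: list of (label, confidence) per token; glyph: list of labels per token."""
    refined = []
    for (label, conf), glyph_label in zip(draft, glyph):
        # Keep the confident draft label; fall back to the glyph label otherwise.
        refined.append(label if conf >= threshold else glyph_label)
    return refined

draft = [("B-CROP", 0.92), ("O", 0.31), ("B-PEST", 0.88)]
glyph = ["B-CROP", "I-CROP", "B-PEST"]
refined = refine_labels(draft, glyph)
# Only the low-confidence middle token falls back to the glyph label.
```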

3.
PeerJ ; 12: e17470, 2024.
Article in English | MEDLINE | ID: mdl-38948230

ABSTRACT

TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content, using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data are obtained from the Target Central Resource Database (TCRD). Two key metrics, novelty and importance, are computed from these data and, when plotted as log(importance) vs. log(novelty), help the user visually explore the novelty of drug targets and their importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, a modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts but also full-text articles when available, yielding approximately 9-fold more target/disease associations than previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest, while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources were balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset. Together, these advances aim to extend the useful lifetime of TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.
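The log-log plotting coordinates can be illustrated with stand-in metric definitions. The abstract does not give TCRD's exact formulas, so `novelty` and `importance` below are plausible placeholders only: novelty falls as a target is mentioned more often, and importance rises with joint target-disease mentions.

```python
import math

# Illustrative stand-ins for TIN-X's two metrics (hypothetical formulas,
# not the published TCRD definitions).

def novelty(target_mentions):
    # Rarely-mentioned (understudied) targets score high.
    return 1.0 / target_mentions

def importance(joint_mentions, disease_mentions):
    # Share of the disease literature that also names the target.
    return joint_mentions / disease_mentions

def plot_coords(target_mentions, joint_mentions, disease_mentions):
    # Coordinates as plotted by TIN-X: log(novelty) on x, log(importance) on y.
    return (math.log10(novelty(target_mentions)),
            math.log10(importance(joint_mentions, disease_mentions)))

x, y = plot_coords(target_mentions=10, joint_mentions=5, disease_mentions=500)
# approx (-1.0, -2.0): a moderately novel target of modest importance
```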


Subject(s)
User-Computer Interface , Humans , Natural Language Processing , PubMed , Software
4.
J Cheminform ; 16(1): 76, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956728

ABSTRACT

Materials science is an interdisciplinary field that studies the properties, structures, and behaviors of different materials. A large amount of scientific literature contains rich knowledge in the field of materials science, but manually analyzing these papers to find material-related data is a daunting task. In information processing, named entity recognition (NER) plays a crucial role as it can automatically extract entities in the field of materials science, which have significant value in tasks such as building knowledge graphs. The sequence labeling methods typically used for materials-science NER (MatNER) tasks often fail to fully utilize the semantic information in the dataset and cannot effectively extract nested entities. Herein, we propose converting the sequence labeling task into a machine reading comprehension (MRC) task. The MRC method can effectively address the challenge of extracting multiple overlapping entities by transforming it into answering multiple independent questions. Moreover, the MRC framework allows for a more comprehensive understanding of the contextual information and semantic relationships within materials science literature by integrating prior knowledge from queries. State-of-the-art (SOTA) performance was achieved on the Matscholar, BC4CHEMD, NLMChem, SOFC, and SOFC-Slot datasets, with F1-scores of 89.64%, 94.30%, 85.89%, 85.95%, and 71.73%, respectively, with the MRC approach. By effectively utilizing semantic information and extracting nested entities, this approach holds great significance for knowledge extraction and data analysis in the field of materials science, thus accelerating the development of the field. Scientific contribution: We have developed an innovative NER method that enhances the efficiency and accuracy of automatic entity extraction in the field of materials science by transforming the sequence labeling task into an MRC task; this approach provides robust support for constructing knowledge graphs and other data analysis tasks.
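The recasting of NER as MRC can be illustrated with a toy example. The "reader" below is a simple gazetteer lookup standing in for a trained MRC model, and the entity types, queries, and lexicon are hypothetical; the point is that each type is queried independently, so nested or overlapping spans pose no conflict.

```python
# Toy illustration (not the paper's model) of recasting NER as machine
# reading comprehension: one independent query per entity type.

QUERIES = {
    "MATERIAL": "Which materials are mentioned?",
    "PROPERTY": "Which material properties are mentioned?",
}
# Stand-in "reader": a gazetteer lookup in place of a trained MRC model.
LEXICON = {
    "MATERIAL": ["lithium cobalt oxide", "cobalt"],
    "PROPERTY": ["conductivity"],
}

def answer_query(text, entity_type):
    # "Answer" the type's query by returning (span, start, end) matches.
    spans = []
    for term in LEXICON[entity_type]:
        start = text.find(term)
        if start != -1:
            spans.append((term, start, start + len(term)))
    return spans

text = "lithium cobalt oxide shows high conductivity"
results = {etype: answer_query(text, etype) for etype in QUERIES}
# "cobalt" nests inside "lithium cobalt oxide", yet both are extracted,
# because the MATERIAL query is answered independently per lexicon entry.
```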

5.
Data Brief ; 55: 110628, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39006354

ABSTRACT

Climate security refers to the risks posed by climate change to nations, societies, and individuals, including the possibility of conflicts. As an emerging field of research and public debate, where conceptual definitions are not yet fully agreed upon, gaining insight into global discussions on climate security enables systematizing its various interpretations and framings, mapping thematic priorities, and understanding information gaps that need to be filled. Considering Twitter an important digital forum for information exchange and dialogue, the dataset was created through a query strategy based on a snowball scraping technique, which collected tweets containing hashtags related to climate security between January 2014 and May 2023. The dataset comprises 636,379 tweets. Content analysis was performed using text mining and network analysis techniques to generate additional data on sentiment, countries mentioned in the body of tweets, and hashtag co-occurrences. With almost 10 years of data, the utility of this dataset lies in the ability to assess the discursive evolution of a particular topic since its inception.
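The snowball idea behind the query strategy can be sketched briefly: start from seed hashtags, collect matching tweets, adopt co-occurring hashtags as new seeds, and repeat until no new hashtags appear. The corpus and seed below are hypothetical, not the study's data.

```python
# Minimal sketch of snowball hashtag expansion over a (hypothetical) corpus.

def snowball(tweets, seeds, rounds=5):
    known = set(seeds)
    for _ in range(rounds):
        new = set()
        for tags in tweets:
            if known & set(tags):            # tweet matches the current query
                new |= set(tags) - known     # its other hashtags become seeds
        if not new:
            break                            # converged: nothing new to add
        known |= new
    return known

tweets = [
    ["#climatesecurity", "#climatechange"],
    ["#climatechange", "#conflict"],
    ["#unrelated"],
]
expanded = snowball(tweets, {"#climatesecurity"})
# The seed reaches "#conflict" in two hops; "#unrelated" never joins.
```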

6.
Syst Rev ; 13(1): 174, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38978132

ABSTRACT

BACKGROUND: The demand for high-quality systematic literature reviews (SRs) for evidence-based medical decision-making is growing. SRs are costly and require the scarce resource of highly skilled reviewers. Automation technology has been proposed to save workload and expedite the SR workflow. We aimed to provide a comprehensive overview of SR automation studies indexed in PubMed, focusing on the applicability of these technologies in real-world practice. METHODS: In November 2022, we extracted, combined, and ran an integrated PubMed search for SRs on SR automation. Full-text English peer-reviewed articles were included if they reported studies on SR automation methods (SSAM) or automated SRs (ASR). Bibliographic analyses and knowledge-discovery studies were excluded. Record screening was performed by single reviewers, and the selection of full-text papers was performed in duplicate. We summarized the publication details, automated review stages, automation goals, applied tools, data sources, methods, results, and Google Scholar citations of SR automation studies. RESULTS: From 5321 records screened by title and abstract, we included 123 full-text articles, of which 108 were SSAM and 15 were ASR. Automation was applied for search (19/123, 15.4%), record screening (89/123, 72.4%), full-text selection (6/123, 4.9%), data extraction (13/123, 10.6%), risk of bias assessment (9/123, 7.3%), evidence synthesis (2/123, 1.6%), assessment of evidence quality (2/123, 1.6%), and reporting (2/123, 1.6%). Multiple SR stages were automated in 11 (8.9%) studies. The performance of automated record screening varied largely across SR topics. In published ASR, we found examples of automated search, record screening, full-text selection, and data extraction. In some ASRs, automation fully complemented manual reviews to increase sensitivity rather than to save workload. Reporting of automation details was often incomplete in ASRs.
CONCLUSIONS: Automation techniques are being developed for all SR stages, but with limited real-world adoption. Most SR automation tools target single SR stages, with modest time savings for the entire SR process and varying sensitivity and specificity across studies. Therefore, the real-world benefits of SR automation remain uncertain. Standardizing the terminology, reporting, and metrics of study reports could enhance the adoption of SR automation techniques in real-world practice.


Subject(s)
Automation , PubMed , Systematic Reviews as Topic , Humans
7.
Comput Biol Med ; 179: 108830, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38991321

ABSTRACT

Undiagnosed and untreated human immunodeficiency virus (HIV) infection increases morbidity in the HIV-positive person and allows onward transmission of the virus. Minimizing missed opportunities for HIV diagnosis when a patient visits a healthcare facility is essential in restraining the epidemic and working toward its eventual elimination. Most state-of-the-art proposals employ machine learning (ML) methods and structured data to enhance HIV diagnoses; however, there is a dearth of recent proposals utilizing unstructured textual data from Electronic Health Records (EHRs). In this work, we propose to use only the unstructured text of clinical notes as evidence for classifying patients as suspected or not suspected of HIV infection. For this purpose, we first compile a dataset of real clinical notes from a hospital, with patients classified as suspects or non-suspects of having HIV. Then, we evaluate the effectiveness of two types of classification models for identifying patients suspected of being infected with the virus: classical ML algorithms and two Large Language Models (LLMs) from the biomedical domain in Spanish. The results show that both LLMs outperform classical ML algorithms in the two settings we explore: one dataset version is balanced, containing an equal number of suspicious and non-suspicious patients, while the other reflects the real, unbalanced distribution of patients in the hospital. We obtain F1 scores of 94.7 with both LLMs in the unbalanced setting, while in the balanced one, the RoBERTaBio model outperforms the other with an F1 score of 95.7. The findings indicate that leveraging unstructured text with LLMs in the biomedical domain yields promising outcomes in diminishing missed opportunities for HIV diagnosis. A tool based on our system could assist a doctor in deciding whether a patient in consultation should undergo a serological test.

8.
Microorganisms ; 12(6), 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38930619

ABSTRACT

Bacterial endocarditis (BE) is a severe infection of the endocardium and cardiac valves caused by bacterial agents in dogs. Diagnosis of endocarditis is challenging due to the variety of clinical presentations and the lack of definitive diagnostic tests in its early stages. This study aims to provide an analysis of the research literature on BE in dogs based on text mining (TM) and topic analysis (TA), identifying dominant topics, summarizing their temporal trends, and highlighting possible research gaps. A literature search was performed utilizing the Scopus® database, employing keywords pertaining to BE, to analyze papers published in English from 1990 to 2023. The investigation followed a systematic approach based on the PRISMA guidelines. A total of 86 records were selected for analysis following screening procedures and underwent descriptive statistics, TM, and TA. The findings revealed that the number of records published per year increased in 2007 and 2021. TM identified the words with the highest term frequency-inverse document frequency (TF-IDF), and TA highlighted the main research areas, in the following order: causative agents, clinical findings and predisposing factors, case reports on endocarditis, outcomes and biomarkers, and infective endocarditis and bacterial isolation. The study confirms the increasing interest in BE and shows where further studies are needed.

9.
Fukushima J Med Sci ; 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38925959

ABSTRACT

BACKGROUND: We previously reported the impact of general practice/family medicine training on postgraduate training in Japan using nationally standardized evaluation criteria. However, new insights may be gained by analyzing the reflective reports written by these residents. METHODS: Junior residents who participated in one-month general practice/family medicine training at one of five medical institutions with full-time family medicine specialists between 2019 and 2022 were enrolled in this study. They were assigned to submit a reflective report on their experiences and thoughts every day during the training. We analyzed these reflective writings using text mining and created a co-occurrence network map to visualize the relationships among the most frequently used words. RESULTS: Ninety junior residents participated in the study. The words that appeared most frequently in sentences referring to clinical ability included "symptoms," "medical examination," "consultation," "treatment," and "examination." The words "family" and "(patient) oneself" showed a strong association in the co-occurrence network map. CONCLUSION: The findings suggest that general practice/family medicine training greatly contributes to the acquisition of clinical abilities and deepens junior residents' learning not only about patient care but also about family-oriented care.

10.
J Anim Sci ; 102, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38850056

ABSTRACT

Automated Milking Systems (AMS) have undergone significant evolution over the past 30 yr, and their adoption continues to increase, as evidenced by the growing scientific literature. These systems offer advantages such as a reduced milking workload and increased milk yield per cow. However, given concerns about the welfare of farmed animals, studying the effects of AMS on the health and welfare of animals becomes crucial for the overall sustainability of the dairy sector. In recent years, analyses conducted through text mining (TM) and topic analysis (TA) approaches have become increasingly widespread in the livestock sector. The aim of the study was to produce a comprehensive analysis of the scientific literature on the impact of AMS on dairy cow health, welfare, and behavior using TM and TA approaches. After a preprocessing phase, a dataset of 427 documents was analyzed. The abstracts of the selected papers were analyzed by TM and TA using R software, version 4.3.1. A Term Frequency-Inverse Document Frequency (TF-IDF) technique was used to assign a relative weight to each term. According to the results of the TM, the ten most important terms, both words and roots, were feed, farm, teat, concentr, mastiti, group, SCC (somatic cell count), herd, lame, and pasture. These ten terms showed TF-IDF values greater than 3.5, ranging from 5.43 for feed down to 3.66 for pasture. Eight topics were selected with TA, namely: 1) Cow traffic and time budget, 2) Farm management, 3) Udder health, 4) Comparison with conventional milking, 5) Milk production, 6) Analysis of AMS data, 7) Disease detection, and 8) Feeding management. Over the years, the focus of documents has shifted from cow traffic, udder health, and cow feeding to the analysis of data recorded by the robot to monitor animal conditions and welfare and promptly identify the onset of stress or diseases.
The analysis reveals the complex nature of the relationship between AMS and animal welfare, health, and behavior: on one hand, the robot offers interesting opportunities to safeguard animal welfare and health, especially through the possibility of early identification of anomalous conditions using sensors and data; on the other hand, it poses potential risks, which require further investigation. TM offers an alternative approach to information retrieval in livestock science, especially when dealing with a substantial volume of documents.
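The TF-IDF weighting used to rank terms across abstracts can be sketched from first principles. This is a standard formulation, not the authors' exact preprocessing, and the toy documents below are hypothetical.

```python
import math

# Plain TF-IDF: terms frequent in one document but rare across the
# corpus score highest; terms present in every document score zero.

def tfidf(docs):
    n = len(docs)
    df = {}                                   # document frequency per term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    scores = []
    for doc in docs:
        weights = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)   # term frequency in this doc
            idf = math.log(n / df[term])      # inverse document frequency
            weights[term] = tf * idf
        scores.append(weights)
    return scores

docs = [["feed", "cow", "feed"], ["teat", "cow"], ["mastitis", "cow"]]
weights = tfidf(docs)
# "cow" appears in every abstract, so its weight collapses to zero,
# while "feed" dominates the first document.
```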


Milking robots have revolutionized cow milking, reducing dependence on human labor and increasing milk yield per cow. However, addressing concerns about farmed animal welfare and overall sustainability is crucial. This paper presents a text-mining analysis of the scientific literature to explore the effects of robotic milking on cow health, welfare, and behavior. The analysis revealed a growing body of research on these subjects, highlighting the complex relationship between automated milking and cow welfare, health, and behavior. Robotic milking has the potential to enhance animal health and living conditions, but the associated risks require further investigation.


Subject(s)
Animal Welfare , Dairying , Data Mining , Animals , Cattle/physiology , Dairying/methods , Female , Behavior, Animal/physiology
11.
Heliyon ; 10(11): e31626, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38841475

ABSTRACT

Understanding public emotion on social media about community wellness is crucial for enhancing health awareness and guiding policy-making. To more fully mine the deep contextual semantic information of short texts and further enhance the effectiveness of emotion prediction in social media, we propose the Deep Parallel Contextual Analysis Framework (DPCAF) for the community wellness domain, specifically addressing the challenges of limited text length and sparse semantic features in social media text. Specifically, at the embedding layer, we first utilize two different word embedding techniques to generate high-quality vector representations, aiming to achieve more comprehensive semantic capture, stronger generalization ability, and more robust model performance. Subsequently, in the deep contextual layer, the obtained representations are fused with part-of-speech (POS) and locational representations and processed through a deep parallel layer composed of Convolutional Neural Networks and a Bidirectional Long Short-Term Memory Network. An attention model is then used to further extract semantic features of social media texts. Finally, these deep parallel contextual representations are post-integrated for emotion prediction. Experiments on a dataset collected from social media regarding community wellness demonstrate that, compared to benchmark models, DPCAF achieves at least a 4.81% increase in Precision, a 3.44% increase in Recall, and a 10.81% increase in F1-score. Relative to the most advanced models, DPCAF shows a minimum improvement of 2.65% in Precision, 3.02% in Recall, and 2.53% in F1-score.
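The embedding-layer idea of combining two word-embedding techniques can be sketched as simple per-token concatenation. The toy vectors and vocabulary below are hypothetical, and this illustrates only the fusion step, not DPCAF itself.

```python
# Sketch of dual-embedding fusion: each token carries the concatenation
# of two independent vector representations downstream.

EMB_A = {"clinic": [0.1, 0.2], "care": [0.3, 0.1]}   # e.g. one static embedding
EMB_B = {"clinic": [0.9], "care": [0.4]}             # e.g. a second embedding source

def fuse(tokens):
    # Concatenate the two representations per token; unseen words get zeros.
    return [EMB_A.get(t, [0.0, 0.0]) + EMB_B.get(t, [0.0]) for t in tokens]

vectors = fuse(["clinic", "care"])
# Each fused vector has dimension 2 + 1 = 3: [[0.1, 0.2, 0.9], [0.3, 0.1, 0.4]]
```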

12.
Math Biosci Eng ; 21(4): 5411-5429, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38872541

ABSTRACT

Currently, with the rapid growth of online media, more people are obtaining information from it. However, traditional hotspot mining algorithms cannot achieve precise and fast tracking of hot topics. Aiming at the problem of poor accuracy and timeliness in current news media hotspot mining methods, this paper proposes a hotspot mining method based on the co-occurrence word model. First, a new co-occurrence word model based on word weight is proposed. Then, for key phrase extraction, a hotspot mining algorithm based on the co-occurrence word model and an improved smooth inverse frequency rank (SIFRANK) is designed. Finally, the Spark computing framework is introduced to improve computing efficiency. The experimental results show that the new word discovery algorithm discovered 16,871 and 17,921 new words in the Weibo Short News and Weibo Short Text datasets, respectively. The heat weight values of the keywords obtained by the improved SIFRANK reach 0.9356, 0.9991, and 0.6117. In the Covid19 Tweets dataset, the accuracy is 0.6223, the recall is 0.7015, and the F1 value is 0.6605. In the President-elects Tweets dataset, the accuracy is 0.6418, the recall is 0.7162, and the F1 value is 0.6767. After applying the Spark computing framework, the running speed improved significantly. The proposed news media hotspot mining method based on the co-occurrence word model improves the accuracy and efficiency of mining hot topics and has great practical significance.
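A weighted co-occurrence word model in the spirit of the one described can be sketched as follows. The headlines and per-word weights are hypothetical, and this is an illustration of the counting idea only, not the paper's algorithm.

```python
from collections import Counter
from itertools import combinations

# Sketch of a word-weighted co-occurrence model: word pairs appearing in
# the same headline are counted, scaled by the product of their weights.

def cooccurrence(headlines, word_weight):
    pairs = Counter()
    for words in headlines:
        for a, b in combinations(sorted(set(words)), 2):
            pairs[(a, b)] += word_weight.get(a, 1.0) * word_weight.get(b, 1.0)
    return pairs

headlines = [["election", "results"], ["election", "results"], ["weather", "results"]]
weights = {"election": 2.0}           # hypothetical heat weight for one word
pairs = cooccurrence(headlines, weights)
# ("election", "results") dominates through both count and word weight.
```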

13.
Health Informatics J ; 30(2): 14604582241260644, 2024.
Article in English | MEDLINE | ID: mdl-38873836

ABSTRACT

The use of telemedicine and telehealth has rapidly increased since the start of the COVID-19 pandemic; however, it could lead to unnecessary medical services. This study analyzes the contents of telemedicine apps (applications) in South Korea to investigate the use of telemedicine for selective or unnecessary medical treatments and the presence of advertising for hospitals. This study analyzed 49 telemedicine mobile apps in Korea; a content analysis of the apps' features and quality was conducted using a Mobile Application Rating Scale. The study found that 65.3% of the apps provide immediate telemedicine service without reservations, with an average rating of 4.35. While 87% of the apps offered selective care, the overall quality of the apps was low, with an average total quality score of 3.27. 73.9% of the apps were able to provide selective care for alopecia or morning-after pill prescription, 65.2% for weight loss, and 52.2% for erectile dysfunction, with the potential to encourage medical inducement or abuse. Therefore, before introducing telemedicine, establishing detailed policies on the methods and scope of telemedicine would help prevent its abuse.


Subject(s)
COVID-19 , Mobile Applications , Telemedicine , Humans , Republic of Korea , Telemedicine/statistics & numerical data , COVID-19/epidemiology , Mobile Applications/standards , Mobile Applications/trends , Mobile Applications/statistics & numerical data , SARS-CoV-2 , Pandemics
14.
J Safety Res ; 89: 91-104, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38858066

ABSTRACT

INTRODUCTION: Workplace accidents in the petroleum industry can cause catastrophic damage to people, property, and the environment. Earlier studies in this domain indicate that the majority of accident report information is available in unstructured text format. Conventional techniques for the analysis of accident data are time-consuming and heavily dependent on experts' subject knowledge, experience, and judgment. There is a need to develop a machine learning-based decision support system to analyze the vast amounts of unstructured text data that are frequently overlooked due to a lack of appropriate methodology. METHOD: To address this gap in the literature, we propose a hybrid methodology that uses improved text-mining techniques within an unbiased group decision-making framework, merging the objective weights (based on text mining) and subjective weights (based on expert opinion) of risk factors in order to prioritize them. Based on contextual word embedding models and term frequencies, we extracted five important clusters of risk factors comprising more than 32 risk sub-factors. A heterogeneous group of experts and employees in the petroleum industry were contacted to obtain their opinions on the extracted risk factors, and the best-worst method was used to convert their opinions to weights. CONCLUSIONS AND PRACTICAL APPLICATIONS: The applicability of our proposed framework was tested on data compiled from accident reports released by the petroleum industries in India. Our framework can be extended to accident data from any industry to reduce analysis time and improve accuracy in classifying and prioritizing risk factors.
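The step of merging objective and subjective weights can be sketched as a convex blend followed by renormalization. The mixing coefficient, risk factors, and weight values below are hypothetical; the abstract does not specify how the two weight sets are combined.

```python
# Hedged sketch of hybrid risk-factor weighting: blend text-mining
# (objective) and expert (subjective) weights, then renormalize so the
# result remains a valid priority distribution.

def combine_weights(objective, subjective, alpha=0.5):
    blended = {f: alpha * objective[f] + (1 - alpha) * subjective[f]
               for f in objective}
    total = sum(blended.values())
    return {f: w / total for f, w in blended.items()}

objective = {"equipment": 0.5, "human_error": 0.3, "process": 0.2}   # hypothetical
subjective = {"equipment": 0.2, "human_error": 0.5, "process": 0.3}  # hypothetical
final = combine_weights(objective, subjective)
# The blended weights still sum to 1 and can be ranked for prioritization.
```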


Subject(s)
Accidents, Occupational , Data Mining , Risk Management , Humans , Accidents, Occupational/prevention & control , Risk Management/methods , Data Mining/methods , India , Consensus , Risk Factors , Oil and Gas Industry , Machine Learning , Decision Support Techniques
15.
Comput Biol Med ; 178: 108721, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38901188

ABSTRACT

Since the 2000s, digitalization has been a crucial transformation in our lives. Nevertheless, digitalization brings a bulk of unstructured textual data to be processed, including articles, clinical records, web pages, and shared social media posts. A critical analysis step, the classification task assigns the given textual entities to the correct categories. Categorizing documents from different domains is straightforward since the instances are unlikely to share similar contexts. However, document classification within a single domain is more complicated because documents share the same context. Thus, we aim to classify medical articles about four common cancer types (Leukemia, Non-Hodgkin Lymphoma, Bladder Cancer, and Thyroid Cancer) by constructing machine learning and deep learning models. We used 383,914 medical articles about the four cancer types, collected via the PubMed API. To build classification models, we split the dataset into 70% training, 20% testing, and 10% validation. We built widely used machine-learning models (Logistic Regression, XGBoost, CatBoost, and Random Forest classifiers) and modern deep-learning models (convolutional neural networks - CNN, long short-term memory - LSTM, and gated recurrent units - GRU). We computed the average classification performances (precision, recall, F-score) to evaluate the models over ten distinct dataset splits. The best-performing deep learning model(s) yielded a superior F1 score of 98%, but traditional machine learning models also achieved reasonably high F1 scores, 95% in the worst-performing case. Ultimately, we constructed multiple models to classify articles that compose a hard-to-classify dataset in the medical domain.
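The 70/20/10 split described above can be sketched in a few lines. The seed and data are illustrative, not the authors' exact pipeline.

```python
import random

# Sketch of a reproducible 70% train / 20% test / 10% validation split.

def split_dataset(items, train=0.7, test=0.2, seed=42):
    items = list(items)
    random.Random(seed).shuffle(items)        # seeded for reproducibility
    n = len(items)
    n_train, n_test = int(n * train), int(n * test)
    return (items[:n_train],                  # 70% training
            items[n_train:n_train + n_test],  # 20% testing
            items[n_train + n_test:])         # remaining 10% validation

train_set, test_set, val_set = split_dataset(range(100))
# lengths: 70, 20, 10 -- every item lands in exactly one partition
```

Repeating this with ten different seeds gives the ten distinct splits over which the study averages its precision, recall, and F-score.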

16.
Tob Induc Dis ; 22, 2024.
Article in English | MEDLINE | ID: mdl-38887599

ABSTRACT

Tobacco consumption in China remains the primary cause of preventable mortality, with Shanghai being particularly affected by issues related to secondhand smoke exposure. This study explores the role of the public service hotline 12345, a grassroots initiative in Shanghai, in capturing public sentiment and assessing the effectiveness of anti-smoking regulations. Our research aims to understand, accurately and in depth, the implementation of and feedback on smoking control policies by identifying high-frequency points and prominent issues in smoking control work, based on the smoking control work order data received by the health hotline 12320. The results of this study will assist government enforcement agencies in improving smoking monitoring and clarify the direction for improving smoking control measures. Text-mining techniques were employed to analyze a dataset comprising 78,011 call sheets, all related to tobacco control and collected from the hotline between 1 January 2015 and 31 December 2019. This methodological approach aims to uncover prevalent themes and sentiments in the public discourse on smoking and its regulation, as reflected in the hotline interactions. Our study identified hotspots and the issues of greatest concern to citizens. Additionally, it provided recommendations to enforcement agencies to enhance their capabilities, optimize the allocation of human resources for smoking control monitoring, reduce enforcement costs, and support anti-smoking campaigns, thereby contributing to more effective tobacco control policies in the region.

17.
JMIR Form Res ; 8: e48520, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38848120

ABSTRACT

BACKGROUND: Current evidence reveals a growing pattern of hypertension among young adults, significantly increasing their risk for cardiovascular disease later in life. Young adults, particularly those of college age, often develop risk factors related to lifestyle choices in diet, exercise, and alcohol consumption. Developing interventions that can assist with screening and possible behavioral modifications, and that are suitable and appealing to college-aged young adults, could help with early identification and intervention for hypertension. Recent studies indicate mobile health (mHealth) apps are acceptable and effective for communication and message delivery among this population. OBJECTIVE: The purpose of this study was to examine the feasibility of using a mobile smartphone delivery system that provides tailored messages based on participants' self-measured blood pressure (BP) with college-aged young adults. METHODS: Using a single-arm, pilot intervention study design, the mHealth to Optimize BP Improvement (MOBILE) intervention was implemented with college students aged 18 to 39 years who had systolic BP >120 mm Hg and diastolic BP ≥80 mm Hg. Participants were required to measure their BP daily for 28 days, submit the readings to the app, and receive preset educational text messages tailored to their BP value and related to encouraging healthy lifestyle modifications. Changes in participants' BP were evaluated using a mixed regression model, and a postintervention survey evaluated their perspectives on the mHealth intervention. RESULTS: The participants' (N=9) mean age was 22.64 (SD 4.54) years; 56% (5/9) were overweight, and 11% (1/9) were obese. The average daily participation rate was 86%.
Of the 9 participants, 8 completed the survey, and all indicated the intervention was easy to use, found it increased awareness of their individual BP levels, indicated the text messages were helpful, and reported making lifestyle changes based on the study intervention. They also provided suggestions for future implementation of the intervention and program. Overall, no significant changes were noted in BP over the 28 days. CONCLUSIONS: The mHealth-supported MOBILE intervention for BP monitoring and tailored text messaging was feasible to implement, as our study indicated high rates of participation and acceptability. These encouraging findings support further development and testing in a larger sample over a longer time frame and hold the potential for early identification and intervention among college-aged adults, filling a gap in current research.

18.
Gigascience ; 132024 Jan 02.
Article in English | MEDLINE | ID: mdl-38832465

ABSTRACT

BACKGROUND: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources. RESULTS: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs. CONCLUSIONS: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.
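The core cross-referencing step that any post-GWAS tool performs — intersecting SNP positions from a GWAS peak with annotated gene intervals — can be sketched in a few lines. This is a minimal illustration, not RicePilaf's implementation: RicePilaf layers pangenomes, coexpression networks, ontologies, and text mining on top of this step, and the gene coordinates below are invented.

```python
# Minimal sketch of SNP-to-gene cross-referencing: given SNP positions from
# a GWAS peak and gene intervals from an annotation, report genes whose
# (optionally padded) interval contains a SNP. Coordinates are illustrative.

from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int   # 1-based inclusive start
    end: int     # 1-based inclusive end

def genes_overlapping_snps(genes, snps, window=0):
    """Return sorted gene names whose interval, padded by `window` bp on
    each side, contains at least one SNP. `snps` is (chrom, pos) pairs."""
    hits = set()
    for chrom, pos in snps:
        for g in genes:
            if g.chrom == chrom and g.start - window <= pos <= g.end + window:
                hits.add(g.name)
    return sorted(hits)
```

Widening `window` mimics the common practice of including genes near, not just under, a peak SNP, since causal variants often sit in nearby regulatory regions.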


Subject(s)
Data Mining , Genome-Wide Association Study , Oryza , Quantitative Trait Loci , Oryza/genetics , Software , Epigenomics/methods , Computational Biology/methods , Polymorphism, Single Nucleotide , Genomics/methods , Genome, Plant , Chromosome Mapping , Databases, Genetic
19.
J Am Med Inform Assoc ; 31(8): 1725-1734, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38934643

ABSTRACT

OBJECTIVE: To explore the feasibility of validating Dutch concept extraction tools using annotated corpora translated from English, focusing on preserving annotations during translation and addressing the scarcity of non-English annotated clinical corpora. MATERIALS AND METHODS: Three annotated corpora were standardized and translated from English to Dutch using 2 machine translation services, Google Translate and OpenAI GPT-4, with annotations preserved through a proposed method of embedding annotations in the text before translation. The performance of 2 concept extraction tools, MedSpaCy and MedCAT, was assessed across the corpora in both Dutch and English. RESULTS: The translation process effectively generated Dutch annotated corpora and the concept extraction tools performed similarly in both English and Dutch. Although there were some differences in how annotations were preserved across translations, these did not affect extraction accuracy. Supervised MedCAT models consistently outperformed unsupervised models, whereas MedSpaCy demonstrated high recall but lower precision. DISCUSSION: Our validation of Dutch concept extraction tools on corpora translated from English was successful, highlighting the efficacy of our annotation preservation method and the potential for efficiently creating multilingual corpora. Further improvements and comparisons of annotation preservation techniques and strategies for corpus synthesis could lead to more efficient development of multilingual corpora and accurate non-English concept extraction tools. CONCLUSION: This study has demonstrated that translated English corpora can be used to validate non-English concept extraction tools. The annotation preservation method used during translation proved effective, and future research can apply this corpus translation method to additional languages and clinical settings.
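The annotation preservation idea — embedding annotations in the text before translation so that labeled spans survive machine translation — can be sketched with inline markers. The marker syntax below is hypothetical; the paper's exact embedding scheme may differ, and real MT output can still mangle or reorder markers, which is why the study compares preservation across translation services.

```python
# Rough sketch of annotation preservation through machine translation:
# wrap each annotated span in inline markers before sending the text to MT,
# then recover the labeled spans from the translated output. The [[label]]
# marker syntax is a hypothetical choice for illustration.

import re

def embed(text, spans):
    """Wrap annotated spans in markers.
    spans: list of (start, end, label), non-overlapping, sorted by start."""
    out, prev = [], 0
    for start, end, label in spans:
        out.append(text[prev:start])
        out.append(f"[[{label}]]{text[start:end]}[[/{label}]]")
        prev = end
    out.append(text[prev:])
    return "".join(out)

def extract(translated):
    """Recover (span_text, label) pairs and the marker-free text
    from translated output that kept the markers intact."""
    pattern = re.compile(r"\[\[(\w+)\]\](.*?)\[\[/\1\]\]")
    spans = [(m.group(2), m.group(1)) for m in pattern.finditer(translated)]
    clean = pattern.sub(lambda m: m.group(2), translated)
    return clean, spans
```

After translation, `extract` yields both the clean Dutch text and the annotated spans at their new positions, which is exactly what a concept extraction tool needs as gold-standard data.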


Subject(s)
Translating , Netherlands , Natural Language Processing , Humans , Language , Data Mining/methods
20.
Res Nurs Health ; 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824392

ABSTRACT

The coronavirus disease (COVID-19) pandemic has negatively affected research activities across various fields. This study aimed to determine nursing researchers' concerns about research activities during the COVID-19 pandemic in Japan and subsequent changes brought on by it. For this study, we conducted descriptive statistics and text mining analyses using data from two surveys conducted by the Japan Academy of Nursing Science (JANS) in the early days of the pandemic (first survey: mid-2020) and after 2 years (second survey: early 2022). Concerns about research activities were observed in 89% and 80% of the nursing researchers in the first and second surveys, respectively. Furthermore, concerns about "Difficulty in collecting research data" and "Content and quality of your research" were stronger in the second survey. Text mining analyses revealed that in the first survey, they were concerned about environmental changes and restrictions when proceeding with research during the COVID-19 pandemic, which was unfamiliar at the time. In the second survey, after overcoming environmental changes in the early stages of the pandemic, nursing researchers' concerns shifted to anxiety about the future, such as concerns about degree acquisition, employment and career advancement, and research results. The current study highlights various concerns among nursing researchers regarding research activities that have evolved over time during the pandemic. Academic societies must flexibly construct support measures for nursing researchers when a new infectious disease occurs. Such measures should be sensitive to the prevailing social circumstances and the evolving needs of researchers.
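The kind of two-wave text-mining comparison described above can be illustrated with simple term frequencies over free-text responses, highlighting which concerns gained or lost prominence between surveys. This is a toy sketch: real analysis of Japanese responses would require morphological tokenization, and the English responses below are invented.

```python
# Toy sketch of a two-wave text-mining comparison: count terms in free-text
# survey responses from each wave and report which terms grew or shrank.
# Example responses are invented; Japanese text would need a proper tokenizer.

from collections import Counter
import re

def term_freq(responses):
    """Naive whitespace/alpha tokenization and counting."""
    tokens = []
    for r in responses:
        tokens += re.findall(r"[a-z]+", r.lower())
    return Counter(tokens)

def freq_shift(wave1, wave2):
    """Terms whose counts changed between surveys (positive = more frequent
    in the second wave)."""
    c1, c2 = term_freq(wave1), term_freq(wave2)
    return {t: c2[t] - c1[t] for t in set(c1) | set(c2) if c2[t] != c1[t]}
```

In practice such shifts are what reveal the pattern the study reports: early-pandemic terms about data collection fading while career- and employment-related terms rise.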
