Results 1 - 20 of 151
1.
JMIR Res Protoc ; 13: e52411, 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39383523

ABSTRACT

BACKGROUND: Botswana has made significant investments in its health care information infrastructure, including vertical programs for child health and nutrition, HIV care, and tuberculosis. However, effectively integrating the more than 18 systems in place for data collection and reporting has proved to be challenging. The Botswana Health Data Collaborative Roadmap Strategy (2020-24) states that "there exists parallel reporting systems and data is not integrated into the mainstream reports at the national level," seconded by the Botswana National eLearning strategy (2020), which states that "there is inadequate information flow at all levels, proliferation of systems, reporting tools are not synthesized; hence too many systems are not communicating." OBJECTIVE: The objectives of this study are to (1) create a visual representation of how data are processed and the inputs and outputs through each health care system level; (2) understand how frontline workers perceive health care data sharing across existing platforms and the impact of data on health care service delivery. METHODS: The setting included a varied range of 30 health care facilities across Botswana, aiming to capture insights from multiple perspectives into data flow and system integration challenges. The study design combined qualitative and quantitative methodologies, informed by the rapid assessment process and the technology assessment model for resource limited settings. The study used a participatory research approach to ensure comprehensive stakeholder engagement from its inception. Survey instruments were designed to capture the intricacies of data processing, sharing, and integration among health care workers. A purposive sampling strategy was used to ensure a wide representation of participants across different health care roles and settings. Data collection used both digital surveys and in-depth interviews. 
Preliminary themes for analysis include perceptions of the value of health care data and experiences in data collection and sharing. Ethical approvals were comprehensively obtained, reflecting the commitment to uphold research integrity and participant welfare throughout the study. RESULTS: The study recruited 44 health care facilities of various types. Of the 44 recruited facilities, 27 responded to the surveys and participated in the interviews. A total of 75% (112/150) of health care professionals participating came from clinics, 20% (30/150) from hospitals, and 5% (8/150) from health posts and mobile clinics. As of October 10, 2023, the study had collected over 200 quantitative surveys and conducted 90 semistructured interviews. CONCLUSIONS: This study has so far shown enthusiastic engagement from the health care community, underscoring the relevance and necessity of this study's objectives. We believe the methodology, centered around extensive community engagement, is pivotal in capturing a nuanced understanding of the health care data ecosystem. The focus will now shift to the analysis phase of the study, with the aim of developing comprehensive recommendations for improving data flow within Botswana's health care system. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/52411.


Subject(s)
Delivery of Health Care , Botswana , Humans
2.
Article in English | MEDLINE | ID: mdl-39235405

ABSTRACT

Objectives: Randomized controlled trials (RCTs) have shown that attention-deficit/hyperactivity disorder (ADHD) medications significantly reduce symptomatology at a group level, but individual response to ADHD medication is variable. Thus, developing prediction models to stratify treatment according to individual baseline clinicodemographic characteristics is crucial to support clinical practice. A potential valuable source of data to develop accurate prediction models is real-world clinical data extracted from electronic healthcare records (EHRs). Yet, systematic information regarding EHR data on ADHD is lacking. Methods: We conducted a comprehensive review of studies that included EHR reporting data regarding individuals with ADHD, with a specific focus on treatment-related data. Relevant studies were identified from PubMed, Ovid, and Web of Science databases up to February 24, 2024. Results: We identified 103 studies reporting EHR data for individuals with ADHD. Among these, 83 studies provided information on the type of prescribed medication. However, dosage, duration of treatment, and ADHD symptom ratings before and after treatment initiation were only reported by a minority of studies. Conclusion: This review supports the potential use of EHRs to develop treatment response prediction models but emphasizes the need for more comprehensive reporting of treatment-related data, such as changes in ADHD symptom ratings and other possible baseline clinical predictors of treatment response.

3.
JMIR Med Inform ; 12: e58977, 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39316418

ABSTRACT

BACKGROUND: Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompasses various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. OBJECTIVE: This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system. METHODS: We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluate the utility of using multiple types of documents by correlation coefficient and regression analysis to compare their performance with each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration. RESULTS: Our system underestimates by 13.3 percentage points (74.0%-60.7%), as the incidence of paclitaxel-induced PN was 60.7%, compared with 74.0% in the previous research based on manual extraction. 
The Pearson correlation coefficient between the manual extraction and system results was 0.87. Although the pharmacist progress notes had the highest detection rate among each type of document, the rate did not match the performance using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration of PN with paclitaxel was 727 days. The number of events detected in each document was highest in the physician's progress notes, followed by the pharmacist's and nursing records. CONCLUSIONS: Considering the inherent cost that requires constant monitoring of the patient's condition, such as the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. Although the onset time estimation was relatively accurate, the duration might have been influenced by the length of the data follow-up period. The results suggest that our method using various types of data can detect more ADEs from clinical documents.


Subject(s)
Electronic Health Records , Natural Language Processing , Humans , Retrospective Studies , Japan , Breast Neoplasms/pathology , Breast Neoplasms/drug therapy , Female , Drug-Related Side Effects and Adverse Reactions/diagnosis , Drug-Related Side Effects and Adverse Reactions/epidemiology , East Asian People
4.
JAMIA Open ; 7(3): ooae084, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39282083

ABSTRACT

Objective: Electronic health records (EHRs) provide opportunities for the development of computable predictive tools. Conventional machine learning methods and deep learning methods have been widely used for this task, with the approach of usually designing one tool for one clinical outcome. Here we developed PheW2P2V, a Phenome-Wide prediction framework using Weighted Patient Vectors. PheW2P2V conducts tailored predictions for phenome-wide phenotypes using numeric representations of patients' past medical records weighted based on their similarities with individual phenotypes. Materials and Methods: PheW2P2V defines clinical disease phenotypes using Phecode mapping based on International Classification of Disease codes, which reduces redundancy and case-control misclassification in real-life EHR datasets. Through upweighting medical records of patients that are more relevant to a phenotype of interest in calculating patient vectors, PheW2P2V achieves tailored incidence risk prediction of a phenotype. The calculation of weighted patient vectors is computationally efficient, and the weighting mechanism ensures tailored predictions across the phenome. We evaluated prediction performance of PheW2P2V and baseline methods with simulation studies and clinical applications using the MIMIC-III database. Results: Across 942 phenome-wide predictions using the MIMIC-III database, PheW2P2V has median area under the receiver operating characteristic curve (AUC-ROC) 0.74 (baseline methods have values ≤0.72), median max F1-score 0.20 (baseline methods have values ≤0.19), and median area under the precision-recall curve (AUC-PR) 0.10 (baseline methods have values ≤0.10). Discussion: PheW2P2V can predict phenotypes efficiently by using medical concept embeddings and upweighting relevant past medical histories. 
By leveraging both labeled and unlabeled data, PheW2P2V reduces overfitting and improves predictions for rare phenotypes, making it a useful screening tool for early diagnosis of high-risk conditions, though further research is needed to assess the transferability of embeddings across different databases. Conclusions: PheW2P2V is fast, flexible, and has superior prediction performance for many clinical disease phenotypes across the phenome of the MIMIC-III database compared to that of several popular baseline methods.
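
The weighting idea described in this abstract can be illustrated with a minimal sketch. It assumes that every medical code has an embedding vector and that a patient vector is an average of the patient's past code embeddings, weighted by cosine similarity to the target phenotype's embedding. The softmax weighting, the toy two-dimensional embeddings, and all function names are illustrative assumptions, not the published PheW2P2V implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def weighted_patient_vector(code_vectors, phenotype_vector):
    """Average a patient's code embeddings, upweighting codes that are
    more similar to the phenotype of interest (softmax over cosine
    similarity; this weighting scheme is an assumption)."""
    sims = [cosine(v, phenotype_vector) for v in code_vectors]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(phenotype_vector)
    return [sum(w * v[i] for w, v in zip(weights, code_vectors))
            for i in range(dim)]

# Toy embeddings (hypothetical): one code aligned with the phenotype, one not.
past_codes = [[1.0, 0.0], [0.0, 1.0]]
phenotype = [1.0, 0.0]
print(weighted_patient_vector(past_codes, phenotype))
```

In this toy case the resulting patient vector leans toward the code that resembles the phenotype, which is the behaviour the abstract calls a tailored prediction input.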

5.
JMIR Med Inform ; 12: e57195, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39255011

ABSTRACT

BACKGROUND: Postoperative infections remain a crucial challenge in health care, resulting in high morbidity, mortality, and costs. Accurate identification and labeling of patients with postoperative bacterial infections is crucial for developing prediction models, validating biomarkers, and implementing surveillance systems in clinical practice. OBJECTIVE: This scoping review aimed to explore methods for identifying patients with postoperative infections using electronic health record (EHR) data to go beyond the reference standard of manual chart review. METHODS: We performed a systematic search strategy across PubMed, Embase, Web of Science (Core Collection), the Cochrane Library, and Emcare (Ovid), targeting studies addressing the prediction and fully automated surveillance (ie, without manual check) of diverse bacterial infections in the postoperative setting. For prediction modeling studies, we assessed the labeling methods used, categorizing them as either manual or automated. We evaluated the different types of EHR data needed for the surveillance and labeling of postoperative infections, as well as the performance of fully automated surveillance systems compared with manual chart review. RESULTS: We identified 75 different methods and definitions used to identify patients with postoperative infections in studies published between 2003 and 2023. Manual labeling was the predominant method in prediction modeling research. Of the identified methods, 65% (49/75) use structured data, and 45% (34/75) use free text and clinical notes as one of their data sources. Fully automated surveillance systems should be used with caution because the reported positive predictive values are between 0.31 and 0.76. CONCLUSIONS: There is currently no evidence to support fully automated labeling and identification of patients with infections based solely on structured EHR data. 
Future research should focus on defining uniform definitions, as well as prioritizing the development of more scalable, automated methods for infection detection using structured EHR data.

6.
JMIR Perioper Med ; 7: e63076, 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39269754

ABSTRACT

BACKGROUND: Preoperative cardiac risk assessment is an integral part of preoperative evaluation; however, there is significant variation among providers, leading to inappropriate referrals for cardiology consultation or excessive low-value cardiac testing. We implemented a novel electronic medical record (EMR) form in our preoperative clinics to decrease variation. OBJECTIVE: This study aimed to investigate the impact of the EMR form on the preoperative utilization of cardiology consultation and cardiac diagnostic testing (echocardiograms, stress tests, and cardiac catheterization) and evaluate postoperative outcomes. METHODS: A retrospective cohort study was conducted. Patients who underwent outpatient preoperative evaluation prior to an elective surgery over 2 years were divided into 2 cohorts: from July 1, 2021, to June 30, 2022 (pre-EMR form implementation), and from July 1, 2022, to June 30, 2023 (post-EMR form implementation). Demographics, comorbidities, resource utilization, and surgical characteristics were analyzed. Propensity score matching was used to adjust for differences between the 2 cohorts. The primary outcomes were the utilization of preoperative cardiology consultation, cardiac testing, and 30-day postoperative major adverse cardiac events (MACE). RESULTS: A total of 25,484 patients met the inclusion criteria. Propensity score matching yielded 11,645 well-matched pairs. The post-EMR form, matched cohort had lower cardiology consultation (pre-EMR form: n=2698, 23.2% vs post-EMR form: n=2088, 17.9%; P<.001) and echocardiogram (pre-EMR form: n=808, 6.9% vs post-EMR form: n=591, 5.1%; P<.001) utilization. There were no significant differences in the 30-day postoperative outcomes, including MACE (all P>.05). While patients with "possible indications" for cardiology consultation had higher MACE rates, the consultations did not reduce MACE risk. Most algorithm end points, except for active cardiac conditions, had MACE rates <1%. 
CONCLUSIONS: In this cohort study, preoperative cardiac risk assessment using a novel EMR form was associated with a significant decrease in cardiology consultation and testing utilization, with no adverse impact on postoperative outcomes. Adopting this approach may assist perioperative medicine clinicians and anesthesiologists in efficiently decreasing unnecessary preoperative resource utilization without compromising patient safety or quality of care.

7.
JMIR Med Inform ; 12: e59858, 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39270211

ABSTRACT

BACKGROUND: Hereditary angioedema (HAE), a rare genetic disease, induces acute attacks of swelling in various regions of the body. Its prevalence is estimated to be 1 in 50,000 people, with no reported bias among different ethnic groups. However, considering the estimated prevalence, the number of patients in Japan diagnosed with HAE remains approximately 1 in 250,000, which means that only 20% of potential HAE cases are identified. OBJECTIVE: This study aimed to develop an artificial intelligence (AI) model that can detect patients with suspected HAE using medical history data (medical claims, prescriptions, and electronic medical records [EMRs]) in the United States. We also aimed to validate the detection performance of the model for HAE cases using the Japanese dataset. METHODS: The HAE patient and control groups were identified using the US claims and EMR datasets. We analyzed the characteristics of the diagnostic history of patients with HAE and developed an AI model to predict the probability of HAE based on a generalized linear model and bootstrap method. The model was then applied to the EMR data of the Kyoto University Hospital to verify its applicability to the Japanese dataset. RESULTS: Precision and sensitivity were measured to validate the model performance. Using the comprehensive US dataset, the precision score was 2% in the initial model development step. Our model can screen out suspected patients, where 1 in 50 of these patients have HAE. In addition, in the validation step with Japanese EMR data, the precision score was 23.6%, which exceeded our expectations. We achieved a sensitivity score of 61.5% for the US dataset and 37.6% for the validation exercise using data from a single Japanese hospital. Overall, our model could predict patients with typical HAE symptoms. CONCLUSIONS: This study indicates that our AI model can detect HAE in patients with typical symptoms and is effective in Japanese data. 
However, further prospective clinical studies are required to investigate whether this model can be used to diagnose HAE.
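
The arithmetic behind two figures in this abstract can be checked with a short sketch: a precision of 2% is the same statement as "1 in 50 flagged patients have HAE", and a diagnosed rate of 1 in 250,000 against an estimated prevalence of 1 in 50,000 gives the 20% identification share. The counts below are illustrative only and are not taken from the study data; the function names are hypothetical.

```python
from fractions import Fraction

def precision(true_pos, false_pos):
    """Share of flagged patients who truly have the condition."""
    return Fraction(true_pos, true_pos + false_pos)

def sensitivity(true_pos, false_neg):
    """Share of true cases that the model flags."""
    return Fraction(true_pos, true_pos + false_neg)

# Illustrative counts: 1 true case among 50 flagged patients.
flagged_with_hae, flagged_without_hae = 1, 49
print(float(precision(flagged_with_hae, flagged_without_hae)))

# Diagnosed rate in Japan versus estimated prevalence, as quoted in the abstract.
diagnosed = Fraction(1, 250_000)
estimated = Fraction(1, 50_000)
identified_share = diagnosed / estimated
print(float(identified_share))
```

Exact fractions are used so that the ratios are not blurred by floating-point rounding.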

8.
BMC Med Inform Decis Mak ; 24(1): 231, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39169338

ABSTRACT

BACKGROUND: Electronic health records (EHRs) are currently gaining popularity in emerging economies because they provide options for exchanging patient data, increasing operational efficiency, and improving patient outcomes. This study examines how service providers at Ghana's Komfo Anokye Teaching Hospital adopt and use an electronic health records (EHRs) system. The emphasis is on identifying factors impacting adoption and the problems that healthcare personnel encounter in efficiently using the EHRs system. METHOD: A quantitative cross-sectional technique was utilised to collect data from 234 trauma and emergency department staff members via standardised questionnaires. The participants were selected using the purposive sampling method. The Pearson Chi-square Test was used to examine the relationship between respondents' acceptability and use of EHRs. RESULTS: The study discovered that a sizable number of respondents (86.8%) embraced and actively used the EHRs system. However, other issues were noted, including insufficient system training and malfunctions (35.9%), power outages (18.8%), privacy concerns (9.4%), and insufficient maintenance (4.7%). The respondents' comfort in using the electronic health record system (χ²=11.30, p=0.001), system dependability (χ²=30.74, p=0.0001), and EHR's ability to reduce patient waiting time (χ²=14.39, p=0.0001) were all strongly associated with their degree of satisfaction with the system. Furthermore, respondents' views that EHRs improve patient care (χ²=75.59, p=0.0001) and income creation (χ²=8.48, p=0.004) were related to the acceptability of the electronic health records system. CONCLUSION: The study revealed that comfort, reliability, and improved care quality all had an impact on the EHRs system's acceptability and utilization. Challenges, including equipment malfunctions and power outages, were found. 
Continuous professional training was emphasized as a means of increasing employee confidence, as was the construction of a power backup system to combat disruptions. Patient data privacy was highlighted. In conclusion, this study highlights the relevance of EHRs system adoption and usability in healthcare. While the benefits are obvious, addressing obstacles through training, technical support, and infrastructure improvements is critical for increasing system effectiveness.


Subject(s)
Electronic Health Records , Emergency Service, Hospital , Hospitals, Teaching , Ghana , Humans , Cross-Sectional Studies , Adult , Female , Male , Middle Aged , Attitude of Health Personnel , Surveys and Questionnaires
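
For readers unfamiliar with the Pearson chi-square test named in the entry above, the following is a minimal sketch of the calculation. The contingency table is hypothetical and not from the study, and the p-value helper assumes 1 degree of freedom (a 2x2 table), which the abstract does not state; under that assumption the reported statistics 11.30 and 8.48 round to the reported p-values of 0.001 and 0.004.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of
    freedom (assumed here; valid for a 2x2 table)."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical 2x2 table: rows = satisfied / not satisfied,
# columns = comfortable / not comfortable with the system.
hypothetical_table = [[10, 20], [20, 10]]
print(round(chi_square_statistic(hypothetical_table), 2))
print(round(p_value_df1(11.30), 3), round(p_value_df1(8.48), 3))
```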
9.
Article in English | MEDLINE | ID: mdl-39127052

ABSTRACT

OBJECTIVES: To address the need for interactive visualization tools and databases in characterizing multimorbidity patterns across different populations, we developed the Phenome-wide Multi-Institutional Multimorbidity Explorer (PheMIME). This tool leverages three large-scale EHR systems to facilitate efficient analysis and visualization of disease multimorbidity, aiming to reveal both robust and novel disease associations that are consistent across different systems and to provide insight for enhancing personalized healthcare strategies. MATERIALS AND METHODS: PheMIME integrates summary statistics from phenome-wide analyses of disease multimorbidities, utilizing data from Vanderbilt University Medical Center, Mass General Brigham, and the UK Biobank. It offers interactive and multifaceted visualizations for exploring multimorbidity. Incorporating an enhanced version of associationSubgraphs, PheMIME also enables dynamic analysis and inference of disease clusters, promoting the discovery of complex multimorbidity patterns. A case study on schizophrenia demonstrates its capability for generating interactive visualizations of multimorbidity networks within and across multiple systems. Additionally, PheMIME supports diverse multimorbidity-based discoveries, detailed further in online case studies. RESULTS: The PheMIME is accessible at https://prod.tbilab.org/PheMIME/. A comprehensive tutorial and multiple case studies for demonstration are available at https://prod.tbilab.org/PheMIME_supplementary_materials/. The source code can be downloaded from https://github.com/tbilab/PheMIME. DISCUSSION: PheMIME represents a significant advancement in medical informatics, offering an efficient solution for accessing, analyzing, and interpreting the complex and noisy real-world patient data in electronic health records. 
CONCLUSION: PheMIME provides an extensive multimorbidity knowledge base that consolidates data from three EHR systems, and it is a novel interactive tool designed to analyze and visualize multimorbidities across multiple EHR datasets. It stands out as the first of its kind to offer extensive multimorbidity knowledge integration with substantial support for efficient online analysis and interactive visualization.

10.
JMIR Med Inform ; 12: e57153, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39158950

ABSTRACT

BACKGROUND: Leveraging electronic health record (EHR) data for clinical or research purposes heavily depends on data fitness. However, there is a lack of standardized frameworks to evaluate EHR data suitability, leading to inconsistent quality in data use projects (DUPs). This research focuses on the Medical Informatics for Research and Care in University Medicine (MIRACUM) Data Integration Centers (DICs) and examines empirical practices on assessing and automating the fitness-for-purpose of clinical data in German DIC settings. OBJECTIVE: The study aims (1) to capture and discuss how MIRACUM DICs evaluate and enhance the fitness-for-purpose of observational health care data and examine the alignment with existing recommendations and (2) to identify the requirements for designing and implementing a computer-assisted solution to evaluate EHR data fitness within MIRACUM DICs. METHODS: A qualitative approach was followed using an open-ended survey across DICs of 10 German university hospitals affiliated with MIRACUM. Data were analyzed using thematic analysis following an inductive qualitative method. RESULTS: All 10 MIRACUM DICs participated, with 17 participants revealing various approaches to assessing data fitness, including the 4-eyes principle and data consistency checks such as cross-system data value comparison. Common practices included a DUP-related feedback loop on data fitness and using self-designed dashboards for monitoring. Most experts had a computer science background and a master's degree, suggesting strong technological proficiency but potentially lacking clinical or statistical expertise. Nine key requirements for a computer-assisted solution were identified, including flexibility, understandability, extendibility, and practicability. Participants used heterogeneous data repositories for evaluating data quality criteria and practical strategies to communicate with research and clinical teams. 
CONCLUSIONS: The study identifies gaps between current practices in MIRACUM DICs and existing recommendations, offering insights into the complexities of assessing and reporting clinical data fitness. Additionally, a tripartite modular framework for fitness-for-purpose assessment was introduced to streamline the forthcoming implementation. It provides valuable input for developing and integrating an automated solution across multiple locations. This may include statistical comparisons to advanced machine learning algorithms for operationalizing frameworks such as the 3×3 data quality assessment framework. These findings provide foundational evidence for future design and implementation studies to enhance data quality assessments for specific DUPs in observational health care settings.

11.
JMIR Med Inform ; 12: e56734, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39189917

ABSTRACT

Background: Increasing and substantial reliance on electronic health records (EHRs) and data types (ie, diagnosis, medication, and laboratory data) demands assessment of their data quality as a fundamental approach, especially since there is a need to identify appropriate denominator populations with chronic conditions, such as type 2 diabetes (T2D), using commonly available computable phenotype definitions (ie, phenotypes). Objective: To bridge this gap, our study aims to assess how issues of EHR data quality and variations and robustness (or lack thereof) in phenotypes may have potential impacts in identifying denominator populations. Methods: Approximately 208,000 patients with T2D were included in our study, which used retrospective EHR data from the Johns Hopkins Medical Institution (JHMI) during 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (ie, age, sex, race, and ethnicity), use of health care (inpatient and emergency room visits), and the average Charlson Comorbidity Index score of each phenotype. We then used different methods to induce or simulate data quality issues of completeness, accuracy, and timeliness separately across each phenotype. For induced data incompleteness, our model randomly dropped diagnosis, medication, and laboratory codes independently at increments of 10%; for induced data inaccuracy, our model randomly replaced a diagnosis or medication code with another code of the same data type and induced 2% incremental change from -100% to +10% in laboratory result values; and lastly, for timeliness, data were modeled for induced incremental shift of date records by 30 days to 365 days. Results: Less than a quarter (n=47,326, 23%) of the population overlapped across all phenotypes using EHRs. The population identified by each phenotype varied across all combinations of data types. 
Induced incompleteness identified fewer patients with each increment; for example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse phenotype identified zero patients, as its phenotypic characteristics included only diagnosis codes. Induced inaccuracy and timeliness similarly demonstrated variations in performance of each phenotype, therefore resulting in fewer patients being identified with each incremental change. Conclusions: We used EHR data with diagnosis, medication, and laboratory data types from a large tertiary hospital system to understand T2D phenotypic differences and performance. We used induced data quality methods to learn how data quality issues may impact identification of the denominator populations upon which clinical (eg, clinical research and trials, population health evaluations) and financial or operational decisions are made. The novel results from our study may inform future approaches to shaping a common T2D computable phenotype definition that can be applied to clinical informatics, managing chronic conditions, and additional industry-wide efforts in health care.
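
The induced-incompleteness procedure described above can be sketched as follows. This is a simplified illustration on synthetic patients, assuming each diagnosis code is dropped independently with a given probability and that a diagnosis-only phenotype flags any patient with an ICD-10 code starting with E11 (type 2 diabetes). The function names, the per-code dropping rule, and the toy data are assumptions, not the study's model.

```python
import random

def drop_codes(patients, drop_fraction, rng):
    """Copy the records, dropping each code independently with
    probability drop_fraction (assumed dropping rule)."""
    return {pid: [c for c in codes if rng.random() >= drop_fraction]
            for pid, codes in patients.items()}

def diagnosis_only_phenotype(codes):
    """Flag a patient who has at least one T2D diagnosis code (E11.x)."""
    return any(code.startswith("E11") for code in codes)

def count_identified(patients, drop_fraction, seed=0):
    """Number of patients the phenotype still identifies after dropping."""
    rng = random.Random(seed)
    degraded = drop_codes(patients, drop_fraction, rng)
    return sum(diagnosis_only_phenotype(codes) for codes in degraded.values())

# Synthetic records for illustration only.
patients = {"p1": ["E11.9", "I10"], "p2": ["E11.65"], "p3": ["I10"]}
for step in range(0, 11):
    fraction = step / 10  # 0%, 10%, ..., 100% incompleteness
    print(fraction, count_identified(patients, fraction))
```

At 100% incompleteness every code is removed, so a phenotype that relies only on diagnosis codes identifies no patients, matching the behaviour the abstract reports for the Chronic Conditions Data Warehouse definition.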

12.
Stud Health Technol Inform ; 316: 1390-1395, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176640

ABSTRACT

Syntactic interoperability among health ICT systems is well-established, but achieving semantic interoperability requires more than just exchanging codes. We present a pragmatic, safe, and effective approach towards an ambitious goal: enabling any software to process a critical mass of routine clinical information in a replicable manner across various systems and local contexts. We advocate for the use of reliable, parsimonious coding to handle the most replicable aspects of data processing for routine patient information, while reserving the intricate interpretation of individual patient data nuances for skilled professionals, possibly supported by Artificial Intelligence tools. We suggest coping with routine tasks by focusing on a limited set of a few thousand data elements, named the 'Clinical Documentation Kernel' (CDK). This approach will provide direct benefits to users and assist in the human interpretation of other patient information. Our preliminary study focuses on the 'primitives' and 'qualifiers' that bring the highest value to the health ecosystem in various authoritative scenarios in the field of diabetes.


Subject(s)
Electronic Health Records , Semantics , Humans , Artificial Intelligence , Health Information Interoperability
13.
Stud Health Technol Inform ; 316: 611-615, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176816

ABSTRACT

Secure extraction of Personally Identifiable Information (PII) from Electronic Health Records (EHRs) presents significant privacy and security challenges. This study explores the application of Federated Learning (FL) to overcome these challenges within the context of French EHRs. By utilizing a multilingual BERT model in an FL simulation involving 20 hospitals, each represented by a unique medical department or pole, we compared the performance of two setups: individual models, where each hospital uses only its own training and validation data without engaging in the FL process, and federated models, where multiple hospitals collaborate to train a global FL model. Our findings demonstrate that FL models not only preserve data confidentiality but also outperform the individual models. In fact, the Global FL model achieved an F1 score of 75.7%, comparable to, though slightly below, that of the Centralized approach at 78.5%. This research underscores the potential of FL in extracting PIIs from EHRs, encouraging its broader adoption in health data analysis.


Subject(s)
Computer Security , Confidentiality , Electronic Health Records , Machine Learning , France , Humans , Health Records, Personal
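
The collaboration step in the federated setup of the entry above can be illustrated with a minimal sketch of federated averaging (FedAvg), in which a server combines per-hospital model parameters weighted by local training-set size. The abstract does not say which aggregation rule was used, so FedAvg is an assumption here, and the parameter vectors and sizes are toy values.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one global vector,
    weighting each client by its number of local training examples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Two hypothetical hospitals with 1 and 3 local examples respectively.
hospital_params = [[1.0, 2.0], [3.0, 4.0]]
hospital_sizes = [1, 3]
print(federated_average(hospital_params, hospital_sizes))
```

Only parameters leave each site in this scheme; the raw records stay local, which is the confidentiality property the abstract highlights.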
14.
JMIR Med Inform ; 12: e52934, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38973192

ABSTRACT

Background: The traditional clinical trial data collection process requires a clinical research coordinator who is authorized by the investigators to read from the hospital's electronic medical record. Using electronic source data opens a new path to extract patients' data from electronic health records (EHRs) and transfer them directly to an electronic data capture (EDC) system; this method is often referred to as eSource. eSource technology in a clinical trial data flow can improve data quality without compromising timeliness. At the same time, improved data collection efficiency reduces clinical trial costs. Objective: This study aims to explore how to extract clinical trial-related data from hospital EHR systems, transform the data into a format required by the EDC system, and transfer it into sponsors' environments, and to evaluate the transferred data sets to validate the availability, completeness, and accuracy of building an eSource dataflow. Methods: A prospective clinical trial study registered on the Drug Clinical Trial Registration and Information Disclosure Platform was selected, and the following data modules were extracted from the structured data of 4 case report forms: demographics, vital signs, local laboratory data, and concomitant medications. The extracted data was mapped and transformed, deidentified, and transferred to the sponsor's environment. Data validation was performed based on availability, completeness, and accuracy. Results: In a secure and controlled data environment, clinical trial data was successfully transferred from a hospital EHR to the sponsor's environment with 100% transcriptional accuracy, but the availability and completeness of the data could be improved. Conclusions: Data availability was low due to some required fields in the EDC system not being available directly in the EHR. Some data is also still in an unstructured or paper-based format. 
The top-level design of the eSource technology and the construction of hospital electronic data standards should help lay a foundation for a full electronic data flow from EHRs to EDC systems in the future.
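
The abstract names three validation dimensions (availability, completeness, accuracy) without defining them. The sketch below shows one way such measures could be computed for a single record. The definitions, the function name `validation_metrics`, and the field names are assumptions for illustration, not the study's method.

```python
def validation_metrics(required_fields, ehr_record, transferred_record):
    """Return (availability, completeness, accuracy) for one record.

    Assumed definitions (hypothetical, not taken from the study):
      availability: share of EDC-required fields that exist in the EHR
      completeness: share of available fields that hold a value
      accuracy:     share of populated fields whose transferred value
                    equals the EHR source value
    """
    available = [f for f in required_fields if f in ehr_record]
    populated = [f for f in available if ehr_record[f] is not None]
    matching = [f for f in populated
                if transferred_record.get(f) == ehr_record[f]]

    availability = len(available) / len(required_fields) if required_fields else 0.0
    completeness = len(populated) / len(available) if available else 0.0
    accuracy = len(matching) / len(populated) if populated else 0.0
    return availability, completeness, accuracy


# Hypothetical example: "dose" is required by the EDC but absent from the EHR,
# and "sbp" exists in the EHR but has no value.
required = ["age", "sex", "sbp", "dose"]
ehr = {"age": 54, "sex": "F", "sbp": None}
transferred = {"age": 54, "sex": "F"}
print(validation_metrics(required, ehr, transferred))
```

Under these assumed definitions, a transfer can be fully accurate while availability and completeness stay below 100%, which is the pattern the abstract reports.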

15.
BMC Psychiatry ; 24(1): 481, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956493

ABSTRACT

BACKGROUND: Patients' online record access (ORA) enables patients to read and use their health data through online digital solutions. One such solution, patient-accessible electronic health records (PAEHRs), has been implemented in Estonia, Finland, Norway, and Sweden. While accumulated research has pointed to many potential benefits of ORA, its application in mental healthcare (MHC) continues to be contested. The present study aimed to describe MHC users' overall experiences with national PAEHR services. METHODS: The study analysed the MHC part of the NORDeHEALTH 2022 Patient Survey, a large-scale multi-country survey. The survey consisted of 45 questions, including demographic variables and questions related to users' experiences with ORA. We focused on the questions concerning positive experiences (benefits), negative experiences (errors, omissions, offence), and breaches of security and privacy. Participants were included in this analysis if they reported receiving mental healthcare within the past two years. Descriptive statistics were used to summarise data, and percentages were calculated on available data. RESULTS: 6,157 respondents were included. In line with previous research, almost half (45%) reported very positive experiences with ORA. A majority in each country also reported improved trust (at least 69%) and communication (at least 71%) with healthcare providers. One-third (29.5%) reported very negative experiences with ORA. In total, half of the respondents (47.9%) found errors and a third (35.5%) found omissions in their medical documentation. One-third (34.8%) of all respondents also reported being offended by the content. When errors or omissions were identified, about half (46.5%) reported that they took no action. There seem to be differences in how patients experience errors, omissions, and missing information between the countries. 
A small proportion reported instances where family or others demanded access to their records (3.1%), and about one in ten (10.7%) noted that unauthorised individuals had seen their health information. CONCLUSIONS: Overall, MHC patients reported more positive experiences than negative, but a large portion of respondents reported problems with the content of the PAEHR. Further research on best practice in implementation of ORA in MHC is therefore needed, to ensure that all patients may reap the benefits while limiting potential negative consequences.


Subject(s)
Electronic Health Records , Mental Health Services , Humans , Electronic Health Records/statistics & numerical data , Male , Female , Adult , Middle Aged , Estonia , Norway , Finland , Mental Health Services/statistics & numerical data , Sweden , Surveys and Questionnaires , Young Adult , Aged , Patient Access to Records , Adolescent
16.
Comput Biol Med ; 179: 108830, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38991321

ABSTRACT

Undiagnosed and untreated human immunodeficiency virus (HIV) infection increases morbidity in the HIV-positive person and allows onward transmission of the virus. Minimizing missed opportunities for HIV diagnosis when a patient visits a healthcare facility is essential in restraining the epidemic and working toward its eventual elimination. Most state-of-the-art proposals employ machine learning (ML) methods and structured data to enhance HIV diagnoses; however, there is a dearth of recent proposals utilizing unstructured textual data from Electronic Health Records (EHRs). In this work, we propose to use only the unstructured text of the clinical notes as evidence for the classification of patients as suspected or not suspected. For this purpose, we first compile a dataset of real clinical notes from a hospital with patients classified as suspects and non-suspects of having HIV. Then, we evaluate the effectiveness of two types of classification models to identify patients suspected of being infected with the virus: classical ML algorithms and two Large Language Models (LLMs) from the biomedical domain in Spanish. The results show that both LLMs outperform classical ML algorithms in the two settings we explore: one dataset version is balanced, containing an equal number of suspicious and non-suspicious patients, while the other reflects the real distribution of patients in the hospital, being unbalanced. We obtain F1 score figures of 94.7 with both LLMs in the unbalanced setting, while in the balanced one, the RoBERTaBio model outperforms the other with an F1 score of 95.7. The findings indicate that leveraging unstructured text with LLMs in the biomedical domain yields promising outcomes in diminishing missed opportunities for HIV diagnosis. A tool based on our system could assist a doctor in deciding whether a patient in consultation should undergo a serological test.
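
For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall for the positive class. The sketch below computes it from two label lists. The function name and the toy labels are illustrative assumptions and have no connection to the study's data or code.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy unbalanced sample (made up): 3 suspected patients out of 8.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(round(f1_score(y_true, y_pred), 3))
```

Because F1 ignores true negatives, it is less flattered by a large majority class than plain accuracy, which is one reason it is commonly reported for unbalanced settings.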


Subject(s)
Data Mining , Electronic Health Records , HIV Infections , Machine Learning , Humans , HIV Infections/diagnosis , Data Mining/methods , Early Diagnosis , Male , Female , Algorithms
17.
Article in English | MEDLINE | ID: mdl-39003521

ABSTRACT

OBJECTIVES: We introduce a widely applicable model-based approach for estimating individual-level Social Determinants of Health (SDoH) and evaluate its effectiveness using the All of Us Research Program. MATERIALS AND METHODS: Our approach utilizes aggregated SDoH datasets to estimate individual-level SDoH, demonstrated with examples of no high school diploma (NOHSDP) and no health insurance (UNINSUR) variables. Models are estimated using American Community Survey data and applied to derive individual-level estimates for All of Us participants. We assess concordance between model-based SDoH estimates and self-reported SDoHs in All of Us and examine associations with undiagnosed hypertension and diabetes. RESULTS: Compared to self-reported SDoHs, the area under the curve for NOHSDP is 0.727 (95% CI, 0.724-0.730) and for UNINSUR is 0.730 (95% CI, 0.727-0.733) among the 329 074 All of Us participants, both significantly higher than aggregated SDoHs. The association between model-based NOHSDP and undiagnosed hypertension is concordant with those estimated using self-reported NOHSDP, with a correlation coefficient of 0.649. Similarly, the association between model-based NOHSDP and undiagnosed diabetes is concordant with those estimated using self-reported NOHSDP, with a correlation coefficient of 0.900. DISCUSSION AND CONCLUSION: The model-based SDoH estimation method offers a scalable and easily standardized approach for estimating individual-level SDoHs. Using the All of Us dataset, we demonstrate reasonable concordance between model-based SDoH estimates and self-reported SDoHs, along with consistent associations with health outcomes. Our findings also underscore the critical role of geographic contexts in SDoH estimation and in evaluating the association between SDoHs and health outcomes.

18.
Online J Public Health Inform ; 16: e58058, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38959056

ABSTRACT

BACKGROUND: Population viral load (VL), the most comprehensive measure of the HIV transmission potential, cannot be directly measured due to lack of complete sampling of all people with HIV. OBJECTIVE: A given HIV clinic's electronic health record (EHR), a biased sample of this population, may be used to attempt to impute this measure. METHODS: We simulated a population of 10,000 individuals with VL calibrated to surveillance data with a geometric mean of 4449 copies/mL. We sampled 3 hypothetical EHRs from (A) the source population, (B) those diagnosed, and (C) those retained in care. Our analysis imputed population VL from each EHR using sampling weights followed by Bayesian adjustment. These methods were then tested using EHR data from an HIV clinic in Delaware. RESULTS: Following weighting, the estimates moved in the direction of the population value with correspondingly wider 95% intervals as follows: clinic A: 4364 (95% interval 1963-11,132) copies/mL; clinic B: 4420 (95% interval 1913-10,199) copies/mL; and clinic C: 242 (95% interval 113-563) copies/mL. Bayesian-adjusted weighting further improved the estimate. CONCLUSIONS: These findings suggest that methodological adjustments are ineffective for estimating population VL from a single clinic's EHR without the resource-intensive elucidation of an informative prior.
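
The weighting step described above can be illustrated with a weighted geometric mean, in which each sampled patient's viral load counts in proportion to a sampling weight. This is a minimal sketch under stated assumptions: the function name, the weights, and the values are hypothetical, and the Bayesian adjustment used in the study is not shown.

```python
import math


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean: exp(sum(w * ln(v)) / sum(w)).

    Assumes every value is strictly positive and the weights are
    non-negative with a positive sum (for example, inverse probabilities
    of being sampled into the clinic's EHR).
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


# Hypothetical viral loads (copies/mL). The second patient stands in for
# three people who are under-represented in the clinic sample.
print(weighted_geometric_mean([100, 10_000], [1, 1]))
print(weighted_geometric_mean([100, 10_000], [1, 3]))
```

Raising the weight on the high-viral-load patient pulls the estimate upward, which mirrors how weighting moves a clinic-based estimate toward the population value.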

19.
JMIR Ment Health ; 11: e57965, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38860592

ABSTRACT

Background: In many countries, health care professionals are legally obliged to share information from electronic health records with patients. However, concerns have been raised regarding the sharing of notes with adolescents in mental health care, and health care professionals have called for recommendations to guide this practice. Objective: The aim was to reach a consensus among authors of scientific papers on recommendations for health care professionals' digital sharing of notes with adolescents in mental health care and to investigate whether staff at child and adolescent specialist mental health care clinics agreed with the recommendations. Methods: A Delphi study was conducted with authors of scientific papers to reach a consensus on recommendations. The process of making the recommendations involved three steps. First, scientific papers meeting the eligibility criteria were identified through a PubMed search where the references were screened. Second, the results from the included papers were coded and transformed into recommendations in an iterative process. Third, the authors of the included papers were asked to provide feedback and consider their agreement with each of the suggested recommendations in two rounds. After the Delphi process, a cross-sectional study was conducted among staff at specialist child and adolescent mental health care clinics to assess whether they agreed with the recommendations that reached a consensus. Results: Of the 84 invited authors, 27 responded. A consensus was reached on 17 recommendations on areas related to digital sharing of notes with adolescents in mental health care. The recommendations considered how to introduce digital access to notes, write notes, and support health care professionals, and when to withhold notes. Of the 41 staff members at child and adolescent specialist mental health care clinics, 60% or more agreed with the 17 recommendations. 
No consensus was reached regarding the age at which adolescents should receive digital access to their notes and the timing of digitally sharing notes with parents. Conclusions: A total of 17 recommendations related to key aspects of health care professionals' digital sharing of notes with adolescents in mental health care achieved consensus. Health care professionals can use these recommendations to guide their practice of sharing notes with adolescents in mental health care. However, the effects and experiences of following these recommendations should be tested in clinical practice.


Subject(s)
Delphi Technique , Mental Health Services , Humans , Adolescent , Mental Health Services/standards , Electronic Health Records , Consensus , Cross-Sectional Studies , Female , Male
20.
Artif Intell Med ; 154: 102903, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38908257

ABSTRACT

Irregular sampling of time series in electronic health records (EHRs) is one of the main challenges for developing machine learning models. Additionally, the pattern of missing values in certain clinical variables is not at random but depends on the decisions of clinicians and the state of the patient. A point process is a mathematical framework for analyzing event sequence data consistent with irregular sampling patterns. Our model, TEE4EHR, is a transformer event encoder (TEE) with point process loss that encodes the pattern of laboratory tests in EHRs. The utility of our TEE has been investigated in various benchmark event sequence datasets. Additionally, we conduct experiments on two real-world EHR databases to provide a more comprehensive evaluation of our model. Firstly, in a self-supervised learning approach, the TEE is jointly learned with an existing attention-based deep neural network, which gives superior performance in negative log-likelihood and future event prediction. In addition, we propose an algorithm for aggregating attention weights to reveal the events' interactions. Secondly, we transfer and freeze the learned TEE to the downstream task for the outcome prediction, where it outperforms state-of-the-art models for handling irregularly sampled time series. Furthermore, our results demonstrate that our approach can improve representation learning in EHRs and be useful for clinical prediction tasks.
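
A point process loss is typically the negative log-likelihood of the observed event times: the negative sum of the log intensity at each event plus the integral of the intensity over the observation window. The sketch below uses the simplest case, a homogeneous Poisson process with a constant rate, purely to make the formula concrete. It is an assumption for illustration and is not the learned intensity used by TEE4EHR. The function name and the event times are hypothetical.

```python
import math


def poisson_nll(event_times, horizon, rate):
    """Negative log-likelihood of events on [0, horizon] under a constant rate.

    General form: -sum(log(intensity(t_i))) + integral of intensity over [0, horizon].
    With a constant intensity this reduces to -n * log(rate) + rate * horizon.
    """
    if rate <= 0 or horizon <= 0:
        raise ValueError("rate and horizon must be positive")
    n = len(event_times)
    return -n * math.log(rate) + rate * horizon


# Hypothetical lab-test times (in days) for one patient over a 4-day window.
events = [0.5, 1.2, 3.0]
for rate in (0.5, 0.75, 1.5):
    print(rate, round(poisson_nll(events, 4.0, rate), 3))
```

The loss is lowest at the rate equal to the event count divided by the window length (here 3 / 4 = 0.75), which is the maximum-likelihood estimate for this simple model.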


Subject(s)
Electronic Health Records , Humans , Neural Networks, Computer , Machine Learning , Algorithms , Databases, Factual , Deep Learning