1.
J Am Med Inform Assoc ; 29(12): 2105-2109, 2022 11 14.
Article in English | MEDLINE | ID: mdl-36305781

ABSTRACT

Healthcare systems are hampered by incomplete and fragmented patient health records. Record linkage is widely accepted as a solution to improve the quality and completeness of patient records. However, there does not exist a systematic approach for manually reviewing patient records to create gold standard record linkage data sets. We propose a robust framework for creating and evaluating manually reviewed gold standard data sets for measuring the performance of patient matching algorithms. Our 8-point approach covers data preprocessing, blocking, record adjudication, linkage evaluation, and reviewer characteristics. This framework can help record linkage method developers provide necessary transparency when creating and validating gold standard reference matching data sets. In turn, this transparency will support both the internal and external validity of record linkage studies and improve the robustness of new record linkage strategies.


Subject(s)
Health Records, Personal , Medical Record Linkage , Humans , Medical Record Linkage/methods , Algorithms , Information Storage and Retrieval , Data Collection
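To make the blocking and linkage evaluation steps named in the abstract above concrete, here is a minimal sketch. It is not the authors' 8-point framework: the record fields, the choice of date of birth as a blocking key, and the gold-standard pairs are all hypothetical values chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def block_pairs(records, key):
    """Group records by a blocking key and return candidate pairs within each block."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec["id"])
    pairs = set()
    for ids in blocks.values():
        # Only records that share a block are ever compared.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


def evaluate(predicted, gold):
    """Precision, recall, and F1 of predicted match pairs against gold-standard pairs."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical records; field names are assumptions, not taken from the paper.
records = [
    {"id": 1, "last": "Smith", "dob": "1980-01-02"},
    {"id": 2, "last": "Smyth", "dob": "1980-01-02"},
    {"id": 3, "last": "Jones", "dob": "1975-06-30"},
    {"id": 4, "last": "Jones", "dob": "1975-06-30"},
    {"id": 5, "last": "Brown", "dob": "1980-01-02"},
]
candidates = block_pairs(records, key=lambda r: r["dob"])
gold = {(1, 2), (3, 4)}  # hypothetical manually reviewed true matches
```

In this toy data, treating every candidate pair as a match recovers both gold pairs but also produces two false matches, which is why a manually reviewed reference set is needed to quantify precision as well as recall.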
2.
AMIA Jt Summits Transl Sci Proc ; 2021: 335-344, 2021.
Article in English | MEDLINE | ID: mdl-34457148

ABSTRACT

Restrictions in sharing Patient Health Identifiers (PHI) limit cross-organizational re-use of free-text medical data. We leverage Generative Adversarial Networks (GAN) to produce synthetic unstructured free-text medical data with low re-identification risk, and assess the suitability of these datasets to replicate machine learning models. We trained GAN models using unstructured free-text laboratory messages pertaining to salmonella, and identified the most accurate models for creating synthetic datasets that reflect the informational characteristics of the original dataset. Natural Language Generation metrics comparing the real and synthetic datasets demonstrated high similarity. Decision models generated using these datasets reported high performance metrics. There was no statistically significant difference in performance measures reported by models trained using real and synthetic datasets. Our results inform the use of GAN models to generate synthetic unstructured free-text data with limited re-identification risk, and use of this data to enable collaborative research and re-use of machine learning models.


Subject(s)
Machine Learning , Text Messaging , Humans
3.
J Med Internet Res ; 23(7): e28812, 2021 07 26.
Article in English | MEDLINE | ID: mdl-34156964

ABSTRACT

BACKGROUND: The COVID-19 pandemic has changed public health policies and human and community behaviors through lockdowns and mandates. Governments are rapidly evolving policies to increase hospital capacity and supply personal protective equipment and other equipment to mitigate disease spread in affected regions. Current models that predict COVID-19 case counts and spread are complex by nature and offer limited explainability and generalizability. This has highlighted the need for accurate and robust outbreak prediction models that balance model parsimony and performance. OBJECTIVE: We sought to leverage readily accessible data sets extracted from multiple states to train and evaluate a parsimonious predictive model capable of identifying county-level risk of COVID-19 outbreaks on a day-to-day basis. METHODS: Our modeling approach leveraged the following data inputs: COVID-19 case counts per county per day and county populations. We developed an outbreak gold standard across California, Indiana, and Iowa. The model utilized a per capita running 7-day sum of the case counts per county per day and the mean cumulative case count to develop baseline values. The model was trained with data recorded between March 1 and August 31, 2020, and tested on data recorded between September 1 and October 31, 2020. RESULTS: The model reported sensitivities of 81%, 92%, and 90% for California, Indiana, and Iowa, respectively. The precision in each state was above 85% while specificity and accuracy scores were generally >95%. CONCLUSIONS: Our parsimonious model provides a generalizable and simple alternative approach to outbreak prediction. This methodology can be applied to diverse regions to help state officials and hospitals with resource allocation and to guide risk management, community education, and mitigation strategies.


Subject(s)
COVID-19/epidemiology , Computer Simulation , Datasets as Topic , Disease Outbreaks/statistics & numerical data , Forecasting/methods , Heuristics , Public Sector , COVID-19/prevention & control , California/epidemiology , Humans , Indiana/epidemiology , Iowa/epidemiology , Models, Biological , SARS-CoV-2
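The per capita running 7-day sum described in the METHODS of the abstract above can be sketched as follows. The abstract does not say how baseline values are turned into outbreak flags, so the fixed `threshold` parameter and the per-100,000 scaling are assumptions made only for this illustration, as are the function names.

```python
def rolling_7day_per_capita(daily_cases, population, per=100_000):
    """Running 7-day sum of daily case counts, scaled per `per` residents."""
    rates = []
    window_sum = 0
    for i, count in enumerate(daily_cases):
        window_sum += count
        if i >= 7:
            # Drop the day that just fell out of the 7-day window.
            window_sum -= daily_cases[i - 7]
        rates.append(window_sum * per / population)
    return rates


def flag_outbreak_days(rates, threshold):
    """Indices of days whose rate exceeds a threshold (hypothetical decision rule)."""
    return [i for i, rate in enumerate(rates) if rate > threshold]


# Hypothetical county: 7 quiet days followed by 3 days of elevated counts.
daily = [1] * 7 + [10] * 3
rates = rolling_7day_per_capita(daily, population=100_000)
flagged = flag_outbreak_days(rates, threshold=20)
```

For the first six days the window holds fewer than seven observations, so the early values are partial sums; a real analysis would need to decide whether to report or discard them.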
4.
Am J Manag Care ; 27(1): e24-e31, 2021 01 01.
Article in English | MEDLINE | ID: mdl-33471465

ABSTRACT

OBJECTIVES: Health care organizations are increasingly employing social workers to address patients' social needs. However, social work (SW) activities in health care settings are largely captured as text data within electronic health records (EHRs), making measurement and analysis difficult. This study aims to extract and classify, from EHR notes, interventions intended to address patients' social needs using natural language processing (NLP) and machine learning (ML) algorithms. STUDY DESIGN: Secondary data analysis of a longitudinal cohort. METHODS: We extracted 815 SW encounter notes from the EHR system of a federally qualified health center. We reviewed the literature to derive a 10-category classification scheme for SW interventions. We applied NLP and ML algorithms to categorize the documented SW interventions in EHR notes according to the 10-category classification scheme. RESULTS: Most of the SW notes (n = 598; 73.4%) contained at least 1 SW intervention. The most frequent interventions offered by social workers included care coordination (21.5%), education (21.0%), financial planning (18.5%), referral to community services and organizations (17.1%), and supportive counseling (15.3%). High-performing classification algorithms included the kernelized support vector machine (SVM) (accuracy, 0.97), logistic regression (accuracy, 0.96), linear SVM (accuracy, 0.95), and multinomial naive Bayes classifier (accuracy, 0.92). CONCLUSIONS: NLP and ML can be utilized for automated identification and classification of SW interventions documented in EHRs. Health care administrators can leverage this automated approach to gain better insight into the most needed social interventions in the patient population served by their organizations. Such information can be applied in managerial decisions related to SW staffing, resource allocation, and patients' social needs.


Subject(s)
Electronic Health Records , Natural Language Processing , Bayes Theorem , Humans , Machine Learning , Social Work
5.
JMIR Med Inform ; 8(7): e16129, 2020 Jul 09.
Article in English | MEDLINE | ID: mdl-32479414

ABSTRACT

BACKGROUND: Emerging interest in precision health and the increasing availability of patient- and population-level data sets present considerable potential to enable analytical approaches to identify and mitigate the negative effects of social factors on health. These issues are not satisfactorily addressed in typical medical care encounters, and thus, opportunities to improve health outcomes, reduce costs, and improve coordination of care are not realized. Furthermore, methodological expertise on the use of varied patient- and population-level data sets and machine learning to predict need for supplemental services is limited. OBJECTIVE: The objective of this study was to leverage a comprehensive range of clinical, behavioral, social risk, and social determinants of health factors in order to develop decision models capable of identifying patients in need of various wraparound social services. METHODS: We used comprehensive patient- and population-level data sets to build decision models capable of predicting need for behavioral health, dietitian, social work, or other social service referrals within a safety-net health system using area under the receiver operating characteristic curve (AUROC), sensitivity, precision, F1 score, and specificity. We also evaluated the value of population-level social determinants of health data sets in improving machine learning performance of the models. RESULTS: Decision models for each wraparound service demonstrated performance measures ranging between 59.2% and 99.3%. These results were statistically superior to the performance measures demonstrated by our previous models which used a limited data set and whose performance measures ranged from 38.2% to 88.3% (behavioural health: F1 score P<.001, AUROC P=.01; social work: F1 score P<.001, AUROC P=.03; dietitian: F1 score P=.001, AUROC P=.001; other: F1 score P=.01, AUROC P=.02); however, inclusion of additional population-level social determinants of health did not contribute to any performance improvements (behavioural health: F1 score P=.08, AUROC P=.09; social work: F1 score P=.16, AUROC P=.09; dietitian: F1 score P=.08, AUROC P=.14; other: F1 score P=.33, AUROC P=.21) in predicting the need for referral in our population of vulnerable patients seeking care at a safety-net provider. CONCLUSIONS: Precision health-enabled decision models that leverage a wide range of patient- and population-level data sets and advanced machine learning methods are capable of predicting need for various wraparound social services with good performance.

6.
AMIA Jt Summits Transl Sci Proc ; 2020: 152-161, 2020.
Article in English | MEDLINE | ID: mdl-32477634

ABSTRACT

Healthcare analytics is impeded by a lack of machine learning (ML) model generalizability, the ability of a model to predict accurately on varied data sources not included in the model's training dataset. We leveraged free-text laboratory data from a Health Information Exchange network to evaluate ML generalization using Notifiable Condition Detection (NCD) for public health surveillance as a use case. We 1) built ML models for detecting syphilis, salmonella, and histoplasmosis; 2) evaluated generalizability of these models across data from holdout lab systems, and; 3) explored factors that influence weak model generalizability. Models for predicting each disease reported considerable accuracy. However, they demonstrated poor generalizability across data from holdout lab systems being tested. Our evaluation determined that weak generalization was influenced by variant syntactic nature of free-text datasets across each lab system. Results highlight the need for actionable methodology to generalize ML solutions for healthcare analytics.

7.
PLoS One ; 15(1): e0226718, 2020.
Article in English | MEDLINE | ID: mdl-31910437

ABSTRACT

BACKGROUND AND PURPOSE: Hemorrhagic transformation (HT) after cerebral infarction is a complex and multifactorial phenomenon in the acute stage of ischemic stroke, and often results in a poor prognosis. Thus, identifying risk factors and making an early prediction of HT in acute cerebral infarction contributes not only to the selections of therapeutic regimen but also, more importantly, to the improvement of prognosis of acute cerebral infarction. The purpose of this study was to develop and validate a model to predict a patient's risk of HT within 30 days of initial ischemic stroke. METHODS: We utilized a retrospective multicenter observational cohort study design to develop a Lasso Logistic Regression prediction model with a large, US Electronic Health Record dataset structured to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). To examine clinical transportability, the model was externally validated across 10 additional real-world healthcare datasets that include EHR records for patients from America, Europe and Asia. RESULTS: In the database in which the model was developed, the target population cohort contained 621,178 patients with ischemic stroke, of which 5,624 patients had HT within 30 days following initial ischemic stroke. 612 risk predictors, including the distance a patient travels in an ambulance to get to care for a HT, were identified. An area under the receiver operating characteristic curve (AUC) of 0.75 was achieved in the internal validation of the risk model. External validation was performed across 10 databases totaling 5,515,508 patients with ischemic stroke, of which 86,401 patients had HT within 30 days following initial ischemic stroke. The mean external AUC was 0.71 and ranged between 0.60-0.78. CONCLUSIONS: A HT prognostic prediction model was developed with Lasso Logistic Regression based on routinely collected EMR data. This model can identify patients who have a higher risk of HT than the population average with an AUC of 0.78. It shows the OMOP CDM is an appropriate data standard for EMR secondary use in clinical multicenter research for prognostic prediction model development and validation. In the future, combining this model with clinical information systems will assist clinicians to make the right therapy decision for patients with acute ischemic stroke.


Subject(s)
Brain Ischemia/complications , Cerebral Hemorrhage/diagnosis , Models, Statistical , Risk Assessment/methods , Stroke/complications , Cerebral Hemorrhage/etiology , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prognosis , ROC Curve , Retrospective Studies , Risk Factors
8.
Stud Health Technol Inform ; 264: 1510-1511, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438206

ABSTRACT

We leverage Generative Adversarial Networks (GAN) to produce synthetic free-text medical data with low re-identification risk, and apply these to replicate machine learning solutions. We trained GAN models to generate free-text cancer pathology reports. Decision models trained using synthetic datasets reported performance metrics that were statistically similar to models trained using original test data. Our results further the use of GANs to generate synthetic data for collaborative research and re-use of machine learning models.


Subject(s)
Machine Learning , Biomedical Research
9.
J Med Internet Res ; 21(7): e13809, 2019 07 22.
Article in English | MEDLINE | ID: mdl-31333196

ABSTRACT

BACKGROUND: As the most commonly occurring form of mental illness worldwide, depression poses significant health and economic burdens to both the individual and community. Different types of depression pose different levels of risk. Individuals who suffer from mild forms of depression may recover without any assistance or be effectively managed by primary care or family practitioners. However, other forms of depression are far more severe and require advanced care by certified mental health providers. However, identifying cases of depression that require advanced care may be challenging to primary care providers and health care team members whose skill sets run broad rather than deep. OBJECTIVE: This study aimed to leverage a comprehensive range of patient-level diagnostic, behavioral, and demographic data, as well as past visit history data from a statewide health information exchange to build decision models capable of predicting the need of advanced care for depression across patients presenting at Eskenazi Health, the public safety net health system for Marion County, Indianapolis, Indiana. METHODS: Patient-level diagnostic, behavioral, demographic, and past visit history data extracted from structured datasets were merged with outcome variables extracted from unstructured free-text datasets and were used to train random forest decision models that predicted the need of advanced care for depression across (1) the overall patient population and (2) various subsets of patients at higher risk for depression-related adverse events; patients with a past diagnosis of depression; patients with a Charlson comorbidity index of ≥1; patients with a Charlson comorbidity index of ≥2; and all unique patients identified across the 3 above-mentioned high-risk groups. RESULTS: The overall patient population consisted of 84,317 adult (aged ≥18 years) patients. A total of 6992 (8.29%) of these patients were in need of advanced care for depression. Decision models for high-risk patient groups yielded area under the curve (AUC) scores between 86.31% and 94.43%. The decision model for the overall patient population yielded a comparatively lower AUC score of 78.87%. The variance of optimal sensitivity and specificity for all decision models, as identified using Youden J Index, is as follows: sensitivity=68.79% to 83.91% and specificity=76.03% to 92.18%. CONCLUSIONS: This study demonstrates the ability to automate screening for patients in need of advanced care for depression across (1) an overall patient population or (2) various high-risk patient groups using structured datasets covering acute and chronic conditions, patient demographics, behaviors, and past visit history. Furthermore, these results show considerable potential to enable preventative care and can be easily integrated into existing clinical workflows to improve access to wraparound health care services.


Subject(s)
Delivery of Health Care/methods , Depression/therapy , Health Information Exchange/standards , Machine Learning/standards , Adolescent , Adult , Female , Humans , Male , Middle Aged
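The abstract above selects optimal sensitivity and specificity with the Youden J index, J = sensitivity + specificity - 1. A minimal sketch of that selection follows; the scores and labels are made-up values, the function name is hypothetical, and the code assumes both classes are present in the labels.

```python
def youden_optimal_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing J = sens + spec - 1.

    A case is predicted positive when its score is >= threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical model scores and true labels (1 = needs advanced care).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, sensitivity, specificity = youden_optimal_threshold(scores, labels)
```

Youden's J weights false positives and false negatives equally; a screening program that cares more about missed cases might deliberately pick a lower threshold than the one J selects.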
10.
AMIA Jt Summits Transl Sci Proc ; 2019: 639-647, 2019.
Article in English | MEDLINE | ID: mdl-31259019

ABSTRACT

Patient matching is essential to minimize fragmentation of patient data. Existing patient matching efforts often do not account for nickname use. We sought to develop decision models that could identify true nicknames using features representing the phonetical and structural similarity of nickname pairs. We identified potential male and female name pairs from the Indiana Network for Patient Care (INPC), and developed a series of features that represented their phonetical and structural similarities. Next, we used the XGBoost classifier and hyperparameter tuning to build decision models to identify nicknames using these feature sets and a manually reviewed gold standard. Decision models reported high precision/positive predictive value and accuracy scores for both male and female name pairs despite the low number of true nickname matches in the datasets under study. Ours is one of the first efforts to identify patient nicknames using machine learning approaches.
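The abstract above describes features capturing the phonetical and structural similarity of name pairs but does not list them. The sketch below shows what such features could look like. Every feature here is an assumption: normalized edit similarity and shared prefix length stand in for structural similarity, and a crude consonant skeleton stands in for a real phonetic encoding. None of these is confirmed as a feature used in the study.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def consonant_skeleton(name):
    """Crude phonetic key: first letter plus later consonants, collapsing repeats.

    This is a simplified stand-in, not Soundex or Metaphone.
    """
    name = name.lower()
    key = name[:1]
    for ch in name[1:]:
        if ch not in "aeiouyhw" and ch != key[-1]:
            key += ch
    return key


def pair_features(name, candidate):
    """Hypothetical structural and phonetic features for one name pair."""
    name, candidate = name.lower(), candidate.lower()
    longest = max(len(name), len(candidate)) or 1
    prefix = 0
    for x, y in zip(name, candidate):
        if x != y:
            break
        prefix += 1
    return {
        "edit_similarity": 1 - levenshtein(name, candidate) / longest,
        "shared_prefix": prefix,
        "same_skeleton_start": consonant_skeleton(name)[:2]
        == consonant_skeleton(candidate)[:2],
    }


features = pair_features("Katherine", "Kathy")
```

Feature dictionaries like this one could be assembled into a matrix and passed to a gradient-boosted classifier such as the XGBoost model mentioned in the abstract, with the manually reviewed gold standard supplying the labels.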

11.
Am J Prev Med ; 56(4): e125-e133, 2019 04.
Article in English | MEDLINE | ID: mdl-30772150

ABSTRACT

INTRODUCTION: Social determinants of health are critical drivers of health status and cost, but are infrequently screened or addressed in primary care settings. Systematic approaches to identifying individuals with unmet social determinants needs could better support practice workflows and linkages of patients to services. A pilot study examined the effect of a risk-stratification tool on referrals to services that address social determinants in an urban safety-net population. METHODS: An intervention that risk stratified patients according to the need for wraparound was evaluated in a stepped wedge design (i.e., phased implementation at the clinic level during 2017). Staff at nine federally qualified health centers received a daily report predicting patients' needs for social worker, dietitian, behavioral health, and other wraparound services (categorized as low, rising, or high risk). Outcomes included referrals and uptake of appointments to wraparound services. RESULTS: Among 238,087 encounters, providing clinic staff with risk-stratification scores increased the odds that a patient would be referred to a social worker. For patients categorized as high risk, the odds of a social work referral was 65% higher than controls and similar patients, but lower effect sizes were observed for individuals categorized with rising and low risk. Among referred patients, the intervention was generally associated with increased odds of kept appointments. CONCLUSIONS: This study provided preliminary evidence that risk-stratification interventions to identify patients in need of wraparound services to address social determinants can increase referrals and uptake of services that may address social drivers of disease burden.


Subject(s)
Health Status , Patient Acceptance of Health Care/statistics & numerical data , Preventive Health Services/statistics & numerical data , Safety-net Providers/statistics & numerical data , Social Determinants of Health , Adult , Female , Humans , Indiana , Male , Middle Aged , Pilot Projects , Preventive Health Services/organization & administration , Referral and Consultation/statistics & numerical data , Risk Assessment , Safety-net Providers/organization & administration , Social Work/organization & administration , Social Work/statistics & numerical data , Urban Health Services/organization & administration , Urban Health Services/statistics & numerical data
13.
J Am Med Inform Assoc ; 25(1): 47-53, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29177457

ABSTRACT

Introduction: A growing variety of diverse data sources is emerging to better inform health care delivery and health outcomes. We sought to evaluate the capacity for clinical, socioeconomic, and public health data sources to predict the need for various social service referrals among patients at a safety-net hospital. Materials and Methods: We integrated patient clinical data and community-level data representing patients' social determinants of health (SDH) obtained from multiple sources to build random forest decision models to predict the need for any, mental health, dietitian, social work, or other SDH service referrals. To assess the impact of SDH on improving performance, we built separate decision models using clinical and SDH determinants and clinical data only. Results: Decision models predicting the need for any, mental health, and dietitian referrals yielded sensitivity, specificity, and accuracy measures ranging between 60% and 75%. Specificity and accuracy scores for social work and other SDH services ranged between 67% and 77%, while sensitivity scores were between 50% and 63%. Area under the receiver operating characteristic curve values for the decision models ranged between 70% and 78%. Models for predicting the need for any services reported positive predictive values between 65% and 73%. Positive predictive values for predicting individual outcomes were below 40%. Discussion: The need for various social service referrals can be predicted with considerable accuracy using a wide range of readily available clinical and community data that measure socioeconomic and public health conditions. While the use of SDH did not result in significant performance improvements, our approach represents a novel and important application of risk predictive modeling.


Subject(s)
Decision Support Techniques , Primary Health Care , Safety-net Providers , Social Determinants of Health , Social Work , Adult , Female , Humans , Indiana , Male , ROC Curve , Sensitivity and Specificity , Supervised Machine Learning
14.
J Biomed Inform ; 69: 160-176, 2017 05.
Article in English | MEDLINE | ID: mdl-28410983

ABSTRACT

OBJECTIVES: Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and "off the shelf" tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. MATERIALS AND METHODS: We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. RESULTS: Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70 and 90%. The source of features and feature subset size had no impact on the performance of a decision model. CONCLUSION: Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches.


Subject(s)
Algorithms , Dictionaries, Medical as Topic , Neoplasms/diagnosis , Automation , Electronic Health Records , Humans , Public Health , ROC Curve
15.
Appl Clin Inform ; 8(1): 108-121, 2017 Feb 01.
Article in English | MEDLINE | ID: mdl-28144679

ABSTRACT

OBJECTIVES: Despite significant awareness on the value of leveraging patient relationships across the healthcare continuum, there is no research on the potential of using Electronic Health Record (EHR) systems to store structured patient relationship data, or its impact on enabling better healthcare. We sought to identify which EHR systems supported effective patient relationship data collection, and for systems that do, what types of relationship data is collected, how this data is used, and the perceived value of doing so. MATERIALS AND METHODS: We performed a literature search to identify EHR systems that supported patient relationship data collection. Based on our results, we defined attributes of an effective patient relationship model. The Open Medical Record System (OpenMRS), an open source medical record platform for underserved settings met our eligibility criteria for effective patient relationship collection. We performed a survey to understand how the OpenMRS patient relationship model was used, and how it brought value to implementers. RESULTS: The OpenMRS patient relationship model has won widespread adoption across many implementations and is perceived to be valuable in enabling better health care delivery. Patient relationship information is widely used for community health programs and enabling chronic care. Additionally, many OpenMRS implementers were using this feature to collect custom relationship types for implementation specific needs. CONCLUSIONS: We believe that flexible patient relationship data collection is critical for better healthcare, and can inform community care and chronic care initiatives across the world. Additionally, patient relationship data could also be leveraged for many other initiatives such as patient centric care and in the field of precision medicine.


Subject(s)
Delivery of Health Care , Electronic Health Records , Interpersonal Relations , Data Collection , Humans
16.
AMIA Annu Symp Proc ; 2017: 1034-1043, 2017.
Article in English | MEDLINE | ID: mdl-29854171

ABSTRACT

Despite unprecedented spending, US maternal outcomes have worsened drastically over the past decade. In comparison, maternal outcomes of many Low and Middle-Income Countries (LMIC) have improved. Lessons learnt from their success may be applicable to the US. We performed a literature review to identify innovations that had met with success across LMIC, and should be considered for adoption in the US. mHealth and patient facing alerts, Telehealth, patient controlled health records, inclusion of patient relationship data in health information systems and positioning empowered community health workers as catalysts of maternal care delivery were identified as innovations worthy of further evaluation. These innovations were categorized into several themes: knowledge, technology, patient/community empowerment, coordination and process change. Tools that place informed and empowered patients and community members at the center of maternal care have greatly improved maternal outcomes, and are suitable to be considered for the US healthcare system.


Subject(s)
Health Information Systems , Maternal Health Services/organization & administration , Maternal Health , Delivery of Health Care , Electronic Health Records , Female , Global Health , Humans , Maternal Health Services/standards , Medical Informatics , Pregnancy , Telemedicine , United States
17.
Stud Health Technol Inform ; 245: 442-446, 2017.
Article in English | MEDLINE | ID: mdl-29295133

ABSTRACT

Recent focus on Precision medicine (PM) has led to a flurry of research activities across the developed world. But how can understaffed and underfunded health care systems in the US and elsewhere evolve to adapt PM to address pressing healthcare needs? We offer guidance on a wide range of sources of healthcare data / knowledge as well as other infrastructure / tools that could inform PM initiatives, and may serve as low hanging fruit easily adapted on the incremental pathway towards a PM based healthcare system. Using these resources and tools, we propose an incremental adoption pathway to inform implementers working in underserved communities around the world on how they should position themselves to gradually embrace the concepts of PM with minimal interruption to existing care delivery.


Subject(s)
Delivery of Health Care , Precision Medicine , Confidentiality , Humans
18.
Stud Health Technol Inform ; 245: 1354, 2017.
Article in English | MEDLINE | ID: mdl-29295433

ABSTRACT

Chronic care coordination efforts often focus on the needs of the healthcare team and not on the individual needs of each patient. However, developing a personalized care plan for patients with Chronic Kidney Disease (CKD) requires individual patient engagement with the health care team. We describe the development of a CKD e-care plan that focuses on patient specific needs and life goals, and can be personalized according to provider needs.


Subject(s)
Patient Care Team , Patient Participation , Renal Insufficiency, Chronic , Telemedicine , Humans
19.
J Biomed Inform ; 60: 145-52, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26826453

ABSTRACT

OBJECTIVES: Increased adoption of electronic health records has resulted in increased availability of free text clinical data for secondary use. A variety of approaches to obtain actionable information from unstructured free text data exist. These approaches are resource intensive, inherently complex and rely on structured clinical data and dictionary-based approaches. We sought to evaluate the potential to obtain actionable information from free text pathology reports using routinely available tools and approaches that do not depend on dictionary-based approaches. MATERIALS AND METHODS: We obtained pathology reports from a large health information exchange and evaluated the capacity to detect cancer cases from these reports using 3 non-dictionary feature selection approaches, 4 feature subset sizes, and 5 clinical decision models: simple logistic regression, naïve bayes, k-nearest neighbor, random forest, and J48 decision tree. The performance of each decision model was evaluated using sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. RESULTS: Decision models parameterized using automated, informed, and manual feature selection approaches yielded similar results. Furthermore, non-dictionary classification approaches identified cancer cases present in free text reports with evaluation measures approaching and exceeding 80-90% for most metrics. CONCLUSION: Our methods are feasible and practical approaches for extracting substantial information value from free text medical data, and the results suggest that these methods can perform on par, if not better, than existing dictionary-based approaches. Given that public health agencies are often under-resourced and lack the technical capacity for more complex methodologies, these results represent potentially significant value to the public health field.


Subject(s)
Decision Support Techniques , Information Storage and Retrieval , Medical Informatics , Neoplasms/epidemiology , Algorithms , Area Under Curve , Bayes Theorem , Electronic Health Records , Humans , Logistic Models , Predictive Value of Tests , Public Health , ROC Curve , Sensitivity and Specificity
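Four of the performance measures named in the abstract above (sensitivity, specificity, accuracy, and positive predictive value) all derive from the same confusion matrix. The sketch below shows the arithmetic on made-up labels; the function name and the example data are illustrative assumptions, not results from the study.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and PPV for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return None rather than dividing by zero when a class is absent.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical labels for 10 reports: 4 true cancer cases, 6 non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = confusion_metrics(y_true, y_pred)
```

Unlike sensitivity and specificity, positive predictive value depends on how common cases are in the evaluated reports, so it should be interpreted alongside the case prevalence.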
20.
J Med Syst ; 39(11): 182, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26446013

ABSTRACT

We sought to enable better interoperability and easy adoption of healthcare applications by developing a standardized domain independent Application Programming Interface (API) for an Electronic Medical Record (EMR) system. We leveraged the modular architecture of the Open Medical Record System (OpenMRS) to build a Fast Healthcare Interoperability Resources (FHIR) based add-on module that could consume FHIR resources and requests made on OpenMRS. The OpenMRS FHIR module supports a subset of FHIR resources that could be used to interact with clinical data persisted in OpenMRS. We demonstrate the ease of connecting healthcare applications using the FHIR API by integrating a third party Substitutable Medical Apps & Reusable Technology (SMART) application with OpenMRS via FHIR. The OpenMRS FHIR module is an optional component of the OpenMRS platform. The FHIR API significantly reduces the effort required to implement OpenMRS by preventing developers from having to learn or work with a domain specific OpenMRS API. We propose an integration pathway where the domain specific legacy OpenMRS API is gradually retired in favor of the new FHIR API, which would be integrated into the core OpenMRS platform. Our efforts indicate that a domain independent API is a reality for any EMR system. These efforts demonstrate the adoption of an emerging FHIR standard that is seen as a replacement for both Health Level 7 (HL7) Version 2 and Version 3. We propose a gradual integration approach where our FHIR API becomes the preferred method for communicating with the OpenMRS platform.


Subject(s)
Electronic Health Records/standards , Health Information Exchange/standards , Systems Integration , Health Level Seven , Humans , Mobile Applications
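To illustrate what interacting with clinical data through a FHIR API involves, the sketch below builds a FHIR "read" URL of the standard form [base]/Patient/[id] and extracts a display name from a Patient resource. The base URL, patient id, and sample resource are hypothetical, and the code makes no network call; it does not reflect the actual endpoint layout of the OpenMRS FHIR module, which the abstract does not specify.

```python
import json


def patient_read_url(base_url, patient_id):
    """Build a FHIR 'read' interaction URL of the form [base]/Patient/[id]."""
    return f"{base_url.rstrip('/')}/Patient/{patient_id}"


def display_name(patient):
    """Return 'Given Family' from the first HumanName of a FHIR Patient resource."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("not a Patient resource")
    names = patient.get("name") or [{}]
    first = names[0]
    given = " ".join(first.get("given", []))
    return f"{given} {first.get('family', '')}".strip()


# Hypothetical Patient resource, as a server might return it in JSON.
sample = json.loads("""
{
  "resourceType": "Patient",
  "id": "123",
  "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
  "gender": "female"
}
""")

url = patient_read_url("https://example.org/fhir/", "123")  # placeholder base URL
name = display_name(sample)
```

Because the resource shape is defined by the FHIR standard rather than by any one EMR, the same `display_name` helper would work against any conformant server, which is the interoperability benefit the abstract argues for.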