Results 1 - 15 of 15
1.
Appl Clin Inform ; 13(3): 700-710, 2022 05.
Article in English | MEDLINE | ID: mdl-35644141

ABSTRACT

BACKGROUND: Emergency department (ED)-based injury surveillance systems across many countries face resourcing challenges related to manual validation and coding of data. OBJECTIVE: This study describes the evaluation of a machine learning (ML)-based decision support tool (DST) to assist injury surveillance departments in the validation, coding, and use of their data, comparing coding time and accuracy pre- and postimplementation. METHODS: Manually coded injury surveillance data have been used to develop, train, and iteratively refine an ML-based classifier to enable semiautomated coding of injury narrative data. This paper describes a trial implementation of the ML-based DST in the Queensland Injury Surveillance Unit (QISU) workflow using a major pediatric hospital's ED data, comparing coding time and accuracy pre- and postimplementation. RESULTS: The study found a 10% reduction in manual coding time after the DST was introduced. Kappa statistics for DST-assisted and -unassisted data show an increase in accuracy across three data fields: injury intent (85.4% unassisted vs. 94.5% assisted), external cause (88.8% unassisted vs. 91.8% assisted), and injury factor (89.3% unassisted vs. 92.9% assisted). The classifier was also used to produce a timely report monitoring injury patterns during the novel coronavirus disease 2019 (COVID-19) pandemic; hence, it has the potential for near real-time surveillance of emerging hazards to inform public health responses. CONCLUSION: Integrating the DST into the injury surveillance workflow shows benefits: it facilitates timely reporting and supports coders during the manual coding process.
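The pre/post accuracy comparison above rests on Kappa statistics. As a minimal illustration (the injury-intent labels and coder data below are invented for the sketch, not QISU data), Cohen's kappa can be computed directly from two coders' label sequences:

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two raters' labels over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical injury-intent codes from an unassisted and a DST-assisted coder.
unassisted = ["fall", "fall", "burn", "cut", "fall", "burn", "cut", "cut", "fall", "burn"]
assisted   = ["fall", "fall", "burn", "cut", "fall", "cut",  "cut", "cut", "fall", "burn"]

print(round(cohen_kappa(unassisted, assisted), 3))  # → 0.848
```

Kappa discounts agreement expected by chance, so it is a more conservative measure than raw percent agreement (0.9 here).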


Subject(s)
COVID-19 , Emergency Service, Hospital , Hospital Information Systems , Wounds and Injuries , COVID-19/epidemiology , Child , Hospital Information Systems/organization & administration , Humans , Injury Severity Score , Machine Learning , Pandemics , Workflow , Wounds and Injuries/classification
2.
Risk Anal ; 40(7): 1342-1354, 2020 07.
Article in English | MEDLINE | ID: mdl-32339316

ABSTRACT

This study aimed to use healthcare professionals' assessments to calculate the expected risk of intravenous (IV) infusion harm for simulated high-risk medications that exceed soft limits and to investigate the impact of relevant risk factors. We designed 30 infusion scenarios for four high-risk medications, propofol, morphine, insulin, and heparin, infused in an adult intensive care unit (AICU) and an adult medical and surgical care unit (AMSU). A total of 20 pharmacists and 5 nurses provided their assessed expected risk of harm in each scenario. Descriptive statistics, analysis of variance with least square means, and post hoc tests were conducted to test the effects of field limit type, soft maximum (SoftMax) and hard maximum (HardMax) drug limit types, and care area-medication combination on risk of harm. The results showed that overdosing scenarios with continuous and bolus dose limit types were assessed as significantly higher risk than those of the bolus dose rate type. An overdose infusion in the AICU over a large SoftMax was assessed as higher risk than one over a small SoftMax, but not in the AMSU. For overdose infusions with three levels of drug amount, a greater drug amount in the AICU and AMSU was assessed as higher risk, except for an insignificant risk difference between infusions with higher and moderate drug amounts in the AMSU. This study obtained expected risk for simulated high-risk IV infusions and found that different field limit and SoftMax types can affect expected risk based on healthcare professionals' perspectives. The findings can serve as benchmarks for validating risk quantification models in future research.

3.
Accid Anal Prev ; 110: 115-127, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29127808

ABSTRACT

INTRODUCTION: Classical Machine Learning (ML) models have been found to assign external-cause-of-injury codes (E-codes) from injury narratives with good overall accuracy but often struggle with rare categories, primarily due to a lack of training cases and the heavily skewed nature of injury data. In this paper, we have: (a) studied the effect of increasing the size of the training data on the prediction performance of three classical ML models: Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), and Logistic Regression (LR); and (b) studied the effect of filtering based on the prediction strength of the LR model when the model is trained on very small (10,000 cases) and very large (450,000 cases) training sets. METHOD: Data from the Queensland Injury Surveillance Unit for the years 2002-2012, categorized into 20 broad E-codes, were used for this study. Eleven randomly chosen training sets ranging in size from 10,000 to 450,000 cases were used to train the ML models, and prediction performance was analyzed on a prediction set of 50,150 cases. The filtering approach was tested on LR models trained on the smallest and largest training datasets. Sensitivity was used as the performance measure for individual categories. Weighted average sensitivity (WAvg) and unweighted average sensitivity (UAvg) were used as measures of overall performance. The filtering approach was also tested for estimating category counts and was compared with summing prediction probabilities and counting direct predictions by the ML model. RESULTS: The overall performance of all three ML models improved as the size of the training data increased. The overall sensitivities at the maximum training size for the LR and SVM models were similar (~82%) and higher than MNB (76%). For all the ML models, the sensitivities of rare categories improved with increasing training data but remained considerably lower than those of larger categories. With increasing training data size, LR and SVM exhibited diminishing improvement in UAvg, whereas the improvement was relatively steady in the case of MNB. Filtering based on the prediction strength of the LR model (and manual review of filtered cases) helped improve the sensitivities of rare categories. A sizeable portion of cases still needed to be filtered even when the LR model was trained on the very large training set. For estimating category counts, the filtering approach provided the best estimates for most E-codes, and summing prediction probabilities provided better estimates for rare categories. CONCLUSIONS: Increasing the size of training data alone cannot solve the problem of poor classification performance on rare categories by ML models. Filtering can be an effective strategy to improve classification performance on rare categories when large training datasets are not available.
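The WAvg/UAvg distinction above is why rare categories can look fine in overall accuracy yet drag down the unweighted average. A small sketch with invented labels (not the QISU data) shows the two averages diverging when a rare category is misclassified:

```python
from collections import Counter, defaultdict

def per_class_sensitivity(y_true, y_pred):
    """Recall (sensitivity) for each category in y_true."""
    correct = defaultdict(int)
    total = Counter(y_true)
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}

def avg_sensitivities(y_true, y_pred):
    """Return (WAvg, UAvg): sensitivity averaged with and without
    weighting by category size."""
    sens = per_class_sensitivity(y_true, y_pred)
    total = Counter(y_true)
    n = len(y_true)
    wavg = sum(sens[c] * total[c] / n for c in sens)  # weighted by category size
    uavg = sum(sens.values()) / len(sens)             # each category counts equally
    return wavg, uavg

# Toy E-code labels: "fall" is common, "poison" is rare.
y_true = ["fall"] * 8 + ["poison"] * 2
y_pred = ["fall"] * 8 + ["fall", "poison"]   # one of two rare cases misclassified

wavg, uavg = avg_sensitivities(y_true, y_pred)
print(round(wavg, 2), round(uavg, 2))  # → 0.9 0.75
```

Half the rare category is lost, yet WAvg barely moves; UAvg exposes the problem the abstract describes.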


Subject(s)
Emergency Service, Hospital , Wounds and Injuries/classification , Bayes Theorem , Humans , Logistic Models , Machine Learning , Queensland , Support Vector Machine
4.
Accid Anal Prev ; 98: 359-371, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27863339

ABSTRACT

Injury narratives are now available in real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes with single-word and bi-gram models, Support Vector Machine, and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event-leading-to-injury classifications for a large workers compensation database. These algorithms are known to perform well on narrative text classification and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches that maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging, or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine, whereby the triple ensemble NB-SW = NB-Bi-gram = SVM had very high performance (0.93 overall sensitivity/positive predictive value), with high sensitivity and positive predictive values across both large and small categories, leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly. For large administrative datasets we propose incorporating human-machine pairing methods such as those demonstrated here, which utilize readily available off-the-shelf machine learning techniques and leave only a fraction of narratives requiring manual review. Human-machine ensemble methods are likely to improve performance over fully manual coding.
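The agreement-between-algorithms filter described above can be sketched as follows. The classifier names and predicted codes are illustrative placeholders, not the paper's actual models or data; a narrative is machine-coded only when every ensemble member agrees, and otherwise goes to a human coder:

```python
def ensemble_filter(predictions):
    """predictions: list of dicts mapping classifier name -> predicted code.
    Returns (auto_coded, for_review) index lists: a narrative is auto-coded
    only when every classifier in the ensemble assigns the same code."""
    auto, review = [], []
    for i, preds in enumerate(predictions):
        codes = set(preds.values())
        (auto if len(codes) == 1 else review).append(i)
    return auto, review

# Hypothetical predictions from NB single-word, NB bi-gram, and SVM models.
preds = [
    {"nb_sw": "fall", "nb_bigram": "fall", "svm": "fall"},      # agree -> auto
    {"nb_sw": "struck", "nb_bigram": "fall", "svm": "struck"},  # disagree -> review
    {"nb_sw": "overexertion", "nb_bigram": "overexertion", "svm": "overexertion"},
]
auto, review = ensemble_filter(preds)
print(auto, review)  # → [0, 2] [1]
```

Requiring unanimous agreement trades review volume (41% in the study) for higher accuracy on the machine-coded portion.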


Subject(s)
Accidents, Occupational/statistics & numerical data , Algorithms , Databases, Factual/statistics & numerical data , Public Health Surveillance/methods , Wounds and Injuries/epidemiology , Bayes Theorem , Clinical Coding/methods , Humans , Logistic Models , Machine Learning , Models, Theoretical , Narration , Workers' Compensation/statistics & numerical data
5.
J Safety Res ; 57: 71-82, 2016 06.
Article in English | MEDLINE | ID: mdl-27178082

ABSTRACT

INTRODUCTION: Studies on autocoding injury data have found that machine learning algorithms perform well for categories that occur frequently but often struggle with rare categories. Therefore, manual coding, although resource-intensive, cannot be eliminated. We propose a Bayesian decision support system to autocode a large portion of the data, filter cases for manual review, and assist human coders by presenting them top k prediction choices and a confusion matrix of predictions from Bayesian models. METHOD: We studied the prediction performance of Single-Word (SW) and Two-Word-Sequence (TW) Naïve Bayes models on a sample of data from the 2011 Survey of Occupational Injury and Illness (SOII). We used the agreement in prediction results of SW and TW models, and various prediction strength thresholds for autocoding and filtering cases for manual review. We also studied the sensitivity of the top k predictions of the SW model, TW model, and SW-TW combination, and then compared the accuracy of the manually assigned codes to SOII data with that of the proposed system. RESULTS: The accuracy of the proposed system, assuming well-trained coders reviewing a subset of only 26% of cases flagged for review, was estimated to be comparable (86.5%) to the accuracy of the original coding of the data set (range: 73%-86.8%). Overall, the TW model had higher sensitivity than the SW model, and the accuracy of the prediction results increased when the two models agreed, and for higher prediction strength thresholds. The sensitivity of the top five predictions was 93%. CONCLUSIONS: The proposed system seems promising for coding injury data as it offers comparable accuracy and less manual coding. PRACTICAL APPLICATIONS: Accurate and timely coded occupational injury data is useful for surveillance as well as prevention activities that aim to make workplaces safer.
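The top-k sensitivity reported above (93% for the top five predictions) amounts to a hit rate over each narrative's ranked code list. The code lists and truth labels below are invented for illustration, not SOII data:

```python
def top_k_hit_rate(ranked_predictions, y_true, k=5):
    """Fraction of cases whose true code appears in the model's top-k list."""
    hits = sum(t in ranked[:k] for ranked, t in zip(ranked_predictions, y_true))
    return hits / len(y_true)

# Hypothetical ranked occupational-injury code lists from a Naive Bayes model.
ranked = [
    ["64", "31", "12", "52", "99"],
    ["31", "64", "99", "12", "52"],
    ["12", "52", "31", "99", "64"],
    ["99", "12", "64", "31", "52"],
]
truth = ["64", "99", "01", "12"]   # third case's true code is never predicted

print(top_k_hit_rate(ranked, truth, 1), top_k_hit_rate(ranked, truth, 5))  # → 0.25 0.75
```

Presenting a coder with the top k choices, as the proposed system does, only helps when this hit rate is high, which is why the 93% figure matters.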


Subject(s)
Clinical Coding/methods , Decision Support Techniques , Occupational Injuries/classification , Algorithms , Bayes Theorem , Humans , Models, Theoretical
6.
Inj Prev ; 22 Suppl 1: i34-42, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26728004

ABSTRACT

OBJECTIVE: Vast amounts of injury narratives are collected daily and are available electronically in real time and have great potential for use in injury surveillance and evaluation. Machine learning algorithms have been developed to assist in identifying cases and classifying mechanisms leading to injury in a much timelier manner than is possible when relying on manual coding of narratives. The aim of this paper is to describe the background, growth, value, challenges and future directions of machine learning as applied to injury surveillance. METHODS: This paper reviews key aspects of machine learning using injury narratives, providing a case study to demonstrate an application to an established human-machine learning approach. RESULTS: The range of applications and utility of narrative text has increased greatly with advancements in computing techniques over time. Practical and feasible methods exist for semiautomatic classification of injury narratives which are accurate, efficient and meaningful. The human-machine learning approach described in the case study achieved high sensitivity and PPV and reduced the need for human coding to less than a third of cases in one large occupational injury database. CONCLUSIONS: The last 20 years have seen a dramatic change in the potential for technological advancements in injury surveillance. Machine learning of 'big injury narrative data' opens up many possibilities for expanded sources of data which can provide more comprehensive, ongoing and timely surveillance to inform future injury prevention policy and practice.


Subject(s)
Accidents, Occupational/classification , Data Mining/methods , Machine Learning , Occupational Injuries/classification , Population Surveillance/methods , Databases, Factual , Humans , Models, Theoretical
7.
Accid Anal Prev ; 84: 165-76, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26412196

ABSTRACT

Public health surveillance programs in the U.S. are undergoing landmark changes with the availability of electronic health records and advancements in information technology. Injury narratives gathered from hospital records, workers compensation claims or national surveys can be very useful for identifying antecedents to injury or emerging risks. However, classifying narratives manually can become prohibitive for large datasets. The purpose of this study was to develop a human-machine system that could be relatively easily tailored to routinely and accurately classify injury narratives from large administrative databases such as workers compensation. We used a semi-automated approach based on two Naïve Bayesian algorithms to classify 15,000 workers compensation narratives into two-digit Bureau of Labor Statistics (BLS) event (leading to injury) codes. Narratives were filtered out for manual review if the algorithms disagreed or made weak predictions. This approach resulted in an overall accuracy of 87%, with consistently high positive predictive values across all two-digit BLS event categories including the very small categories (e.g., exposure to noise, needle sticks). The Naïve Bayes algorithms were able to identify and accurately machine code most narratives leaving only 32% (4853) for manual review. This strategy substantially reduces the need for resources compared with manual review alone.
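The routing rule described above, filtering a narrative for manual review when the two Naïve Bayes variants disagree or predict weakly, might be sketched as below. The probabilities, threshold value, and code names are illustrative assumptions, not values from the study:

```python
def route_narrative(p_model_a, p_model_b, threshold=0.9):
    """Decide whether a narrative is machine-coded or sent for manual review.
    p_model_a / p_model_b: dicts of code -> posterior probability from two
    Naive Bayes variants. Auto-code only when both models pick the same code
    and at least one prediction clears the strength threshold."""
    code_a = max(p_model_a, key=p_model_a.get)
    code_b = max(p_model_b, key=p_model_b.get)
    strong = max(p_model_a[code_a], p_model_b[code_b]) >= threshold
    if code_a == code_b and strong:
        return ("machine", code_a)
    return ("manual", None)

print(route_narrative({"fall": 0.95, "struck": 0.05},
                      {"fall": 0.80, "struck": 0.20}))  # → ('machine', 'fall')
print(route_narrative({"fall": 0.55, "struck": 0.45},
                      {"struck": 0.60, "fall": 0.40}))  # → ('manual', None)
```

Tuning the threshold moves cases between the machine-coded pool and the manual-review queue, which is how the study arrives at only 32% of narratives needing review.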


Subject(s)
Accidents, Occupational/statistics & numerical data , Databases, Factual/statistics & numerical data , Public Health Surveillance/methods , Workers' Compensation/statistics & numerical data , Wounds and Injuries/epidemiology , Adult , Aged , Algorithms , Bayes Theorem , Clinical Coding , Female , Humans , Incidence , Male , Middle Aged , Narration , Prevalence , Reproducibility of Results , United States/epidemiology
8.
Accid Anal Prev ; 62: 119-29, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24144497

ABSTRACT

BACKGROUND: In occupational safety research, narrative text analysis has been combined with coded surveillance data to improve identification and understanding of injuries and their circumstances. Injury data give information about incidence and the direct cause of an injury, while near-miss data enable the identification of various hazards within an organization or industry. Further, near-miss data provide an opportunity for surveillance and risk reduction. The National Firefighter Near-Miss Reporting System (NFFNMRS) is a voluntary reporting system that collects narrative text data on near-miss and injurious events within the fire and emergency services industry. In recent research, autocoding techniques using Bayesian models have been used to categorize/code injury narratives with up to 90% accuracy, thereby reducing the amount of human effort required to manually code large datasets. Autocoding techniques have not yet been applied to near-miss narrative data. METHODS: We manually assigned mechanism of injury codes to previously un-coded narratives from the NFFNMRS and used this as a training set to develop two Bayesian autocoding models, Fuzzy and Naïve. We calculated sensitivity, specificity and positive predictive value for both models. We also evaluated the effect of training set size on prediction sensitivity and compared the models' predictive ability as related to injury outcome. We cross-validated a subset of the prediction set for accuracy of the model predictions. RESULTS: Overall, the Fuzzy model performed better than Naïve, with a sensitivity of 0.74 compared to 0.678. Where Fuzzy and Naïve shared the same prediction, the cross-validation showed a sensitivity of 0.602. As the number of records in the training set increased, the models performed at a higher sensitivity, suggesting that both the Fuzzy and Naïve models were essentially "learning". Injury records were predicted with greater sensitivity than near-miss records. CONCLUSION: We conclude that the application of Bayesian autocoding methods can successfully code both near misses and injuries in longer-than-average narratives with non-specific prompts regarding injury. Such coding allowed for the creation of two new quantitative data elements for injury outcome and injury mechanism.


Subject(s)
Accidents, Occupational/statistics & numerical data , Data Mining/methods , Emergency Medical Services , Firefighters/statistics & numerical data , Occupational Injuries/epidemiology , Bayes Theorem , Humans , Models, Statistical , Narration , Occupational Health , United States/epidemiology
9.
Appl Clin Inform ; 1(4): 466-85, 2010.
Article in English | MEDLINE | ID: mdl-23616855

ABSTRACT

OBJECTIVE: Computerized clinical reminder (CCR) systems can improve preventive service delivery by providing patient-specific reminders at the point of care. However, adherence varies between individual CCRs and is correlated to resolution time amongst other factors. This study aimed to evaluate how a proposed CCR redesign providing information explaining why the CCRs occurred would impact providers' prioritization of individual CCRs. DESIGN: Two CCR designs were prototyped to represent the original and the new design, respectively. The new CCR design incorporated a knowledge-based risk factor repository, a prioritization mechanism, and a role-based filter. Sixteen physicians participated in a controlled experiment to compare the use of the original and the new CCR systems. The subjects individually simulated a scenario-based patient encounter, followed by a semi-structured interview and survey. MEASUREMENTS: We collected and analyzed the order in which the CCRs were prioritized, the perceived usefulness of each design feature, and semi-structured interview data. RESULTS: We elicited the prioritization heuristics used by the physicians, and found a CCR system needed to be relevant, easy to resolve, and integrated with workflow. The redesign impacted 80% of physicians and 44% of prioritization decisions. Decisions were no longer correlated to resolution time given the new design. The proposed design features were rated useful or very useful. CONCLUSION: This study demonstrated that the redesign of a CCR system using a knowledge-based risk factor repository, a prioritization mechanism, and a role-based filter can impact clinicians' decision making. These features are expected to ultimately improve the quality of care and patient safety.

10.
AMIA Annu Symp Proc ; : 334-8, 2007 Oct 11.
Article in English | MEDLINE | ID: mdl-18693853

ABSTRACT

Electronic decision support systems are an important tool for improving performance and quality of care. We investigated the relationship between physicians' estimated resolution times for computerized clinical reminders (CCRs) and adherence rates in VA outpatient settings. We surveyed 10 expert physician users to assess the resolution times of four targeted CCRs for three cases: pessimistic (worst case), expected (average), and optimistic (best case). An ANOVA test showed that physicians' adherence rates for the four CCRs differed significantly (p = 0.01). CCR adherence rate and resolution time were highly linearly correlated (R-square = 0.876 for the best case, R-square = 0.997 for the average case, and R-square = 0.670 for the worst case). This study suggests that future efforts in designing CCRs need to take resolution time into consideration during the design, usability testing, and implementation phases.
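The R-square values above come from simple linear fits of adherence rate against resolution time. A minimal sketch of the computation, using invented data rather than the study's survey results (for a two-variable linear fit, R-square equals the squared Pearson correlation):

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit
    (squared Pearson correlation between x and y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov * cov / (var_x * var_y)

# Hypothetical data: longer resolution time, lower adherence rate.
resolution_min = [1, 2, 4, 8]
adherence_pct = [90, 80, 60, 30]

print(round(r_squared(resolution_min, adherence_pct), 3))  # → 0.994
```

A value this close to 1 mirrors the study's average-case finding (R-square = 0.997): resolution time alone explains nearly all the variation in adherence.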


Subject(s)
Decision Support Systems, Clinical , Guideline Adherence , Practice Guidelines as Topic , Reminder Systems , Analysis of Variance , Data Collection , Humans , Medical Records Systems, Computerized , Physicians , Time
11.
AMIA Annu Symp Proc ; : 1015, 2006.
Article in English | MEDLINE | ID: mdl-17238634

ABSTRACT

We are examining the workflow processes within a large, urban general internal medicine practice in order to understand task inefficiencies that can lead to medical errors. We are performing a time-motion study looking at the task management of check-in and check-out clerks, nurses, nurses' aides, and physicians. Our pilot data suggest that there is significant variability in task burden at different times of the day due to several factors, including patient show rates, time allotted to late arrivals, multi-tasking by check-in clerks, and inefficient intra-clinic communication processes. Our initial data suggest that a streamlined check-in process, more effective communication strategies, and better time-task utilization can improve patient flow in this clinic.


Subject(s)
Primary Health Care/organization & administration , Task Performance and Analysis , Ambulatory Care Facilities/organization & administration , Efficiency, Organizational , Humans , Medical Order Entry Systems , Urban Health Services
12.
Accid Anal Prev ; 36(2): 165-71, 2004 Mar.
Article in English | MEDLINE | ID: mdl-14642871

ABSTRACT

OBJECTIVE: To investigate the accuracy of a computerized method for classifying injury narratives into external-cause-of-injury and poisoning (E-code) categories. METHODS: This study used injury narratives and corresponding E-codes assigned by experts from the 1997 and 1998 US National Health Interview Survey (NHIS). A Fuzzy Bayesian model was used to assign injury descriptions to 13 E-code categories. Sensitivity, specificity and positive predictive value were measured by comparing the computer generated codes with E-code categories assigned by experts. RESULTS: The computer program correctly classified 4695 (82.7%) of the 5677 injury narratives when multiple words were included as keywords in the model. The use of multiple-word predictors compared with using single words alone improved both the sensitivity and specificity of the computer generated codes. The program is capable of identifying and filtering out cases that would benefit most from manual coding. For example, the program could be used to code the narrative if the maximum probability of a category given the keywords in the narrative was at least 0.9. If the maximum probability was lower than 0.9 (which will be the case for approximately 33% of the narratives) the case would be filtered out for manual review. CONCLUSIONS: A computer program based on Fuzzy Bayes logic is capable of accurately categorizing cause-of-injury codes from injury narratives. The capacity to filter out certain cases for manual coding improves the utility of this process.
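The 0.9 maximum-probability cutoff described above is a straightforward confidence split over the model's per-narrative maximum category probability. The probability values below are invented for illustration, not NHIS results:

```python
def split_by_confidence(max_probs, threshold=0.9):
    """Indices of narratives to auto-code vs. filter for manual review,
    based on the model's maximum posterior probability per narrative."""
    auto = [i for i, p in enumerate(max_probs) if p >= threshold]
    review = [i for i, p in enumerate(max_probs) if p < threshold]
    return auto, review

# Hypothetical maximum category probabilities from a Fuzzy Bayes model.
max_probs = [0.97, 0.42, 0.91, 0.88, 0.99, 0.63]
auto, review = split_by_confidence(max_probs)

print(auto, review)  # → [0, 2, 4] [1, 3, 5]
print(round(len(review) / len(max_probs), 2))  # → 0.5
```

In the study, roughly 33% of narratives fell below the 0.9 cutoff and were routed to manual coding; the threshold is the lever that trades coding effort against accuracy.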


Subject(s)
Forms and Records Control/methods , Medical Records Systems, Computerized , Trauma Severity Indices , Wounds and Injuries/classification , Fuzzy Logic , Health Surveys , Humans , Models, Theoretical , Predictive Value of Tests , Software , United States/epidemiology , Wounds and Injuries/epidemiology
13.
Ergonomics ; 46(1-3): 52-67, 2003 Jan 15.
Article in English | MEDLINE | ID: mdl-12554398

ABSTRACT

Customers using printers occasionally experience problems such as fuzzy images, bands, or streaks. The customer may call or otherwise contact the manufacturer, who attempts to diagnose the problem based on the customer's description of the problem. This study evaluated Bayesian inference as a tool for identifying or diagnosing 16 different types of print defects from such descriptions. The Bayesian model was trained using 1701 narrative descriptions of print defects obtained from 60 subjects with varying technical backgrounds. The Bayesian model was then implemented as an interactive decision support system, which was used by eight 'agents' to diagnose print defects reported by 16 'customers' in a simulated call centre. The 'agents' and 'customers' in the simulated call centre were all students at Purdue University. Each customer made eight telephone calls, resulting in a total of 128 telephone calls in which the customer reported defects to the agents. The results showed that the Bayesian model closely fitted the data in the training set of narratives. Overall, the model correctly predicted the actual defect category with its top prediction 70% of the time. The actual defect was in the top five predictions 94% of the time. The model in the simulated call centre performed nearly as well for the test subjects. The top prediction was correct 50% of the time, and the defect was one of the top five predictions 80% of the time. Agent accuracy in diagnosing the problem improved when using the tool. These results demonstrated that the Bayesian system learned enough from the existing narratives to accurately classify print defect categories.


Subject(s)
Computer Peripherals/standards , Decision Support Techniques , Ergonomics , Information Centers/organization & administration , Information Centers/statistics & numerical data , Printing/standards , Adolescent , Adult , Bayes Theorem , Color , Computer Simulation , Consumer Behavior , Humans , Indiana , Quality Control , Universities
14.
Accid Anal Prev ; 34(6): 793-805, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12371784

ABSTRACT

Past research in safety belt use has primarily focused on describing the relationship between drivers' demographic characteristics and safety belt use. This study compared the impact of situational factors (the direction of collision, the type of road, and the presence of an airbag system), demographic factors, and constructs (criteria) elicited from subjects regarding safety belt use. Based on the results obtained, a conceptual model was developed. The model indicated that drivers' decision-making process when judging the level of accident risk and usefulness of safety belts differs from those that determine actual behavior. Perceived risk was related to road type, perceived consequences of an accident, perceived usefulness of safety belts, self responsibility, the time available for the driver to warn the other driver, dangerous behavior, and gender. These variables showed that people were able to rationally judge the risk. Despite the fact that people judge behavior in what appeared to be a rational manner, risk perception was not a good predictor of belt use. Belt use was mainly influenced by individual factors such as gender, grade point average (GPA), and age. Other factors impacting safety belt use included the perceived frequency of an accident and the S.D. of perceived usefulness of safety belts.


Subject(s)
Attitude , Automobile Driving/psychology , Choice Behavior , Seat Belts/statistics & numerical data , Adolescent , Adult , Age Factors , Analysis of Variance , Female , Humans , Illinois , Male , Regression Analysis , Sex Factors
15.
Int J Occup Saf Ergon ; 1(3): 215-234, 1995 Jan.
Article in English | MEDLINE | ID: mdl-10603554

ABSTRACT

A distributed signal detection theory model is employed to analyze the effectiveness of warnings under different operating conditions. In particular, the following two cases are examined: (a) the warning on a product is always present, and (b) the warning on a product is administered selectively. The comparative effects of warning versus no warning are described. It is established that selectivity always increases effectiveness. The implications for optimal warning design of intermittent versus continuous hazard are discussed. Furthermore, a series of experiments is conducted to compare the behavior of human participants with the prescriptive behavior of the normative model. The changes in the behavior of the human participants in response to changes in the warning levels are consistent with the predictions of the model. These changes should be taken into consideration in the design of warnings.
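A common sensitivity index in signal detection theory is d' = z(hit rate) - z(false-alarm rate). The sketch below uses invented hit and false-alarm rates (not the paper's experimental data) to illustrate one way selective administration can raise sensitivity: if responding to the warning when the hazard is absent counts as a false alarm, lowering that rate increases d':

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Signal detection sensitivity index: d' = z(H) - z(FA),
    where z is the inverse standard normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical compliance data: responding to a warning when the hazard is
# present (hit) vs. when it is absent (false alarm).
always_on = d_prime(0.80, 0.30)   # warning always present
selective = d_prime(0.80, 0.10)   # warning administered selectively

print(round(always_on, 2), round(selective, 2))  # → 1.37 2.12
```

This is a simplification of the paper's distributed model, but it captures the direction of the result: selectivity improves the separation between hazard-present and hazard-absent responses.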
