Results 1 - 7 of 7
1.
EGEMS (Wash DC) ; 4(1): 1254, 2016.
Article in English | MEDLINE | ID: mdl-27668266

ABSTRACT

INTRODUCTION: The incidence of incidentally detected lung nodules is rapidly rising, but little is known about their management or associated patient outcomes. One barrier to studying lung nodule care is the inability to efficiently and reliably identify the cohort of interest (i.e. cases). Investigators at Kaiser Permanente Southern California (KPSC) recently developed an automated method to identify individuals with an incidentally discovered lung nodule, but the feasibility of implementing this method across other health systems is unknown. METHODS: A random sample of Group Health (GH) members who had a computed tomography in 2012 underwent chart review to determine if a lung nodule was documented in the radiology report. A previously developed natural language processing (NLP) algorithm was implemented at our site using only knowledge of the key words, qualifiers, excluding terms, and the logic linking these parameters. RESULTS: Among 499 subjects, 156 (31%, 95% confidence interval [CI] 27-36%) had an incidentally detected lung nodule. NLP identified 189 (38%, 95% CI 33-42%) individuals with a nodule. The accuracy of NLP at GH was similar to its accuracy at KPSC: sensitivity 90% (95% CI 85-95%) and specificity 86% (95% CI 82-89%) versus sensitivity 96% (95% CI 88-100%) and specificity 86% (95% CI 75-94%). CONCLUSION: Automated methods designed to identify individuals with an incidentally detected lung nodule can feasibly and independently be implemented across health systems. Use of these methods will likely facilitate the efficient conduct of multi-site studies evaluating practice patterns and associated outcomes.
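
The accuracy figures in the abstract above can be reproduced in outline as follows. This is a minimal sketch, not the authors' analysis: the abstract reports only totals (499 subjects, 156 with a nodule on chart review, 189 flagged by NLP), so the four cell counts below are illustrative values back-calculated to match the reported percentages. The Wilson score interval is also an assumption, since the abstract does not say how its confidence intervals were computed.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Return the Wilson score interval (low, high) for a binomial proportion."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute sensitivity and specificity, each with a 95% Wilson interval."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# Illustrative counts only (assumption): chosen so that TP + FN = 156,
# TP + FP = 189, and all four cells sum to 499, as stated in the abstract.
TP, FN = 140, 16
FP, TN = 49, 294

metrics = diagnostic_accuracy(TP, FP, FN, TN)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With these assumed counts, sensitivity and specificity round to the 90% and 86% reported for GH. The intervals may differ slightly from the published ones if the authors used a different interval method.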

2.
PLoS One ; 9(11): e112774, 2014.
Article in English | MEDLINE | ID: mdl-25393544

ABSTRACT

A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been "solved." This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP.


Subject(s)
Algorithms , Artificial Intelligence/statistics & numerical data , Natural Language Processing , Clinical Medicine/education , Humans , Semantics , Textbooks as Topic , Vocabulary, Controlled
3.
J Am Med Inform Assoc ; 21(5): 858-65, 2014.
Article in English | MEDLINE | ID: mdl-24637954

ABSTRACT

OBJECTIVE: We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. METHODS: Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expression. Then, each medication name and its attributes were combined together according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture. We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. RESULTS: An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information are due to human assumption of missing attributes and medication names in the gold standard. CONCLUSIONS: The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions.


Subject(s)
Data Mining/methods , Electronic Health Records , Natural Language Processing , Pharmaceutical Preparations , RxNorm , Drug Therapy , Humans
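
The extraction step described in the MedXN abstract above (dictionary lookup for medication names, regular expressions for attributes) might look roughly like the sketch below. Everything in it is hypothetical: the two-entry dictionary stands in for an RxNorm lookup, the concept identifiers are placeholders rather than real RxCUIs, and the patterns and the 40-character attribute window are illustrative choices. It does not reproduce MedXN's UIMA pipeline, synonym expansion, or inference rules.

```python
import re

# Hypothetical stand-in for an RxNorm name dictionary; the identifiers are
# placeholders, not real RxCUIs.
DRUG_DICTIONARY = {
    "aspirin": "PLACEHOLDER-ASPIRIN",
    "metformin": "PLACEHOLDER-METFORMIN",
}

# Illustrative attribute patterns (assumptions, far from exhaustive).
STRENGTH = re.compile(r"\b\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml)\b", re.IGNORECASE)
FREQUENCY = re.compile(r"\b(?:once daily|twice daily|daily|bid|tid|qid)\b", re.IGNORECASE)

WINDOW = 40  # characters searched after each medication name (assumption)


def _first(pattern, window):
    """Return the first match of pattern in window, or None if there is none."""
    match = pattern.search(window)
    return match.group(0) if match else None


def extract_medications(text):
    """Find dictionary medication names and the attributes that follow them."""
    found = []
    for name, concept_id in DRUG_DICTIONARY.items():
        for match in re.finditer(rf"\b{re.escape(name)}\b", text, re.IGNORECASE):
            # A fixed window is a crude heuristic: it can pick up attributes
            # that belong to a neighbouring medication.
            window = text[match.end():match.end() + WINDOW]
            found.append({
                "start": match.start(),
                "name": name,
                "concept_id": concept_id,
                "strength": _first(STRENGTH, window),
                "frequency": _first(FREQUENCY, window),
            })
    # Report mentions in the order they appear in the note.
    return sorted(found, key=lambda item: item["start"])


note = "Patient takes metformin 500 mg twice daily and aspirin 81 mg daily."
for mention in extract_medications(note):
    print(mention["name"], mention["strength"], mention["frequency"])
```

Extracting the name and its attributes separately, then combining them, follows the decomposition the abstract describes. Normalization to a specific RxNorm concept would need the real RxNorm data and is left out here.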
5.
Am J Epidemiol ; 179(6): 749-58, 2014 Mar 15.
Article in English | MEDLINE | ID: mdl-24488511

ABSTRACT

The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.


Subject(s)
Breast Neoplasms/diagnosis , Electronic Health Records/statistics & numerical data , Natural Language Processing , Neoplasm Recurrence, Local/diagnosis , Age Factors , Aged , Breast Neoplasms/physiopathology , Breast Neoplasms/therapy , Female , Humans , Middle Aged , Neoplasm Grading , Neoplasm Recurrence, Local/physiopathology , Neoplasm Recurrence, Local/therapy , Reference Standards , Reproducibility of Results
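
As a quick consistency check on the breast cancer recurrence abstract above, the reported 92% follows from the stated counts if the 65 recurrences are taken as the denominator (an assumption, since the abstract does not spell out the calculation):

```latex
\[
\text{sensitivity} = \frac{65 - 5}{65} = \frac{60}{65} \approx 0.923 \approx 92\%
\]
```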
6.
Biomed Inform Insights ; 6(Suppl 1): 7-16, 2013.
Article in English | MEDLINE | ID: mdl-23847423

ABSTRACT

A large amount of medication information resides in the unstructured text found in electronic medical records, which requires advanced techniques to be properly mined. In clinical notes, medication information follows certain semantic patterns (eg, medication, dosage, frequency, and mode). Some medication descriptions contain additional word(s) between medication attributes. Therefore, it is essential to understand the semantic patterns as well as the patterns of the context interspersed among them (ie, context patterns) to effectively extract comprehensive medication information. In this paper we examined both semantic and context patterns, and compared those found in Mayo Clinic and i2b2 challenge data. We found that some variations exist between the institutions but the dominant patterns are common.

7.
J Biomed Semantics ; 2 Suppl 3: S2, 2011.
Article in English | MEDLINE | ID: mdl-21992591

ABSTRACT

BACKGROUND: Extracting medication information from clinical records has many potential applications, and recently published research, systems, and competitions reflect an interest therein. Much of the early extraction work involved rules and lexicons, but more recently machine learning has been applied to the task. METHODS: We present a hybrid system consisting of two parts. The first part, field detection, uses a cascade of statistical classifiers to identify medication-related named entities. The second part uses simple heuristics to link those entities into medication events. RESULTS: The system achieved performance that is comparable to other approaches to the same task. This performance is further improved by adding features that reference external medication name lists. CONCLUSIONS: This study demonstrates that our hybrid approach outperforms purely statistical or rule-based systems. The study also shows that a cascade of classifiers works better than a single classifier in extracting medication information. The system is available as is upon request from the first author.
