Results 1 - 8 of 8
1.
Clin J Am Soc Nephrol ; 18(3): 400-401, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36763809
2.
Adv Chronic Kidney Dis ; 29(5): 465-471, 2022 09.
Article in English | MEDLINE | ID: mdl-36253030

ABSTRACT

Unstructured data in the electronic health records contain essential patient information. Natural language processing (NLP), teaching a computer to read, allows us to tap into these data without needing the time and effort of manual chart abstraction. The core first step for all NLP algorithms is preprocessing the text to identify the core words that differentiate the text while filtering out the noise. Traditional NLP uses a rule-based approach, applying grammatical rules to infer meaning from the text. Newer NLP approaches use machine learning/deep learning which can infer meaning without explicitly being programmed. NLP use in nephrology research has focused on identifying distinct disease processes, such as CKD, and extraction of patient-oriented outcomes such as symptoms with high sensitivity. NLP can identify patient features from clinical text associated with acute kidney injury and progression of CKD. Lastly, inclusion of features extracted using NLP improved the performance of risk-prediction models compared to models that only use structured data. Implementation of NLP algorithms has been slow, partially hindered by the lack of external validation of NLP algorithms. However, NLP allows for extraction of key patient characteristics from free text, an infrequently used resource in nephrology.


Subject(s)
Nephrology , Renal Insufficiency, Chronic , Algorithms , Electronic Health Records , Humans , Natural Language Processing , Renal Insufficiency, Chronic/therapy
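
Note: the preprocessing step described in the abstract above (keeping the words that differentiate a text and filtering out noise) could be sketched roughly as follows. This is a minimal illustration only, not the method of any study listed here; the stop-word list, the function name `preprocess`, and the sample note are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration;
# real pipelines typically use larger, curated lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "with", "is", "was", "in", "to", "for", "on"}


def preprocess(note: str) -> list[str]:
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Made-up sentence, not taken from any patient record.
example = preprocess("The patient was admitted with stage 3 CKD and fatigue.")
print(example)  # ['patient', 'admitted', 'stage', '3', 'ckd', 'fatigue']
```

A rule-based or machine-learning model would then operate on the remaining tokens rather than on the raw note.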
3.
Am J Hum Genet ; 108(12): 2301-2318, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34762822

ABSTRACT

Identifying whether a given genetic mutation results in a gene product with increased (gain-of-function; GOF) or diminished (loss-of-function; LOF) activity is an important step toward understanding disease mechanisms because they may result in markedly different clinical phenotypes. Here, we generated an extensive database of documented germline GOF and LOF pathogenic variants by employing natural language processing (NLP) on the available abstracts in the Human Gene Mutation Database. We then investigated various gene- and protein-level features of GOF and LOF variants and applied machine learning and statistical analyses to identify discriminative features. We found that GOF variants were enriched in essential genes, for autosomal-dominant inheritance, and in protein binding and interaction domains, whereas LOF variants were enriched in singleton genes, for protein-truncating variants, and in protein core regions. We developed a user-friendly web-based interface that enables the extraction of selected subsets from the GOF/LOF database by a broad set of annotated features and downloading of up-to-date versions. These results improve our understanding of how variants affect gene/protein function and may ultimately guide future treatment options.


Subject(s)
Databases, Genetic , Gain of Function Mutation , Loss of Function Mutation , Proteins/genetics , Cloud Computing , Genetic Predisposition to Disease , Genome, Human , Germ-Line Mutation , Humans , Internet-Based Intervention , Machine Learning
4.
Blood Purif ; 50(4-5): 621-627, 2021.
Article in English | MEDLINE | ID: mdl-33631752

ABSTRACT

BACKGROUND/AIMS: Acute kidney injury (AKI) in critically ill patients is common, and continuous renal replacement therapy (CRRT) is a preferred mode of renal replacement therapy (RRT) in hemodynamically unstable patients. Prediction of clinical outcomes in patients on CRRT is challenging. We utilized several approaches to predict RRT-free survival (RRTFS) in critically ill patients with AKI requiring CRRT. METHODS: We used the Medical Information Mart for Intensive Care (MIMIC-III) database to identify patients ≥18 years old with AKI on CRRT, after excluding patients who had ESRD on chronic dialysis or a kidney transplant. We defined RRTFS as patients who were discharged alive and did not require RRT ≥7 days prior to hospital discharge. We utilized all available biomedical data up to CRRT initiation. We evaluated 7 approaches, including logistic regression (LR), random forest (RF), support vector machine (SVM), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), multilayer perceptron (MLP), and MLP with long short-term memory (MLP + LSTM). We evaluated model performance by using area under the receiver operating characteristic (AUROC) curves. RESULTS: Out of 684 patients with AKI on CRRT, 205 (30%) patients had RRTFS. The median age of patients was 63 years and their median Simplified Acute Physiology Score (SAPS) II was 67 (interquartile range 52-84). The MLP + LSTM showed the highest AUROC (95% CI) of 0.70 (0.67-0.73), followed by MLP 0.59 (0.54-0.64), LR 0.57 (0.52-0.62), SVM 0.51 (0.46-0.56), AdaBoost 0.51 (0.46-0.55), RF 0.44 (0.39-0.48), and XGBoost 0.43 (0.38-0.47). CONCLUSIONS: An MLP + LSTM model outperformed other approaches for predicting RRTFS. Performance could be further improved by incorporating other data types.


Subject(s)
Acute Kidney Injury/therapy , Renal Replacement Therapy , Acute Kidney Injury/diagnosis , Age Factors , Aged , Critical Care , Female , Humans , Logistic Models , Machine Learning , Male , Middle Aged , Prognosis
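
Note: the AUROC metric used in the abstract above to compare models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on made-up toy data; it is an illustration under that assumption, not the study's evaluation code, and the function name `auroc` and the sample labels and scores are hypothetical.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data: 1 = outcome occurred, 0 = it did not; scores are model outputs.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(auroc(toy_labels, toy_scores))  # 0.75
```

On this reading, a value of 0.5 corresponds to chance-level ranking, and values below 0.5 mean the model tends to rank negative cases above positive ones.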
5.
Int J Med Inform ; 129: 334-341, 2019 09.
Article in English | MEDLINE | ID: mdl-31445275

ABSTRACT

OBJECTIVE: Electronic health record (EHR) systems contain structured data (such as diagnostic codes) and unstructured data (clinical documentation). Clinical insights can be derived from analyzing both. The use of natural language processing (NLP) algorithms to effectively analyze unstructured data has been well demonstrated. Here we examine the utility of NLP for the identification of patients with non-alcoholic fatty liver disease, assess patterns of disease progression, and identify gaps in care related to breakdown in communication among providers. MATERIALS AND METHODS: All clinical notes available on the 38,575 patients enrolled in the Mount Sinai BioMe cohort were loaded into the NLP system. We compared analysis of structured and unstructured EHR data using NLP, free-text search, and diagnostic codes with validation against expert adjudication. We then used the NLP findings to measure physician impression of progression from early-stage NAFLD to NASH or cirrhosis. Similarly, we used the same NLP findings to identify mentions of NAFLD in radiology reports that did not persist into clinical notes. RESULTS: Out of 38,575 patients, we identified 2,281 patients with NAFLD. From the remainder, 10,653 patients with similar data density were selected as a control group. NLP outperformed ICD and text search in both sensitivity (NLP: 0.93, ICD: 0.28, text search: 0.81) and F2 score (NLP: 0.92, ICD: 0.34, text search: 0.81). Of 2281 NAFLD patients, 673 (29.5%) were believed to have progressed to NASH or cirrhosis. Among 176 where NAFLD was noted prior to NASH, the average progression time was 410 days. 619 (27.1%) NAFLD patients had it documented only in radiology notes and not acknowledged in other forms of clinical documentation. Of these, 170 (28.4%) were later identified as having likely developed NASH or cirrhosis after a median 1057.3 days. 
DISCUSSION: NLP-based approaches were more accurate at identifying NAFLD within the EHR than ICD/text search-based approaches. Suspected NAFLD on imaging is often not acknowledged in subsequent clinical documentation. Many such patients are later found to have more advanced liver disease. Analysis of information flows demonstrated loss of key information that could have been used to help prevent the progression of early NAFLD (NAFL) to NASH or cirrhosis. CONCLUSION: For identification of NAFLD, NLP performed better than alternative selection modalities. It then facilitated analysis of knowledge flow between physicians and enabled the identification of breakdowns where key information was lost that could have slowed or prevented later disease progression.


Subject(s)
Electronic Health Records , Natural Language Processing , Non-alcoholic Fatty Liver Disease/diagnosis , Algorithms , Cohort Studies , Disease Progression , Female , Humans , Male , Middle Aged
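
Note: the abstract above reports sensitivity and F2 score. The F2 score is the F-beta measure with beta = 2, which weights recall (sensitivity) more heavily than precision. A minimal sketch from confusion-matrix counts follows; the counts are invented for illustration and are not the study's data, and the function name `f_beta` is hypothetical.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from true positives, false positives, and false negatives.

    Equivalent to (1 + beta^2) * P * R / (beta^2 * P + R),
    where P is precision and R is recall (sensitivity).
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    if denom == 0:
        raise ValueError("F-beta is undefined when tp, fp, and fn are all zero")
    return (1 + b2) * tp / denom


# Invented counts: 8 true positives, 2 false positives, 2 false negatives.
toy_f2 = f_beta(tp=8, fp=2, fn=2)
```

Because beta squared multiplies the false-negative count, a missed case lowers F2 four times as much as a false alarm, which suits screening tasks where sensitivity matters most.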
6.
AMIA Annu Symp Proc ; 2010: 817-21, 2010 Nov 13.
Article in English | MEDLINE | ID: mdl-21347092

ABSTRACT

Physicians have access to patient notes in volumes far greater than what is practical to read within the context of a standard clinical scenario. As a preliminary step toward being able to provide a longitudinal summary of patient history, methods are examined for the automated extraction of relevant patient problems from existing clinical notes. We explore a grounded approach to identifying important patient problems from patient history. Methods build on existing NLP and text-summarization methodologies and leverage features observed in a relevant corpus.


Subject(s)
Electronic Health Records , Physicians , Humans , Natural Language Processing
7.
AMIA Annu Symp Proc ; : 753-7, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999284

ABSTRACT

In the interest of designing an automated high-level, longitudinal clinical summary of a patient record, we analyze traditional ways in which medical problems pertaining to the patient are summarized in the electronic health record. The patient problem list has become a commonly used proxy for a summary of patient history, and automated methods have been proposed to generate it. However, little research has been conducted on how to structure the problem list in a manner most effective for supporting clinical care. This study analyzes the structure and content of the Past Medical History (PMH) sections of a large corpus of clinical notes, as a proxy for problem lists. Findings show that when listing patient history, physicians convey several semantic types of information, not only problems. Furthermore, they often group related concepts in a single line of the PMH. In contrast, traditional problem lists allow only a simple enumeration of coded terms. Content analysis goes on to reiterate the value of more complex representations as well as provide valuable data and guidelines for automated generation of a clinical summary.


Subject(s)
Information Storage and Retrieval/methods , Medical History Taking/methods , Medical Records Systems, Computerized/statistics & numerical data , Medical Records, Problem-Oriented/statistics & numerical data , Natural Language Processing , Pattern Recognition, Automated/methods , Algorithms , Artificial Intelligence , Clinical Protocols , New York , Subject Headings
8.
AMIA Annu Symp Proc ; : 761-5, 2007 Oct 11.
Article in English | MEDLINE | ID: mdl-18693939

ABSTRACT

Clinicians perform many tasks in their daily work requiring summarization of clinical data. However, as technology makes more data available, the challenges of data overload become ever more significant. As interoperable data exchange between hospitals becomes more common, there is an increased need for tools to summarize information. Our goal is to develop automated tools to aid clinical data summarization. Structured interviews were conducted with physicians to identify information from an electronic health record they considered relevant to explaining the patient's medical history. Desirable data types were systematically evaluated using qualitative and quantitative analysis to assess data categories and patterns of data use. We report here on the implications of these results for the design of automated tools for summarization of patient history.


Subject(s)
Attitude of Health Personnel , Medical Records Systems, Computerized , Documentation/standards , Electronic Data Processing , Humans , Interviews as Topic , Medical Record Linkage , Medical Records Systems, Computerized/standards