Results 1 - 18 of 18
1.
J Am Heart Assoc ; 7(20): e09841, 2018 Oct 16.
Article in English | MEDLINE | ID: mdl-30371257

ABSTRACT

Background Heart failure (HF) with "recovered" ejection fraction (HFrecEF) is an emerging phenotype, but no tools exist to predict ejection fraction (EF) recovery in acute HF. We hypothesized that indices of baseline cardiac structure and function predict HFrecEF in nonischemic cardiomyopathy and reduced EF. Methods and Results We identified a nonischemic cardiomyopathy cohort with EF <40% during the first HF hospitalization (n=166). We performed speckle-tracking echocardiography to measure longitudinal, circumferential, and radial strain, and the average of these measures (myocardial systolic performance). HFrecEF was defined as follow-up EF ≥40% and ≥10% improvement from baseline EF. Fifty-nine patients (36%) achieved HFrecEF (baseline EF 26±7%; follow-up EF 51±7%) within a median of 135 (interquartile range 58-239) days after the first HF hospitalization. Baseline demographics, biomarker profiles, and comorbid conditions (except a lower prevalence of chronic kidney disease in HFrecEF) were similar between the HFrecEF and persistent reduced-EF groups. HFrecEF patients had smaller baseline left ventricular end-systolic dimension (3.6 versus 4.8 cm; P<0.01), higher baseline myocardial systolic performance (9.2% versus 8.1%; P=0.02), and improved survival (adjusted hazard ratio 0.27, 95% confidence interval 0.11, 0.62). We found a significant interaction between baseline left ventricular end-systolic dimension and absolute longitudinal strain. Among patients with left ventricular end-systolic dimension >4.35 cm, higher absolute longitudinal strain (≥8%) was associated with HFrecEF (unadjusted odds ratio 3.9, 95% confidence interval 1.2, 12.8). Incorporation of baseline indices of cardiac mechanics with clinical variables resulted in a predictive model for HFrecEF with a c-statistic of 0.85. Conclusions Factors associated with achieving HFrecEF were specific to cardiac structure and indices of cardiac mechanics. Higher baseline absolute longitudinal strain is associated with HFrecEF among nonischemic cardiomyopathy patients with reduced EF and larger left ventricular dimensions.
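As a rough illustration only (not the study's actual model or data), the sketch below shows how a prediction model of this kind, combining baseline structural and strain indices and evaluated by the c-statistic, might be fit; the feature set, effect directions, and synthetic cohort are assumptions.

```python
# Hedged sketch: logistic model predicting EF recovery (HFrecEF) from baseline
# structure/mechanics, scored by the c-statistic (ROC AUC). Features, effect
# directions, and data are synthetic assumptions, not the study's cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 166  # cohort size reported in the abstract
lves_dim = rng.normal(4.3, 0.7, n)          # LV end-systolic dimension (cm), assumed
abs_long_strain = rng.normal(9.0, 2.0, n)   # absolute longitudinal strain (%), assumed
baseline_ef = rng.normal(26, 7, n)          # baseline EF (%), assumed
X = np.column_stack([lves_dim, abs_long_strain, baseline_ef])

# Synthetic outcome with the directions suggested by the abstract:
# smaller LV dimension and higher strain favor recovery.
logit = -0.8 * (lves_dim - 4.3) + 0.3 * (abs_long_strain - 9.0)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000)
# For a binary outcome, ROC AUC equals the c-statistic.
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated c-statistic: {auc:.2f}")
```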


Subject(s)
Cardiomyopathies/physiopathology , Heart Failure/physiopathology , Cardiomyopathies/therapy , Echocardiography , Female , Heart Failure/therapy , Hospitalization/statistics & numerical data , Humans , Kaplan-Meier Estimate , Longitudinal Studies , Male , Middle Aged , Treatment Outcome , Ventricular Dysfunction, Left/physiopathology
2.
J Cardiovasc Transl Res ; 10(3): 313-321, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28585184

ABSTRACT

Precision medicine requires clinical trials that are able to efficiently enroll subtypes of patients in whom targeted therapies can be tested. To reduce the large amount of time spent screening, identifying, and recruiting patients with specific subtypes of heterogeneous clinical syndromes (such as heart failure with preserved ejection fraction [HFpEF]), we need prescreening systems that are able to automate data extraction and decision-making tasks. However, a major obstacle is the vast amount of unstructured free-form text in medical records. Here we describe an information extraction-based approach that automatically converts unstructured text into structured data, which is cross-referenced against eligibility criteria using a rule-based system to determine which patients qualify for a major HFpEF clinical trial (PARAGON). We show that we can achieve a sensitivity and positive predictive value of 0.95 and 0.86, respectively. Our open-source algorithm could be used to efficiently identify and subphenotype patients with HFpEF and other disorders.
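As a hedged sketch of the final rule-based step (not the PARAGON criteria or the authors' system), the example below checks structured fields, of the kind the extraction step would produce, against placeholder eligibility rules.

```python
# Hedged sketch: rule-based eligibility check over structured fields that an
# information-extraction step might produce. All thresholds are placeholders,
# not the actual PARAGON trial criteria.
from dataclasses import dataclass

@dataclass
class PatientRecord:
    age: int
    lvef: float          # left ventricular ejection fraction, %
    nt_probnp: float     # NT-proBNP, pg/mL
    on_dialysis: bool

def eligible(p: PatientRecord) -> bool:
    """Return True only if every (hypothetical) criterion is met."""
    rules = [
        p.age >= 50,
        p.lvef >= 45.0,        # assumed "preserved EF" cutoff
        p.nt_probnp >= 200.0,  # assumed natriuretic peptide threshold
        not p.on_dialysis,     # example exclusion criterion
    ]
    return all(rules)

print(eligible(PatientRecord(age=67, lvef=55.0, nt_probnp=480.0, on_dialysis=False)))
```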


Subject(s)
Clinical Trials as Topic/methods , Data Mining/methods , Electronic Health Records , Eligibility Determination/methods , Heart Failure/physiopathology , Natural Language Processing , Patient Selection , Stroke Volume , Algorithms , Echocardiography , Heart Failure/classification , Heart Failure/diagnosis , Heart Failure/therapy , Humans , Phenotype , Predictive Value of Tests , Reproducibility of Results
3.
Drug Saf ; 40(11): 1075-1089, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28643174

ABSTRACT

The goal of pharmacovigilance is to detect, monitor, characterize and prevent adverse drug events (ADEs) with pharmaceutical products. This article is a comprehensive structured review of recent advances in applying natural language processing (NLP) to electronic health record (EHR) narratives for pharmacovigilance. We review methods of varying complexity and problem focus, summarize the current state-of-the-art in methodology advancement, discuss limitations and point out several promising future directions. The ability to accurately capture both semantic and syntactic structures in clinical narratives becomes increasingly critical to enable efficient and accurate ADE detection. Significant progress has been made in algorithm development and resource construction since 2000. Since 2012, statistical analysis and machine learning methods have gained traction in automation of ADE mining from EHR narratives. Current state-of-the-art methods for NLP-based ADE detection from EHRs show promise regarding their integration into production pharmacovigilance systems. In addition, integrating multifaceted, heterogeneous data sources has shown promise in improving ADE detection and has become increasingly adopted. On the other hand, challenges and opportunities remain across the frontier of NLP application to EHR-based pharmacovigilance, including proper characterization of ADE context, differentiation between off- and on-label drug-use ADEs, recognition of the importance of polypharmacy-induced ADEs, better integration of heterogeneous data sources, creation of shared corpora, and organization of shared-task challenges to advance the state-of-the-art.


Subject(s)
Adverse Drug Reaction Reporting Systems/standards , Drug-Related Side Effects and Adverse Reactions/diagnosis , Electronic Health Records/standards , Natural Language Processing , Pharmacovigilance , Humans
4.
AMIA Jt Summits Transl Sci Proc ; 2016: 203-12, 2016.
Article in English | MEDLINE | ID: mdl-27570671

ABSTRACT

Precision medicine is an emerging approach to disease prevention and treatment that considers individual variability in genes, environment, and lifestyle. Disseminating individualized evidence by automatically identifying population information in the literature is key to evidence-based precision medicine at the point of care. We propose a hybrid approach using natural language processing techniques to automatically extract population information from the biomedical literature. Our approach first applies a binary classifier to identify sentences with or without population information. A rule-based system built on syntactic-tree regular expressions is then applied to sentences containing population information to extract the population named entities. The proposed two-stage approach achieved an F-score of 0.81 using a MaxEnt classifier with the rule-based system and an F-score of 0.87 using a Naïve Bayes classifier with the rule-based system, performing well relative to many existing systems. The system and evaluation dataset are being released as open source.
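A minimal sketch of the first stage only (sentence-level classification), using a Naïve Bayes bag-of-words model; the training sentences are invented, and the second-stage syntactic-tree regular expressions are not shown.

```python
# Stage-1 sketch: classify sentences as containing population information or
# not with a bag-of-words Naive Bayes model. The tiny training set is invented
# for illustration; the rule-based stage 2 is not shown.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_sentences = [
    "We enrolled 120 adults with type 2 diabetes.",                    # population
    "Patients aged 18-65 with stage II hypertension were included.",   # population
    "The primary endpoint was change in HbA1c at 24 weeks.",           # no population
    "Statistical analysis used a mixed-effects model.",                # no population
]
labels = [1, 1, 0, 0]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(train_sentences, labels)

test = "A total of 85 children with persistent asthma participated."
print(clf.predict([test])[0])  # 1 -> pass the sentence to the rule-based extractor
```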

5.
PLoS One ; 11(4): e0153749, 2016.
Article in English | MEDLINE | ID: mdl-27124000

ABSTRACT

Large volumes of data are continuously generated from clinical notes and diagnostic studies catalogued in electronic health records (EHRs). Echocardiography is one of the most commonly ordered diagnostic tests in cardiology. This study sought to explore the feasibility and reliability of using natural language processing (NLP) for large-scale and targeted extraction of multiple data elements from echocardiography reports. An NLP tool, EchoInfer, was developed to automatically extract data pertaining to cardiovascular structure and function from heterogeneously formatted echocardiographic data sources. EchoInfer was applied to echocardiography reports (2004 to 2013) available from 3 different ongoing clinical research projects. EchoInfer analyzed 15,116 echocardiography reports from 1684 patients, and extracted 59 quantitative and 21 qualitative data elements per report. EchoInfer achieved a precision of 94.06%, a recall of 92.21%, and an F1-score of 93.12% across all 80 data elements in 50 reports. Physician review of 400 reports demonstrated that EchoInfer achieved a recall of 92-99.9% and a precision of >97% in four data elements, including three quantitative and one qualitative data element. Failure of EchoInfer to correctly identify or reject reported parameters was primarily related to non-standardized reporting of echocardiography data. EchoInfer provides a powerful and reliable NLP-based approach for the large-scale, targeted extraction of information from heterogeneous data sources. The use of EchoInfer may have implications for the clinical management and research analysis of patients undergoing echocardiographic evaluation.
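As a hedged illustration of this kind of targeted extraction (the pattern below is an assumption, not EchoInfer's actual rule set), a single quantitative element, LVEF, can be pulled from report text with a regular expression:

```python
# Minimal sketch: extract one echo parameter (LVEF) from free-text report
# content with a regular expression. The pattern is illustrative only and is
# not EchoInfer's actual rule set.
import re

report = """
Left ventricle: normal size. Estimated LVEF 55-60%.
Mild mitral regurgitation. No pericardial effusion.
"""

lvef_pattern = re.compile(
    r"(?:LVEF|ejection fraction)[^0-9]{0,20}(\d{1,2})(?:\s*-\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)

m = lvef_pattern.search(report)
if m:
    low = float(m.group(1))
    high = float(m.group(2)) if m.group(2) else low
    print(f"LVEF: {(low + high) / 2:.1f}%  (raw match: {m.group(0).strip()})")
```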


Subject(s)
Echocardiography/methods , Natural Language Processing , Aged , Electronic Health Records , Female , Humans , Information Storage and Retrieval , Male , Reproducibility of Results
6.
J Biomed Inform ; 60: 14-22, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26774763

ABSTRACT

Most patient care questions raised by clinicians can be answered by online clinical knowledge resources. However, important barriers still challenge the use of these resources at the point of care. OBJECTIVE: To design and assess a method for extracting clinically useful sentences from synthesized online clinical resources that represent the most clinically useful information for directly answering clinicians' information needs. MATERIALS AND METHODS: We developed a kernel-based Bayesian network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient populations identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure. RESULTS: The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept, or text-based features reduced the F-measure to 62%, 66%, 58%, and 69%, respectively (p<0.01). The classifier applied to MEDLINE sentences reached an F-measure of 73%, equivalent to its performance on UpToDate sentences (p=0.62). CONCLUSIONS: The feature-rich approach significantly outperformed general baseline methods and classifiers based on a single feature type. Different types of semantic features each provided a unique contribution to overall classification performance. The classifier's model and features, developed for UpToDate, generalized well to MEDLINE abstracts.
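A rough sketch of the feature-rich idea (not the paper's implementation): several feature families are concatenated and scored by F-measure. A plain logistic regression stands in for the kernel-based Bayesian network, and the features and data are invented placeholders for the UMLS, SemRep, population, and cue-word features described.

```python
# Sketch: concatenate multiple feature families and evaluate by F-measure.
# Logistic regression stands in for the paper's kernel-based Bayesian network;
# features and labels are random placeholders, not the gold-standard data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 200
cue_word_feats = rng.integers(0, 2, (n, 5))     # e.g. presence of "recommend", "should"
concept_feats = rng.integers(0, 4, (n, 10))     # e.g. counts of UMLS semantic groups
X = np.hstack([cue_word_feats, concept_feats])  # the "feature-rich" representation
y = rng.integers(0, 2, n)                       # 1 = clinically useful sentence

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"F-measure: {f1_score(y_te, clf.predict(X_te)):.2f}")
```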


Subject(s)
Decision Support Systems, Clinical , Information Storage and Retrieval/methods , Supervised Machine Learning , Algorithms , Bayes Theorem , Humans , Language , MEDLINE , Natural Language Processing , Semantics , Terminology as Topic , Unified Medical Language System
7.
J Med Internet Res ; 18(1): e11, 2016 Jan 13.
Article in English | MEDLINE | ID: mdl-26764193

ABSTRACT

BACKGROUND: An increasing number of people visit online health communities to seek health information. In these communities, people share experiences and information with others, often complemented with links to different websites. Understanding how people share websites can help us understand patients' needs in online health communities and improve how peer patients share health information online. OBJECTIVE: Our goal was to understand (1) what kinds of websites are shared, (2) information quality of the shared websites, (3) who shares websites, (4) community differences in website-sharing behavior, and (5) the contexts in which patients share websites. We aimed to find practical applications and implications of website-sharing practices in online health communities. METHODS: We used regular expressions to extract URLs from 10 WebMD online health communities. We then categorized the URLs based on their top-level domains. We counted the number of trust codes (eg, accredited agencies' formal evaluation and PubMed authors' institutions) for each website to assess information quality. We used descriptive statistics to determine website-sharing activities. To understand the context of the URL being discussed, we conducted a simple random selection of 5 threads that contained at least one post with URLs from each community. Gathering all other posts in these threads resulted in 387 posts for open coding analysis with the goal of understanding motivations and situations in which website sharing occurred. RESULTS: We extracted a total of 25,448 websites. The majority of the shared websites were .com (59.16%, 15,056/25,448) and WebMD internal (23.2%, 5905/25,448) websites; the least shared websites were social media websites (0.15%, 39/25,448). High-posting community members and moderators posted more websites with trust codes than low-posting community members did. The heart disease community had the highest percentage of websites containing trust codes compared to other communities. Members used websites to disseminate information, supportive evidence, resources for social support, and other ways to communicate. CONCLUSIONS: Online health communities can be used as important health care information resources for patients and caregivers. Our findings inform patients' health information-sharing activities. This information assists health care providers, informaticians, and online health information entrepreneurs and developers in helping patients and caregivers make informed choices.
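A minimal sketch of the URL-extraction and top-level-domain categorization step described in the methods; the regex and example post are illustrative, not the study's exact code.

```python
# Sketch: extract URLs from a post with a regular expression and tally them by
# top-level domain. The pattern and example post are illustrative only.
import re
from collections import Counter
from urllib.parse import urlparse

post = ("I found this helpful: https://www.mayoclinic.org/heart-failure "
        "and also http://exchanges.webmd.com/heart-disease plus "
        "https://www.ncbi.nlm.nih.gov/pubmed/12345678")

url_pattern = re.compile(r"https?://[^\s<>\"]+")
urls = url_pattern.findall(post)

def top_level_domain(url: str) -> str:
    host = urlparse(url).netloc
    return host.rsplit(".", 1)[-1]  # e.g. 'org', 'com', 'gov'

print(Counter(top_level_domain(u) for u in urls))
```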


Subject(s)
Consumer Health Information , Internet , Social Support , Health Personnel , Humans , Internet/standards
8.
Int J Med Inform ; 86: 126-34, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26612774

ABSTRACT

OBJECTIVE: To iteratively design a prototype of a computerized clinical knowledge summarization (CKS) tool aimed at helping clinicians find answers to their clinical questions, and to conduct a formative assessment of the usability, usefulness, efficiency, and impact of the CKS prototype on physicians' perceived decision quality compared with standard search of UpToDate and PubMed. MATERIALS AND METHODS: Mixed-methods observations of the interactions of 10 physicians with the CKS prototype vs. standard search in an effort to solve clinical problems posed as case vignettes. RESULTS: The CKS tool automatically summarizes patient-specific and actionable clinical recommendations from PubMed (high-quality randomized controlled trials and systematic reviews) and UpToDate. Two-thirds of the study participants completed 15 of the 17 usability tasks. The median time to task completion was less than 10 seconds for 12 of the 17 tasks. The difference in search time between the CKS and standard search was not significant (median 4.9 vs. 4.5 min). Physicians' perceived decision quality was significantly higher with the CKS than with standard search (mean 16.6 vs. 14.4; p=0.036). CONCLUSIONS: The CKS prototype was well accepted by physicians in terms of both usability and usefulness. Physicians perceived better decision quality with the CKS prototype than with standard search of PubMed and UpToDate within a similar search time. Because of the formative nature of this study and the small sample size, conclusions regarding efficiency and efficacy are exploratory.


Subject(s)
Decision Support Systems, Clinical/statistics & numerical data , Knowledge Management/standards , Medical Record Linkage , Patient-Specific Modeling , Humans , Pattern Recognition, Automated , Problem Solving , Systems Integration
9.
J Biomed Inform ; 58 Suppl: S120-S127, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26209007

ABSTRACT

This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors from patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics, and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-score of 91.7%, one percentage point behind the top-performing system.
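As a much-simplified, hedged sketch of a rule-based pass for one document-level risk factor (hypertension), with a crude negation check; the keyword lists are assumptions and are far simpler than the I2E query strategies described.

```python
# Crude rule-based detector for one document-level risk factor (hypertension)
# with a naive negation check. Keyword lists are assumptions, much simpler
# than the I2E extraction strategies the paper describes.
import re

HYPERTENSION_TERMS = [r"\bhypertension\b", r"\bhtn\b", r"\bhigh blood pressure\b"]
NEGATION_CUES = [r"\bno\b", r"\bdenies\b", r"\bwithout\b", r"\bnegative for\b"]

def has_hypertension(note: str) -> bool:
    for sentence in re.split(r"(?<=[.!?])\s+", note.lower()):
        if any(re.search(t, sentence) for t in HYPERTENSION_TERMS):
            if not any(re.search(c, sentence) for c in NEGATION_CUES):
                return True
    return False

note = "Past medical history: HTN, hyperlipidemia. Denies chest pain."
print(has_hypertension(note))  # True: 'HTN' appears without a negation cue
```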


Subject(s)
Cardiovascular Diseases/epidemiology , Data Mining/methods , Diabetes Complications/epidemiology , Electronic Health Records/organization & administration , Narration , Natural Language Processing , Aged , Cardiovascular Diseases/diagnosis , Cohort Studies , Comorbidity , Computer Security , Confidentiality , Diabetes Complications/diagnosis , Female , Humans , Incidence , Longitudinal Studies , Male , Middle Aged , Pattern Recognition, Automated/methods , Risk Assessment/methods , United Kingdom/epidemiology , Vocabulary, Controlled
10.
Syst Rev ; 4: 78, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26073888

ABSTRACT

BACKGROUND: Automation of parts of the systematic review process, specifically the data extraction step, may be an important strategy for reducing the time necessary to complete a systematic review. However, the state of the science of automatically extracting data elements from full texts has not been well described. This paper reports a systematic review of published and unpublished methods to automate data extraction for systematic reviews. METHODS: We systematically searched PubMed, IEEEXplore, and the ACM Digital Library to identify potentially relevant articles. We included reports that met the following criteria: 1) the methods or results section described which entities were or needed to be extracted, and 2) at least one entity was automatically extracted, with evaluation results presented for that entity. We also reviewed the citations from included reports. RESULTS: Out of a total of 1190 unique citations that met our search criteria, we found 26 published reports describing automatic extraction of at least one of more than 52 potential data elements used in systematic reviews. For 25 (48%) of the data elements used in systematic reviews, there were attempts from various researchers to extract information automatically from the publication text. Of these, 14 (27%) data elements were completely extracted, but the highest number of data elements extracted automatically by a single study was 7. Most of the data elements were extracted with F-scores (the harmonic mean of sensitivity and positive predictive value) of over 70%. CONCLUSIONS: We found no unified information extraction framework tailored to the systematic review process, and published reports focused on a limited number (1-7) of data elements. Biomedical natural language processing techniques have not been fully utilized to automate, even partially, the data extraction step of systematic reviews.
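For reference, the F-score reported by the reviewed studies is computed as follows; the numbers below are arbitrary, not taken from any included report.

```python
# F-score as the harmonic mean of precision (positive predictive value) and
# recall (sensitivity). Example numbers are arbitrary.
def f_score(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(f"{f_score(precision=0.86, recall=0.95):.3f}")  # ~0.903
```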


Subject(s)
Data Mining/methods , Publishing , Review Literature as Topic , Humans , Information Storage and Retrieval , Research Report
11.
Methods Mol Biol ; 1159: 147-57, 2014.
Article in English | MEDLINE | ID: mdl-24788266

ABSTRACT

The combination of scientific knowledge and experience is key to success in biomedical research. This chapter demonstrates some of the strategies used to identify key opinion leaders with the expertise you need, thereby helping to increase collaborative biomedical research.


Subject(s)
Biomedical Research , Expert Testimony , Natural Language Processing , Social Support
12.
AMIA Annu Symp Proc ; 2014: 757-66, 2014.
Article in English | MEDLINE | ID: mdl-25954382

ABSTRACT

Point-of-care access to knowledge from full-text journal articles supports decision-making and decreases medical errors. However, searching through full-text journal articles to find the quality information clinicians need is an overwhelming task. We developed a method to rate journals for a given clinical topic, congestive heart failure (CHF). Our method enables filtering of journals and ranking of journal articles based on their source journal's relevance to CHF. We also derived a journal priority score, which automatically rates any journal by its importance to CHF. Comparing our ranking with data gathered by surveying 169 cardiologists who publish on CHF, our best multiple linear regression model showed a correlation of 0.880 under five-fold cross-validation. Our ranking system can be extended to other clinical topics.
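A hedged sketch of the ranking idea (not the study's actual features or data): fit a multiple linear regression predicting survey-based journal ratings from bibliometric features and score it by cross-validated correlation.

```python
# Sketch: rate journals for a topic with multiple linear regression and
# five-fold cross-validation, scored by correlation with survey ratings.
# Features and data are invented placeholders, not the study's variables.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n_journals = 60
X = np.column_stack([
    rng.poisson(40, n_journals),        # assumed: topic-specific article count
    rng.normal(3.0, 1.5, n_journals),   # assumed: journal impact factor
    rng.normal(0.2, 0.1, n_journals),   # assumed: fraction of CHF articles
])
# Synthetic survey ratings with a linear dependence on the features plus noise.
survey_rating = X @ np.array([0.05, 0.8, 4.0]) + rng.normal(0, 0.5, n_journals)

pred = cross_val_predict(LinearRegression(), X, survey_rating, cv=5)
r, _ = pearsonr(survey_rating, pred)
print(f"cross-validated correlation: {r:.3f}")
```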


Subject(s)
Bibliometrics , Decision Making , Heart Failure , Periodicals as Topic/classification , Cardiology , Humans , Journal Impact Factor , Linear Models
13.
Article in English | MEDLINE | ID: mdl-25954582

ABSTRACT

The goal of this paper is to find relevant citations for clinician-authored content, making that content more reliable by adding scientific articles as references and enabling clinicians to update it easily as new information emerges. The proposed approach uses information retrieval and ranking techniques to extract and rank relevant citations from MEDLINE for any given sentence. Additionally, the system extracts snippets of relevant content from the ranked citations. We assessed our approach on 4,697 MEDLINE papers, and their corresponding full text, on the subject of heart failure. We implemented multi-level and weighted ranking algorithms to rank the citations. We demonstrate that using journal relevance and study design type improves results obtained using content similarity alone by approximately 40%. We also show that using full text, rather than abstracts, yields higher-quality snippets.
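A hedged sketch of the weighted ranking idea: content similarity is combined with journal-relevance and study-design weights into a single score. The weights, fields, and example candidates are placeholders, not the paper's algorithm.

```python
# Sketch: weighted multi-factor ranking of candidate citations. Content
# similarity is combined with journal-relevance and study-design weights.
# All weights and scores are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class Candidate:
    pmid: str
    similarity: float         # content similarity to the input sentence, 0-1
    journal_relevance: float  # topic-specific journal score, 0-1
    design_weight: float      # e.g. RCT/systematic review > case report

def rank(candidates, w_sim=0.6, w_journal=0.25, w_design=0.15):
    def score(c):
        return (w_sim * c.similarity
                + w_journal * c.journal_relevance
                + w_design * c.design_weight)
    return sorted(candidates, key=score, reverse=True)

cands = [
    Candidate("11111111", similarity=0.72, journal_relevance=0.9, design_weight=1.0),
    Candidate("22222222", similarity=0.80, journal_relevance=0.3, design_weight=0.4),
]
print([c.pmid for c in rank(cands)])  # journal and design weights outrank raw similarity
```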

14.
AMIA Jt Summits Transl Sci Proc ; 2013: 149-53, 2013.
Article in English | MEDLINE | ID: mdl-24303255

ABSTRACT

Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at the point of care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high-performance IE system can be very challenging to construct because of the complexity and dynamic nature of human language. In this paper, we report a knowledge-driven IE framework for cohort identification from EHRs, developed under the Unstructured Information Management Architecture (UIMA). Subject matter experts can build a system to extract specific information by engineering the externalized knowledge resources used in the framework.

15.
Ann Allergy Asthma Immunol ; 111(5): 364-9, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24125142

ABSTRACT

BACKGROUND: A significant proportion of children with asthma have delayed diagnosis of asthma by health care providers. Manual chart review according to established criteria is more accurate than directly using diagnosis codes, which tend to under-identify asthmatics, but chart reviews are more costly and less timely. OBJECTIVE: To evaluate the accuracy of a computational approach to asthma ascertainment, characterizing its utility and feasibility toward large-scale deployment in electronic medical records. METHODS: A natural language processing (NLP) system was developed for extracting predetermined criteria for asthma from unstructured text in electronic medical records and then inferring asthma status based on these criteria. Using manual chart reviews as a gold standard, asthma status (yes vs no) and identification date (first date of a "yes" asthma status) were determined by the NLP system. RESULTS: Patients were a group of children (n = 112, 84% Caucasian, 49% girls) younger than 4 years (mean 2.0 years, standard deviation 1.03 years) who participated in previous studies. The NLP approach to asthma ascertainment showed sensitivity, specificity, positive predictive value, negative predictive value, and median delay in diagnosis of 84.6%, 96.5%, 88.0%, 95.4%, and 0 months, respectively; this compared favorably with diagnosis codes, at 30.8%, 93.2%, 57.1%, 82.2%, and 2.3 months, respectively. CONCLUSION: Automated asthma ascertainment from electronic medical records using NLP is feasible and more accurate than traditional approaches such as diagnosis codes. Considering the difficulty of labor-intensive manual record review, NLP approaches for asthma ascertainment should be considered for improving clinical care and research, especially in large-scale efforts.
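For reference, the reported operating characteristics follow directly from a 2x2 confusion matrix; the counts below are arbitrary, not the study's data.

```python
# Sensitivity, specificity, PPV, and NPV from a 2x2 confusion matrix.
# The example counts are arbitrary, not taken from the study.
def diagnostics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

print(diagnostics(tp=22, fp=3, fn=4, tn=83))
```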


Subject(s)
Asthma/diagnosis , Electronic Data Processing , Medical Records Systems, Computerized , Natural Language Processing , Child, Preschool , Cohort Studies , Female , Humans , Male
16.
Biomed Inform Insights ; 6(Suppl 1): 7-16, 2013.
Article in English | MEDLINE | ID: mdl-23847423

ABSTRACT

A large amount of medication information resides in the unstructured text found in electronic medical records, which requires advanced techniques to be properly mined. In clinical notes, medication information follows certain semantic patterns (eg, medication, dosage, frequency, and mode). Some medication descriptions contain additional word(s) between medication attributes. Therefore, it is essential to understand the semantic patterns as well as the patterns of the context interspersed among them (ie, context patterns) to effectively extract comprehensive medication information. In this paper we examined both semantic and context patterns, and compared those found in Mayo Clinic and i2b2 challenge data. We found that some variations exist between the institutions but the dominant patterns are common.
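A toy regular expression for the common "medication dosage mode frequency" sequence; the drug names, units, and pattern are illustrative assumptions, not the semantic or context patterns catalogued in the paper.

```python
# Toy pattern for the "medication dosage mode frequency" sequence often seen
# in clinical notes. The regex and example are illustrative only, not the
# semantic/context patterns catalogued in the paper.
import re

sig_pattern = re.compile(
    r"(?P<drug>\b[A-Za-z]+\b)\s+"
    r"(?P<dose>\d+(\.\d+)?\s*(mg|mcg|g|units))\s+"
    r"(?P<mode>po|iv|im|sc)\s+"
    r"(?P<freq>daily|bid|tid|qid|q\d+h)",
    re.IGNORECASE,
)

text = "Continue lisinopril 10 mg po daily and metformin 500 mg po bid."
for m in sig_pattern.finditer(text):
    print(m.group("drug"), "|", m.group("dose"), "|", m.group("mode"), "|", m.group("freq"))
```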

17.
J Am Med Inform Assoc ; 20(5): 836-42, 2013.
Article in English | MEDLINE | ID: mdl-23558168

ABSTRACT

BACKGROUND: Temporal information detection systems have been developed by the Mayo Clinic for the 2012 i2b2 Natural Language Processing Challenge. OBJECTIVE: To construct automated systems for EVENT/TIMEX3 extraction and temporal link (TLINK) identification from clinical text. MATERIALS AND METHODS: The i2b2 organizers provided 190 annotated discharge summaries as the training set and 120 discharge summaries as the test set. Our Event system used a conditional random field classifier with a variety of features including lexical information, natural language elements, and medical ontology. The TIMEX3 system employed a rule-based method using regular expression pattern match and systematic reasoning to determine normalized values. The TLINK system employed both rule-based reasoning and machine learning. All three systems were built in an Apache Unstructured Information Management Architecture framework. RESULTS: Our TIMEX3 system performed the best (F-measure of 0.900, value accuracy 0.731) among the challenge teams. The Event system produced an F-measure of 0.870, and the TLINK system an F-measure of 0.537. CONCLUSIONS: Our TIMEX3 system demonstrated good capability of regular expression rules to extract and normalize time information. Event and TLINK machine learning systems required well-defined feature sets to perform well. We could also leverage expert knowledge as part of the machine learning features to further improve TLINK identification performance.
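A minimal sketch of rule-based, TIMEX3-style date normalization with regular expressions; the two rules below cover only a couple of surface formats and are not the Mayo system's actual rule set.

```python
# Minimal sketch: map two surface date formats to ISO 8601 values with regular
# expressions, in the spirit of rule-based TIMEX3 normalization. These rules
# are illustrative, not the Mayo system's actual rules.
import re

MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

RULES = [
    # 03/14/2012 -> 2012-03-14
    (re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b"),
     lambda m: f"{m.group(3)}-{int(m.group(1)):02d}-{int(m.group(2)):02d}"),
    # March 20, 2012 -> 2012-03-20
    (re.compile(r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b"),
     lambda m: f"{m.group(3)}-{MONTHS[m.group(1)]:02d}-{int(m.group(2)):02d}"),
]

def normalize(text: str):
    """Yield (surface form, normalized ISO value) pairs found in the text."""
    for pattern, to_iso in RULES:
        for match in pattern.finditer(text):
            yield match.group(0), to_iso(match)

print(list(normalize("Admitted on 03/14/2012, discharged March 20, 2012.")))
```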


Subject(s)
Artificial Intelligence , Electronic Health Records , Information Storage and Retrieval/methods , Natural Language Processing , Patient Discharge Summaries , Humans , Time
18.
J Biomed Semantics ; 4(1): 3, 2013 Jan 08.
Article in English | MEDLINE | ID: mdl-23294871

ABSTRACT

BACKGROUND: The availability of annotated corpora has facilitated the application of machine learning algorithms to concept extraction from clinical notes. However, creating the annotations requires considerable expenditure and labor. A potential alternative is to reuse existing corpora from other institutions by pooling them with local corpora to train machine taggers. In this paper we investigated this approach by pooling corpora from the 2010 i2b2/VA NLP challenge and Mayo Clinic Rochester to evaluate taggers for recognition of medical problems. The corpora were annotated for medical problems, but under different guidelines. The taggers were constructed using an existing tagging system, MedTagger, which consists of dictionary lookup, part-of-speech (POS) tagging, and machine learning for named entity prediction and concept extraction. We hope that this work will serve as a useful case study for facilitating reuse of annotated corpora across institutions. RESULTS: We found that pooling was effective when the local corpus was small and after some of the guideline differences were reconciled. The benefits of pooling, however, diminished as more locally annotated documents were included in the training data. We examined the annotation guidelines to identify factors that determine the effect of pooling. CONCLUSIONS: The effectiveness of pooling corpora depends on several factors, including compatibility of annotation guidelines, distribution of report types, and the sizes of the local and foreign corpora. Simple methods to rectify some of the guideline differences can facilitate pooling. Our findings need to be confirmed with further studies on different corpora. To facilitate the pooling and reuse of annotated corpora, we suggest that: (i) the NLP community develop a standard annotation guideline that addresses the potential areas of guideline difference partly identified in this paper; (ii) corpora be annotated with a two-pass method that focuses first on concept recognition, followed by normalization to existing ontologies; and (iii) metadata such as report type be created during the annotation process.
