Results 1 - 20 of 39
1.
J Med Internet Res ; 26: e60501, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39255030

ABSTRACT

BACKGROUND: Prompt engineering, focusing on crafting effective prompts for large language models (LLMs), has garnered attention for its ability to harness the potential of LLMs. This is even more crucial in the medical domain due to its specialized terminology and technical language. Clinical natural language processing applications must navigate complex language and ensure privacy compliance. Prompt engineering offers a novel approach by designing tailored prompts to guide models in exploiting clinically relevant information from complex medical texts. Despite its promise, the efficacy of prompt engineering in the medical domain remains to be fully explored. OBJECTIVE: The aim of the study is to review research efforts and technical approaches in prompt engineering for medical applications as well as provide an overview of opportunities and challenges for clinical practice. METHODS: Databases indexing the fields of medicine, computer science, and medical informatics were queried in order to identify relevant published papers. Since prompt engineering is an emerging field, preprint databases were also considered. Multiple data items were extracted, such as the prompt paradigm, the involved LLMs, the languages of the study, the domain of the topic, the baselines, and several learning, design, and architecture strategies specific to prompt engineering. We included studies that apply prompt engineering-based methods to the medical domain, published between 2022 and 2024, and covering multiple prompt paradigms such as prompt learning (PL), prompt tuning (PT), and prompt design (PD). RESULTS: We included 114 recent prompt engineering studies. Among the 3 prompt paradigms, we have observed that PD is the most prevalent (78 papers). In 12 papers, PD, PL, and PT terms were used interchangeably. While ChatGPT is the most commonly used LLM, we have identified 7 studies using this LLM on a sensitive clinical data set.
Chain-of-thought, present in 17 studies, emerges as the most frequent PD technique. While PL and PT papers typically provide a baseline for evaluating prompt-based approaches, 61% (48/78) of the PD studies do not report any nonprompt-related baseline. Finally, we individually examine each of the key prompt engineering-specific information reported across papers and find that many studies neglect to explicitly mention them, posing a challenge for advancing prompt engineering research. CONCLUSIONS: In addition to reporting on trends and the scientific landscape of prompt engineering, we provide reporting guidelines for future studies to help advance research in the medical field. We also disclose tables and figures summarizing medical prompt engineering papers available and hope that future contributions will leverage these existing works to better advance the field.


Subject(s)
Natural Language Processing , Humans , Medical Informatics/methods
2.
Stud Health Technol Inform ; 316: 1098-1102, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176573

ABSTRACT

White blood cell classification plays a key role in the diagnosis of hematologic diseases. Models can perform classification either from images or based on morphological features. Image-based classification generally yields higher performance, but feature-based classification is more interpretable for clinicians. In this study, we employed a Multimodal neural network to classify white blood cells, utilizing a combination of images and morphological features. We compared this approach with image-only and feature-only training. While the highest performance was achieved with image-only training, the Multimodal model provided enhanced interpretability by the computation of SHAP values, and revealed crucial morphological features for biological characterization of the cells.


Subject(s)
Leukocytes , Neural Networks, Computer , Humans , Leukocytes/classification , Leukocytes/cytology
3.
Stud Health Technol Inform ; 316: 1385-1389, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176639

ABSTRACT

Interoperability is crucial to overcoming various challenges of data integration in the healthcare domain. While OMOP and FHIR data standards handle syntactic heterogeneity among heterogeneous data sources, ontologies support semantic interoperability to overcome the complexity and disparity of healthcare data. This study proposes an ontological approach in the context of the EUCAIM project to support semantic interoperability among distributed big data repositories that have applied heterogeneous cancer image data models using a semantically well-founded Hyperontology for the oncology domain.


Subject(s)
Semantics , Humans , Biological Ontologies , Health Information Interoperability , Medical Oncology , Neoplasms , Big Data
5.
PLoS One ; 19(6): e0304789, 2024.
Article in English | MEDLINE | ID: mdl-38829858

ABSTRACT

Malaria is a deadly disease that is transmitted through mosquito bites. Microscopists use a microscope to examine thin blood smears at high magnification (1000x) to identify parasites in red blood cells (RBCs). Estimating parasitemia is essential in determining the severity of the Plasmodium falciparum infection and guiding treatment. However, this process is time-consuming, labor-intensive, and subject to variation, which can directly affect patient outcomes. In this retrospective study, we compared three methods for measuring parasitemia from a collection of anonymized thin blood smears of patients with Plasmodium falciparum obtained from the Clinical Department of Parasitology-Mycology, National Reference Center (NRC) for Malaria in Paris, France. We first analyzed the impact of the number of field images on parasitemia count using our framework, MALARIS, which features a top-classifier convolutional neural network (CNN). Additionally, we studied the variation between different microscopists using two manual techniques to demonstrate the need for a reliable and reproducible automated system. Finally, we included thin blood smear images from an additional 102 patients to compare the performance and correlation of our system with manual microscopy and flow cytometry. Our results showed strong correlations between the three methods, with a coefficient of determination between 0.87 and 0.92.


Subject(s)
Malaria, Falciparum , Microscopy , Parasitemia , Plasmodium falciparum , Humans , Plasmodium falciparum/isolation & purification , Parasitemia/diagnosis , Parasitemia/blood , Parasitemia/parasitology , Malaria, Falciparum/diagnosis , Malaria, Falciparum/blood , Malaria, Falciparum/parasitology , Retrospective Studies , Microscopy/methods , Erythrocytes/parasitology , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Flow Cytometry/methods
6.
Sci Adv ; 10(19): eadj6990, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38728404

ABSTRACT

Mosquito-borne diseases like malaria are rising globally, and improved mosquito vector surveillance is needed. Survival of Anopheles mosquitoes is key for epidemiological monitoring of malaria transmission and evaluation of vector control strategies targeting mosquito longevity, as the risk of pathogen transmission increases with mosquito age. However, the available tools to estimate field mosquito age are often approximate and time-consuming. Here, we show a rapid method that combines matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry with deep learning for mosquito age prediction. Using 2763 mass spectra from the head, legs, and thorax of 251 field-collected Anopheles arabiensis mosquitoes, we developed deep learning models that achieved a best mean absolute error of 1.74 days. We also demonstrate consistent performance at two ecological sites in Senegal, supported by age-related protein changes. Our approach is promising for malaria control and the field of vector biology, benefiting other disease vectors like Aedes mosquitoes.
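
As an illustration of the metric reported above, the mean absolute error is the average absolute difference between predicted and true ages. The sketch below is a minimal example with made-up ages; the helper name `mean_absolute_error` and the values are assumptions for illustration and are not taken from the study.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages (in days)."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Illustrative values only; these are not data from the study.
true_ages = [2, 5, 9, 14]
predicted_ages = [3.0, 4.5, 11.0, 13.0]

mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE = {mae:.3f} days")
```

A lower value means predictions sit closer to the true age on average; the metric is expressed in the same unit as the ages themselves.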


Subject(s)
Anopheles , Deep Learning , Mosquito Vectors , Animals , Anopheles/physiology , Mosquito Vectors/physiology , Malaria/transmission , Malaria/prevention & control , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Senegal , Mass Spectrometry/methods , Aging/physiology
7.
J Am Med Inform Assoc ; 31(6): 1280-1290, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38573195

ABSTRACT

OBJECTIVE: To develop and validate a natural language processing (NLP) pipeline that detects 18 conditions in French clinical notes, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-enhancing workflow. MATERIALS AND METHODS: The detection pipeline relied on rule-based and machine learning algorithms for named entity recognition and entity qualification, respectively. We used a large language model pre-trained on millions of clinical notes along with annotated clinical notes in the context of 3 cohort studies related to oncology, cardiology, and rheumatology. The overall workflow was conceived to foster collaboration between studies while respecting the privacy constraints of the data warehouse. We estimated the added values of the advanced technologies and of the collaborative setting. RESULTS: The pipeline reached macro-averaged F1-score, positive predictive value, sensitivity, and specificity of 95.7 (95%CI 94.5-96.3), 95.4 (95%CI 94.0-96.3), 96.0 (95%CI 94.0-96.7), and 99.2 (95%CI 99.0-99.4), respectively. F1-scores were superior to those observed using alternative technologies or non-collaborative settings. The models were shared through a secured registry. CONCLUSIONS: We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate and use sensitive artificial intelligence models. In particular, we provided an efficient and robust NLP pipeline that detects conditions mentioned in clinical notes.
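
For readers unfamiliar with macro-averaging, the sketch below shows one common way such figures are obtained: each metric is computed per condition from its confusion counts, then averaged with equal weight across conditions. The condition names, the counts, and the helper names are hypothetical; the abstract does not describe the authors' exact computation.

```python
def per_class_metrics(tp, fp, fn, tn):
    """F1, PPV, sensitivity and specificity for one condition."""
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return {"f1": f1, "ppv": ppv,
            "sensitivity": sensitivity, "specificity": specificity}


def macro_average(counts_by_condition):
    """Unweighted mean of each metric over all conditions."""
    per_class = [per_class_metrics(**c) for c in counts_by_condition.values()]
    return {name: sum(m[name] for m in per_class) / len(per_class)
            for name in per_class[0]}


# Hypothetical confusion counts for two conditions (illustration only).
counts = {
    "diabetes": {"tp": 90, "fp": 10, "fn": 10, "tn": 890},
    "dementia": {"tp": 40, "fp": 10, "fn": 0, "tn": 950},
}

macro = macro_average(counts)
for name, value in macro.items():
    print(f"macro {name}: {100 * value:.1f}")
```

Because every condition carries the same weight, a rare condition with poor detection lowers the macro average as much as a frequent one would.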


Subject(s)
Electronic Health Records , Machine Learning , Natural Language Processing , Workflow , Humans , Data Warehousing , Algorithms , France , Confidentiality
8.
Npj Ment Health Res ; 3(1): 6, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38609541

ABSTRACT

There is an urgent need to monitor the mental health of large populations, especially during crises such as the COVID-19 pandemic, in order to identify the most at-risk subgroups in a timely manner and to design targeted prevention campaigns. We therefore developed and validated surveillance indicators related to suicidality: the monthly number of hospitalisations caused by suicide attempts and the prevalence among them of five known risk factors. They were automatically computed by analysing the electronic health records of fifteen university hospitals of the Paris area, France, using natural language processing algorithms based on artificial intelligence. We evaluated the relevance of these indicators by conducting a retrospective cohort study. Considering 2,911,920 records contained in a common data warehouse, we tested for changes after the pandemic outbreak in the slope of the monthly number of suicide attempts by conducting an interrupted time-series analysis. We segmented the assessment time into two sub-periods: before (August 1, 2017, to February 29, 2020) and during (March 1, 2020, to June 31, 2022) the COVID-19 pandemic. We detected 14,023 hospitalisations caused by suicide attempts. Their monthly number accelerated after the COVID-19 outbreak with an estimated trend variation reaching 3.7 (95%CI 2.1-5.3), mainly driven by an increase among girls aged 8-17 (trend variation 1.8, 95%CI 1.2-2.5). After the pandemic outbreak, acts of domestic, physical and sexual violence were more often reported (prevalence ratios: 1.3, 95%CI 1.16-1.48; 1.3, 95%CI 1.10-1.64 and 1.7, 95%CI 1.48-1.98), fewer patients died (p = 0.007) and stays were shorter (p < 0.001). Our study demonstrates that textual clinical data collected in multiple hospitals can be jointly analysed to compute timely indicators describing mental health conditions of populations.
Our findings also highlight the need to better take into account the violence imposed on women, especially at early ages and in the aftermath of the COVID-19 pandemic.

9.
JMIR Med Inform ; 12: e49607, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38596859

ABSTRACT

Background: Biomedical natural language processing tasks are best performed with English models, and translation tools have undergone major improvements. On the other hand, building annotated biomedical data sets remains a challenge. Objective: The aim of our study is to determine whether the use of English tools to extract and normalize French medical concepts based on translations provides comparable performance to that of French models trained on a set of annotated French clinical notes. Methods: We compared 2 methods: 1 involving French-language models and 1 involving English-language models. For the native French method, the named entity recognition and normalization steps were performed separately. For the translated English method, after the first translation step, we compared a 2-step method and a terminology-oriented method that performs extraction and normalization at the same time. We used French, English, and bilingual annotated data sets to evaluate all stages (named entity recognition, normalization, and translation) of our algorithms. Results: The native French method outperformed the translated English method, with an overall F1-score of 0.51 (95% CI 0.47-0.55), compared with 0.39 (95% CI 0.34-0.44) and 0.38 (95% CI 0.36-0.40) for the 2 English methods tested. Conclusions: Despite recent improvements in translation models, there is a significant difference in performance between the 2 approaches in favor of the native French method, which is more effective on French medical texts, even with few annotated documents.

10.
Methods Inf Med ; 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38442906

ABSTRACT

OBJECTIVE: The objective of this study is to address the critical issue of deidentification of clinical reports to allow access to data for research purposes, while ensuring patient privacy. The study highlights the difficulties faced in sharing tools and resources in this domain and presents the experience of the Greater Paris University Hospitals (AP-HP for Assistance Publique-Hôpitaux de Paris) in implementing a systematic pseudonymization of text documents from its Clinical Data Warehouse. METHODS: We annotated a corpus of clinical documents according to 12 types of identifying entities and built a hybrid system that merges the results of a deep learning model with those of manual rules. RESULTS AND DISCUSSION: Our results show an overall F1-score of 0.99. We discuss implementation choices and present experiments to better understand the effort involved in such a task, including dataset size, document types, language models, or rule addition. We share guidelines and code under a 3-Clause BSD license.

11.
Int Wound J ; 21(1): e14556, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38272802

ABSTRACT

Diabetic foot ulcers can have serious consequences for patients, such as amputation. The primary purpose of this study is to predict the amputation risk of diabetic foot patients using machine-learning classification algorithms. In this research, 407 patients treated for diabetic foot between January 2009 and September 2019 at the Department of Undersea and Hyperbaric Medicine, Istanbul University Faculty of Medicine, were retrospectively evaluated. Principal Component Analysis (PCA) was used to identify the key features associated with the amputation risk in diabetic foot patients within the dataset. Various prediction/classification models were then created to predict the "overall" risk of diabetic foot patients. Predictive machine-learning models were created using various algorithms. Additionally, Bayesian Optimization (BO) was used experimentally to optimize the hyperparameters of the Random Forest (RF) algorithm. The sub-dimension data set comprising categorical and numerical values was subjected to a feature selection procedure. Among all the algorithms tested under the defined experimental conditions, the BO-optimized "RF" based on the hybrid approach (PCA-RF-BO) and "Logistic Regression" algorithms demonstrated superior performance with 85% and 90% test accuracies, respectively. In conclusion, our findings would serve as an essential benchmark, offering valuable guidance in reducing such hazards.


Subject(s)
Diabetes Mellitus , Diabetic Foot , Humans , Diabetic Foot/surgery , Diabetic Foot/diagnosis , Retrospective Studies , Bayes Theorem , Algorithms , Amputation, Surgical
12.
Acta Obstet Gynecol Scand ; 103(3): 479-487, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38059396

ABSTRACT

INTRODUCTION: Since the 1970s, fetal scalp blood sampling (FSBS) has been used as a second-line test of the acid-base status of the fetus to evaluate fetal well-being during labor. The commonly employed thresholds that delineate normal pH (>7.25), subnormal (7.20-7.25), and pathological pH (<7.20) guide clinical decisions. However, these experience-based thresholds, based on observations and common sense, have yet to be confirmed. The aim of the study was to investigate if pH drop rate accelerates at the common thresholds (7.25 and 7.20) and to explore the possibility of identifying more accurate thresholds. MATERIAL AND METHODS: A retrospective study was conducted at a tertiary maternity hospital between June 2017 and July 2021. Patients with at least one FSBS during labor for category II fetal heart rate and delivery of a singleton cephalic infant were included. The rate of change in pH value between consecutive samples for each patient was calculated and plotted as a function of pH value. Linear regression models were used to model the evolution of the pH drop rate, estimating slope and standard errors across predefined pH intervals. Exploration of alternative pH action thresholds was conducted. To explore the independence of the association between pH value and pH drop rate, multiple linear regression adjusted for age, body mass index, parity, oxytocin stimulation and suspected small for gestational age was performed. RESULTS: We included 2047 patients with at least one FSBS (3467 FSBS in total) and 2047 umbilical cord blood pH measurements, for a total of 5514 pH samples. Median pH values were 7.29 at 1 h before delivery and 7.26 at 30 min before delivery. The pH drop was slow between 7.40 and 7.30, then became more pronounced, with median rates of 0.0005 units/min at 7.25 and 0.0013 units/min at 7.20. Out of the alternative pH thresholds, 7.26 and 7.20 demonstrated the best alignment with our dataset. 
Multiple linear regression revealed that only pH value was significantly associated with the rate of pH change. CONCLUSIONS: Our study confirms the validity and reliability of current guideline thresholds for fetal scalp pH in category II fetal heart rate.
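
The rate of change between consecutive samples described in the methods can be sketched as follows. This is a minimal illustration under stated assumptions: the sample times and pH values are invented, the helper name `ph_drop_rates` is hypothetical, and pairing each rate with the pH at the start of the interval is an assumption, since the abstract does not say which pH value the rate was plotted against.

```python
def ph_drop_rates(samples):
    """Compute pH drop rates between consecutive samples of one patient.

    samples: list of (minutes_since_first_sample, pH), sorted by time.
    Returns a list of (pH at start of interval, drop in pH units per minute).
    """
    rates = []
    for (t0, ph0), (t1, ph1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("sample times must be strictly increasing")
        rates.append((ph0, (ph0 - ph1) / (t1 - t0)))
    return rates


# Invented samples for a single patient (illustration only).
samples = [(0, 7.30), (60, 7.27), (100, 7.22)]

rates = ph_drop_rates(samples)
for start_ph, rate in rates:
    print(f"from pH {start_ph:.2f}: {rate:.5f} units/min")
```

A positive rate means the pH is falling; pooling such pairs across patients gives the drop rate as a function of pH that the regression models then summarize.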


Subject(s)
Labor, Obstetric , Scalp , Pregnancy , Humans , Female , Retrospective Studies , Reproducibility of Results , Labor, Obstetric/physiology , Fetus , Fetal Blood , Heart Rate, Fetal/physiology , Hydrogen-Ion Concentration , Fetal Monitoring
13.
Med Mycol ; 62(1)2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38142226

ABSTRACT

Aspergillosis of the newborn remains a rare but severe disease. We report four cases of primary cutaneous Aspergillus flavus infections in premature newborns linked to incubator contamination by putative clonal strains. Our objective was to evaluate the ability of matrix-assisted laser desorption/ionisation time of flight (MALDI-TOF) coupled to a convolutional neural network (CNN) for clone recognition in a context where only a very small number of strains are available for machine learning. Clinical and environmental A. flavus isolates (n = 64) were studied, of which 15 were epidemiologically related to the four cases. All strains were typed using microsatellite length polymorphism. We found a common genotype for 9/15 related strains. The isolates of this common genotype were selected to obtain a training dataset (6 clonal isolates/25 non-clonal) and a test dataset (3 clonal isolates/31 non-clonal), and spectra were analysed with a simple CNN model. On the test dataset using the CNN model, all 31 non-clonal isolates were correctly classified, 2/3 clonal isolates were unambiguously correctly classified, whereas the third strain was undetermined (i.e., the CNN model was unable to discriminate between GT8 and non-GT8). Clonal strains of A. flavus have persisted in the neonatal intensive care unit for several years. Indeed, two strains of A. flavus isolated from incubators in September 2007 are identical to the strain responsible for the second case that occurred 3 years later. MALDI-TOF is a promising tool for detecting clonal isolates of A. flavus using CNN even with a limited training set for limited cost and handling time.


Cutaneous aspergillosis is a rare but potentially fatal disease of the prematurely born infant. We describe here several cases due to Aspergillus flavus and link them to environmental strains using MLP genotyping and MALDI-TOF mass spectrometry coupled with artificial intelligence.


Subject(s)
Aspergillosis , Cross Infection , Animals , Aspergillus flavus/genetics , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/veterinary , Cross Infection/veterinary , Intensive Care Units, Neonatal , Aspergillosis/diagnosis , Aspergillosis/veterinary
14.
Yearb Med Inform ; 32(1): 146-151, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38147857

ABSTRACT

OBJECTIVES: To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select best papers published in 2022. METHOD: A bibliographic search using a combination of Medical Subject Headings (MeSH) descriptors and free-text terms on CRI was performed using PubMed, followed by a double-blind review in order to select a list of candidate best papers to be then peer-reviewed by external reviewers. After peer-review ranking, a consensus meeting between the two section editors and the editorial team was organized to finally conclude on the selected three best papers. RESULTS: Among the 1,324 papers returned by the search, published in 2022, that were in the scope of the various areas of CRI, the full review process selected four best papers. The first best paper describes the process undertaken in Germany, under the national Medical Informatics Initiative, to define a process and to gain multi-decision-maker acceptance of broad consent for the reuse of health data for research whilst remaining compliant with the European General Data Protection Regulation. The authors of the second-best paper present a federated architecture for the conduct of clinical trial feasibility queries that utilizes HL7 Fast Healthcare Interoperability Resources and an HL7 standard query representation. The third best paper aligns with the overall theme of this Yearbook, the inclusivity of potential participants in clinical trials, with recommendations to ensure greater equity. The fourth proposes a multi-modal modelling approach for large scale phenotyping from electronic health record information. This year's survey paper has also examined equity, along with data bias, and found that the relevant publications in 2022 have focused almost exclusively on the issue of bias in Artificial Intelligence (AI). 
CONCLUSIONS: The literature relevant to CRI in 2022 has largely been dominated by publications that seek to maximise the reusability of wide scale and representative electronic health record information for research, either as big data for distributed analysis or as a source of information from which to identify suitable patients accurately and equitably for invitation to participate in clinical trials.


Subject(s)
Artificial Intelligence , Medical Informatics , Humans , Electronic Health Records , Big Data , Peer Review
15.
Cancer Med ; 12(22): 20918-20929, 2023 11.
Article in English | MEDLINE | ID: mdl-37909210

ABSTRACT

BACKGROUND: The SARS-CoV-2 pandemic disrupted healthcare systems. We compared the cancer stage for new breast cancers (BCs) before and during the pandemic. METHODS: We performed a retrospective multicenter cohort study on the data warehouse of Greater Paris University Hospitals (AP-HP). We identified all female patients newly referred with a BC in 2019 and 2020. We assessed the timeline of their care trajectories, initial tumor stage, and treatment received: BC resection, exclusive systemic therapy, exclusive radiation therapy, or exclusive best supportive care (BSC). We calculated patients' 1-year overall survival (OS) and compared indicators in 2019 and 2020. RESULTS: In 2019 and 2020, 2055 and 1988 new BC patients, respectively, underwent cancer treatment, and during the two lockdowns, the BC diagnoses varied by -18% and by +23% compared to 2019. De novo metastatic tumors (15% and 15%, p = 0.95), pTNM and ypTNM distributions of 1332 cases with upfront resection and of 296 cases with neoadjuvant therapy did not differ (p = 0.37, p = 0.3). The median times from first multidisciplinary meeting and from diagnosis to treatment of 19 days (interquartile 11-39 days) and 35 days (interquartile 22-65 days) did not differ. Access to plastic surgery (15% and 17%, p = 0.08) and to treatment categories did not vary: tumor resection (73% and 72%), exclusive systemic therapy (13% and 14%), exclusive radiation therapy (9% and 9%), exclusive BSC (5% and 5%) (p = 0.8). Among resected patients, the neoadjuvant therapy rate was lower in 2019 (16%) versus 2020 (20%) (p = 0.02). One-year OS rates were 99.3% versus 98.9% (HR = 0.96; 95% CI, 0.77-1.2), 72.6% versus 76.6% (HR = 1.28; 95% CI, 0.95-1.72), 96.6% versus 97.8% (HR = 1.09; 95% CI, 0.61-1.94), and 15.5% versus 15.1% (HR = 0.99; 95% CI, 0.72-1.37), in the treatment groups, respectively. CONCLUSIONS: Despite a decrease in the number of new BCs, there was no tumor stage shift, and OS did not vary.


Subject(s)
Breast Neoplasms , COVID-19 , Humans , Female , Breast Neoplasms/diagnosis , Breast Neoplasms/epidemiology , Breast Neoplasms/therapy , Pandemics , Cohort Studies , COVID-19/epidemiology , Communicable Disease Control , Retrospective Studies
16.
Rev Epidemiol Sante Publique ; 71(6): 102189, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37972522

ABSTRACT

OBJECTIVES: Medico-administrative data are promising to automate the calculation of Healthcare Quality and Safety Indicators (HQSI). Nevertheless, not all relevant indicators can be calculated with this data alone. The objective of our feasibility study is to 1) analyze the availability of data sources; 2) analyze the availability of each indicator's elementary variables; and 3) apply natural language processing to automatically retrieve such information. METHOD: We performed a multicenter cross-sectional observational feasibility study on the clinical data warehouse of Assistance Publique - Hôpitaux de Paris (AP-HP). We studied the management of breast cancer patients treated at AP-HP between January 2019 and June 2021, and the quality indicators published by the European Society of Breast Cancer Specialists, using claims data from the Programme de Médicalisation du Système d'Information (PMSI) and pathology reports. For each indicator, we calculated the number (%) of patients for whom all necessary data sources were available, and the number (%) of patients for whom all elementary variables were available in the sources, and for whom the related HQSI was computable. To extract useful data from the free text reports, we developed and validated dedicated rule-based algorithms, whose performance was assessed with recall, precision, and F1-score. RESULTS: Out of 5785 female patients diagnosed with a breast cancer (60.9 years, IQR [50.0-71.9]), 5147 (89.0%) had procedures related to breast cancer recorded in the PMSI, and 3732 (72.5%) had at least one surgery. Out of the 34 key indicators, 9 could be calculated with the PMSI alone, and 6 others became so using the data from pathology reports. Ten elementary variables were needed to calculate the 6 indicators combining the PMSI and pathology reports. The necessary sources were available for 58.8% to 94.6% of patients, depending on the indicators. 
The extraction algorithms developed had an average accuracy of 76.5% (min-max [32.7%-93.3%]), an average precision of 77.7% [10.0%-97.4%] and an average sensitivity of 71.6% [2.8%-100.0%]. Once these algorithms were applied, the variables needed to calculate the indicators were extracted for 2% to 88% of patients, depending on the indicators. DISCUSSION: The availability of medical reports in the electronic health records, of the elementary variables within the reports, and the performance of the extraction algorithms limit the population for which the indicators can be calculated. CONCLUSIONS: The automated calculation of quality indicators from electronic health records is a prospect that comes up against many practical obstacles.


Subject(s)
Breast Neoplasms , Female , Humans , Breast Neoplasms/epidemiology , Breast Neoplasms/therapy , Cross-Sectional Studies , Electronic Health Records , Natural Language Processing , Quality Indicators, Health Care
17.
PLOS Digit Health ; 2(9): e0000369, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37773923

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pdig.0000298.].

18.
Int J Cancer ; 153(12): 1988-1996, 2023 12 15.
Article in English | MEDLINE | ID: mdl-37539961

ABSTRACT

The SARS-CoV-2 pandemic disrupted healthcare systems. We assessed its impact on the presentation, care trajectories and outcomes of new pancreatic cancers (PCs) in the Paris area. We performed a retrospective multicenter cohort study on the data warehouse of Greater Paris University Hospitals (AP-HP). We identified all patients newly referred with a PC between January 1, 2019, and June 30, 2021, and excluded endocrine tumors. Using claims data and health records, we analyzed the timeline of care trajectories, the initial tumor stage, and the treatment categories: pancreatectomy, exclusive systemic therapy or exclusive best supportive care (BSC). We calculated patients' 1-year overall survival (OS) and compared indicators in 2019 and 2020 to 2021. We included 2335 patients. Referral fell by 29% during the first lockdown. The median times from biopsy and from first MDM to treatment were 25 days (16-50) and 21 days (11-40), respectively. Between 2019 and 2020 to 2021, the rate of metastatic tumors (36% vs 33%, P = .39), the pTNM distribution of the 464 cases with upfront tumor resection (P = .80), and the proportion of treatment categories did not vary: tumor resection (32% vs 33%), exclusive systemic therapy (49% vs 49%), exclusive BSC (19% vs 19%). The 1-year OS rates in 2019 vs 2020 to 2021 were 92% vs 89% (aHR = 1.42; 95% CI, 0.82-2.48), 52% vs 56% (aHR = 0.88; 95% CI, 0.73-1.08), 13% vs 10% (aHR = 1.00; 95% CI, 0.78-1.25), in the treatment categories, respectively. Despite an initial decrease in the number of new PCs, we did not observe any stage shift. OS did not vary significantly.


Subject(s)
COVID-19 , Pancreatic Neoplasms , Humans , SARS-CoV-2 , Cohort Studies , COVID-19/epidemiology , Communicable Disease Control , Pancreatic Neoplasms/epidemiology , Pancreatic Neoplasms/therapy , Retrospective Studies , Pancreatic Neoplasms
19.
PLOS Digit Health ; 2(7): e0000298, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37410797

ABSTRACT

Real-world data (RWD) holds great promise for improving the quality of care. However, specific infrastructures and methodologies are required to derive robust knowledge and bring innovations to the patient. Drawing upon the national case study of the governance of the 32 French regional and university hospitals, we highlight key aspects of modern clinical data warehouses (CDWs): governance, transparency, types of data, data reuse, technical tools, documentation, and data quality control processes. Semi-structured interviews as well as a review of reported studies on French CDWs were conducted from March to November 2022. Out of 32 regional and university hospitals in France, 14 had a CDW in production, 5 were experimenting, 5 had a prospective CDW project, and 8 did not have any CDW project at the time of writing. The implementation of CDWs in France dates from 2011 and accelerated in late 2020. From this case study, we draw some general guidelines for CDWs. The current orientation of CDWs towards research requires efforts in governance stabilization, standardization of data schema, and development in data quality and data documentation. Particular attention must be paid to the sustainability of the warehouse teams and to the multilevel governance. The transparency of the studies and of the data transformation tools must improve to allow successful multicentric data reuses as well as innovations in routine care.

20.
JCO Clin Cancer Inform ; 7: e2200179, 2023 05.
Article in English | MEDLINE | ID: mdl-37167578

ABSTRACT

PURPOSE: To compare the computability of Observational Medical Outcomes Partnership (OMOP)-based queries related to prescreening of patients using two versions of the OMOP common data model (CDM; v5.3 and v5.4) and to assess the performance of the Greater Paris University Hospital (APHP) prescreening tool. MATERIALS AND METHODS: We identified the prescreening information items being relevant for prescreening of patients with cancer. We randomly selected 15 academic and industry-sponsored urology phase I-IV clinical trials (CTs) launched at APHP between 2016 and 2021. The computability of the related prescreening criteria (PC) was defined by their translation rate in OMOP-compliant queries and by their execution rate on the APHP clinical data warehouse (CDW) containing data of 205,977 patients with cancer. The overall performance of the prescreening tool was assessed by the rate of true- and false-positive cases of three randomly selected CTs. RESULTS: We defined a list of 15 minimal information items being relevant for patients' prescreening. We identified 83 PC of the 534 eligibility criteria from the 15 CTs. We translated 33 and 62 PC in queries on the basis of OMOP CDM v5.3 and v5.4, respectively (translation rates of 40% and 75%, respectively). Of the 33 PC translated in the v5.3 of the OMOP CDM, 19 could be executed on the APHP CDW (execution rate of 58%). Of 83 PC, the computability rate on the APHP CDW reached 23%. On the basis of three CTs, we identified 17, 32, and 63 patients as being potentially eligible for inclusion in those CTs, resulting in positive predictive values of 53%, 41%, and 21%, respectively. CONCLUSION: We showed that PC could be formalized according to the OMOP CDM and that the oncology extension increased their translation rate through better representation of cancer natural history.
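
The rates quoted in the results follow from simple ratios of the reported counts. The sketch below recomputes them from the figures in the abstract; the variable names are ours, and treating the 23% computability rate as executed criteria over all 83 PC is an assumption that happens to match the reported value.

```python
def percent(numerator, denominator):
    """Ratio expressed as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Counts reported in the abstract.
total_pc = 83         # prescreening criteria identified
translated_v53 = 33   # PC translated with OMOP CDM v5.3
translated_v54 = 62   # PC translated with OMOP CDM v5.4
executed_v53 = 19     # translated v5.3 PC executable on the APHP CDW

translation_rate_v53 = percent(translated_v53, total_pc)
translation_rate_v54 = percent(translated_v54, total_pc)
execution_rate_v53 = percent(executed_v53, translated_v53)
# Assumption: computability = executed PC over all identified PC.
computability_rate = percent(executed_v53, total_pc)

print(translation_rate_v53, translation_rate_v54,
      execution_rate_v53, computability_rate)
```

Read this way, computability is the product of the translation and execution steps, which is why it is lower than either rate taken alone.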


Subject(s)
Urologic Neoplasms , Urology , Humans , Data Warehousing , Databases, Factual , Urologic Neoplasms/diagnosis , Urologic Neoplasms/therapy