Search | BVS Regional Portal

Results 1 - 20 of 26

BarlowTwins-CXR: enhancing chest X-ray abnormality localization in heterogeneous data with cross-domain self-supervised learning.

Sheng, Haoyue; Ma, Linrui; Samson, Jean-François; Liu, Dianbo.

BMC Med Inform Decis Mak ; 24(1): 126, 2024 May 16.

Article in English | MEDLINE | ID: mdl-38755563

ABSTRACT

BACKGROUND: Chest X-ray imaging-based abnormality localization, essential in diagnosing various diseases, faces significant clinical challenges due to complex interpretations and the growing workload of radiologists. While recent advances in deep learning offer promising solutions, there is still a critical issue of domain inconsistency in cross-domain transfer learning, which hampers the efficiency and accuracy of diagnostic processes. This study aims to address the domain inconsistency problem and improve automatic abnormality localization performance of heterogeneous chest X-ray image analysis, particularly in detecting abnormalities, by developing a self-supervised learning strategy called "BarlowTwins-CXR". METHODS: We utilized two publicly available datasets: the NIH Chest X-ray Dataset and the VinDr-CXR. The BarlowTwins-CXR approach was conducted in a two-stage training process. Initially, self-supervised pre-training was performed using an adjusted Barlow Twins algorithm on the NIH dataset with a ResNet50 backbone pre-trained on ImageNet. This was followed by supervised fine-tuning on the VinDr-CXR dataset using Faster R-CNN with Feature Pyramid Network (FPN). The study employed mean Average Precision (mAP) at an Intersection over Union (IoU) of 50% and Area Under the Curve (AUC) for performance evaluation. RESULTS: Our experiments showed a significant improvement in model performance with BarlowTwins-CXR. The approach achieved a 3% increase in mAP50 accuracy compared to traditional ImageNet pre-trained models. In addition, the Ablation CAM method revealed enhanced precision in localizing chest abnormalities. The study involved 112,120 images from the NIH dataset and 18,000 images from the VinDr-CXR dataset, indicating robust training and testing samples. CONCLUSION: BarlowTwins-CXR significantly enhances the efficiency and accuracy of chest X-ray image-based abnormality localization, outperforming traditional transfer learning methods and effectively overcoming domain inconsistency in cross-domain scenarios. Our experiment results demonstrate the potential of using self-supervised learning to improve the generalizability of models in medical settings with limited amounts of heterogeneous data. This approach can be instrumental in aiding radiologists, particularly in high-workload environments, offering a promising direction for future AI-driven healthcare solutions.
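
The abstract evaluates localization with mAP at an IoU of 50%, meaning a predicted box counts as a hit only when its Intersection over Union with a ground-truth box reaches 0.5. The following is a minimal sketch of that criterion, not the authors' evaluation code. It assumes boxes are given as (x_min, y_min, x_max, y_max) tuples; the function names `iou` and `is_hit_at_50` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Assumed box layout (not taken from the paper): (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_hit_at_50(pred_box, truth_box):
    """True when the prediction satisfies the IoU >= 0.5 matching rule."""
    return iou(pred_box, truth_box) >= 0.5
```

Average precision is then computed per class from the hits and misses ranked by detection confidence, and mAP50 is the mean over classes; that ranking step is omitted here.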

Subject(s)

Radiography, Thoracic; Supervised Machine Learning; Humans; Deep Learning; Radiographic Image Interpretation, Computer-Assisted/methods; Datasets as Topic

Natural Language Processing Methods to Empirically Explore Social Contexts and Needs in Cancer Patient Notes.

Derton, Abigail; Guevara, Marco; Chen, Shan; Moningi, Shalini; Kozono, David E; Liu, Dianbo; Miller, Timothy A; Savova, Guergana K; Mak, Raymond H; Bitterman, Danielle S.

JCO Clin Cancer Inform ; 7: e2200196, 2023 05.

Article in English | MEDLINE | ID: mdl-37235847

ABSTRACT

PURPOSE: There is an unmet need to empirically explore and understand drivers of cancer disparities, particularly social determinants of health. We explored natural language processing methods to automatically and empirically extract clinical documentation of social contexts and needs that may underlie disparities. METHODS: This was a retrospective analysis of 230,325 clinical notes from 5,285 patients treated with radiotherapy from 2007 to 2019. We compared linguistic features among White versus non-White, low-income insurance versus other insurance, and male versus female patients' notes. Log odds ratios with an informative Dirichlet prior were calculated to compare words over-represented in each group. A variational autoencoder topic model was applied, and topic probability was compared between groups. The presence of machine-learnable bias was explored by developing statistical and neural demographic group classifiers. RESULTS: Terms associated with varied social contexts and needs were identified for all demographic group comparisons. For example, notes of non-White and low-income insurance patients were over-represented with terms associated with housing and transportation, whereas notes of White and other insurance patients were over-represented with terms related to physical activity. Topic models identified a social history topic, and topic probability varied significantly between the demographic group comparisons. Classification models performed poorly at classifying notes of non-White and low-income insurance patients (F1 of 0.30 and 0.23, respectively). CONCLUSION: Exploration of linguistic differences in clinical notes between patients of different race/ethnicity, insurance status, and sex identified social contexts and needs in patients with cancer and revealed high-level differences in notes. Future work is needed to validate whether these findings may play a role in cancer disparities.
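
The log odds ratio with an informative Dirichlet prior mentioned in the methods can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' code: it follows the commonly used formulation in which the prior is a dictionary of pseudo-counts per word (for example, background-corpus frequencies), and it returns a z-score per word. The function name `log_odds_dirichlet` and its arguments are hypothetical.

```python
import math


def log_odds_dirichlet(counts_a, counts_b, prior):
    """Z-scored log odds ratio of each word between two groups of notes.

    counts_a, counts_b: word -> count in group A / group B.
    prior: word -> positive pseudo-count (the informative Dirichlet prior).
    Only words present in `prior` are scored (an assumption of this sketch),
    and the vocabulary is assumed to contain at least two words.
    """
    n_a = sum(counts_a.get(w, 0) for w in prior)
    n_b = sum(counts_b.get(w, 0) for w in prior)
    a0 = sum(prior.values())

    z_scores = {}
    for word, a_w in prior.items():
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        # Smoothed log odds of the word within each group.
        log_odds_a = math.log((y_a + a_w) / (n_a + a0 - y_a - a_w))
        log_odds_b = math.log((y_b + a_w) / (n_b + a0 - y_b - a_w))
        delta = log_odds_a - log_odds_b
        # Approximate variance of the difference.
        variance = 1.0 / (y_a + a_w) + 1.0 / (y_b + a_w)
        z_scores[word] = delta / math.sqrt(variance)
    return z_scores
```

A positive score marks a word as over-represented in group A, a negative score as over-represented in group B; the prior shrinks scores for rare words toward zero.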

Subject(s)

Natural Language Processing; Neoplasms; Humans; Male; Female; Retrospective Studies; Social Environment; Neoplasms/diagnosis; Neoplasms/epidemiology; Neoplasms/therapy

Spatio-temporal heterogeneity in the international trade resilience during COVID-19.

Luo, Wei; He, Lingfeng; Yang, Zihui; Zhang, Shirui; Wang, Yong; Liu, Dianbo; Hu, Sheng; He, Li; Xia, Jizhe; Chen, Min.

Appl Geogr ; 154: 102923, 2023 May.

Article in English | MEDLINE | ID: mdl-36915293

ABSTRACT

The COVID-19 pandemic and subsequent lockdowns have created immeasurable health and economic crises, leading to unprecedented disruptions to world trade. The COVID-19 pandemic shows diverse impacts on different economies that suffer and recover at different rates and degrees. This research aims to evaluate the spatio-temporal heterogeneity of international trade network vulnerabilities in the current crisis to understand global production resilience and prepare for future crises. We applied a series of complex network analysis approaches to the monthly international trade networks at the world, regional, and country scales for the pre- and post-COVID-19 outbreak periods. The spatio-temporal patterns indicate that countries and regions with effective COVID-19 containment, such as East Asia, show the strongest resilience, especially Mainland China, followed by high-income countries with fast vaccine roll-out (e.g., the U.S.), whereas low-income countries (e.g., in Africa) show high vulnerability. Our results encourage a comprehensive strategy to enhance international trade resilience when facing future pandemic threats, including effective non-pharmaceutical measures, timely development and rollout of vaccines, strong governance capacity, robust healthcare systems, and equality via international cooperation. The overall findings elicit the hidden global trading disruption, recovery, and growth due to the adverse impact of the COVID-19 pandemic.

Confederated learning in healthcare: Training machine learning models using disconnected data separated by individual, data type and identity for Large-Scale health system Intelligence.

Liu, Dianbo; Fox, Kathe; Weber, Griffin; Miller, Tim.

J Biomed Inform ; 134: 104151, 2022 10.

Article in English | MEDLINE | ID: mdl-35872264

ABSTRACT

BACKGROUND: A patient's health information is generally fragmented across silos because it follows how care is delivered: multiple providers in multiple settings. Though it is technically feasible to reunite data for analysis in a manner that underpins a rapid learning healthcare system, privacy concerns and regulatory barriers limit data centralization for this purpose. OBJECTIVES: Machine learning can be conducted in a federated manner on patient datasets with the same set of variables but separated across storage. But federated learning cannot handle the situation where different data types for a given patient are separated vertically across different organizations and when patient ID matching across different institutions is difficult. We call methods that enable machine learning model training on data separated by two or more dimensions "confederated machine learning", which we aim to develop in this study. METHODS: We propose and evaluate confederated learning for training machine learning models to stratify the risk of several diseases among silos when data are horizontally separated by individual, vertically separated by data type, and separated by identity without patient ID matching. The confederated learning method can be intuitively understood as a distributed learning method with representation learning, generative model, imputation method and data augmentation elements. RESULTS: Our confederated learning method achieves AUCROC (Area Under The Curve Receiver Operating Characteristics) of 0.787 for diabetes prediction, 0.718 for psychological disorders prediction, and 0.698 for ischemic heart disease prediction using nationwide health insurance claims. CONCLUSION: Our proposed confederated learning method successfully trained machine learning models on health insurance data separated by two or more dimensions.

Subject(s)

Delivery of Health Care; Machine Learning; Humans; Intelligence; Privacy; ROC Curve

Machine learning approaches to predicting no-shows in pediatric medical appointment.

Liu, Dianbo; Shin, Won-Yong; Sprecher, Eli; Conroy, Kathleen; Santiago, Omar; Wachtel, Gal; Santillana, Mauricio.

NPJ Digit Med ; 5(1): 50, 2022 Apr 20.

Article in English | MEDLINE | ID: mdl-35444260

ABSTRACT

Patients' no-shows, scheduled but unattended medical appointments, have a direct negative impact on patients' health, due to discontinuity of treatment and late presentation to care. They also lead to inefficient use of medical resources in hospitals and clinics. The ability to predict a likely no-show in advance could enable the design and implementation of interventions to reduce the risk of it happening, thus improving patients' care and clinical resource allocation. In this study, we develop a new interpretable deep learning-based approach for predicting the risk of no-shows at the time when a medical appointment is first scheduled. The retrospective study was conducted in an academic pediatric teaching hospital with a 20% no-show rate. Our approach tackles several challenges in the design of a predictive model by (1) adopting a data imputation method for patients with missing information in their records (77% of the population), (2) exploiting local weather information to improve predictive accuracy, and (3) developing an interpretable approach that explains how a prediction is made for each individual patient. Our proposed neural network-based and logistic regression-based methods outperformed persistence baselines. In an unobserved set of patients, our method correctly identified 83% of no-shows at the time of scheduling and led to a false alert rate of less than 17%. Our method is capable of producing meaningful predictions even when some information in a patient's records is missing. We find that patients' past no-show record is the strongest predictor. Finally, we discuss several potential interventions to reduce no-shows, such as scheduling appointments of high-risk patients at off-peak times, which can serve as a starting point for further studies on no-show interventions.

Using Artificial Neural Network Condensation to Facilitate Adaptation of Machine Learning in Medical Settings by Reducing Computational Burden: Model Design and Evaluation Study.

Liu, Dianbo; Zheng, Ming; Sepulveda, Nestor Andres.

JMIR Form Res ; 5(12): e20767, 2021 Dec 08.

Article in English | MEDLINE | ID: mdl-34889747

ABSTRACT

BACKGROUND: Machine learning applications in the health care domain can have a great impact on people's lives. At the same time, medical data is usually big, requiring a significant amount of computational resources. Although this might not be a problem for the wide adoption of machine learning tools in high-income countries, the availability of computational resources can be limited in low-income countries and on mobile devices. This can limit many people from benefiting from the advancement in machine learning applications in the field of health care. OBJECTIVE: In this study, we explore three methods to increase the computational efficiency and reduce model sizes of either recurrent neural networks (RNNs) or feedforward deep neural networks (DNNs) without compromising their accuracy. METHODS: We used inpatient mortality prediction as our case analysis upon review of an intensive care unit dataset. We reduced the size of RNN and DNN by applying pruning of "unused" neurons. Additionally, we modified the RNN structure by adding a hidden layer to the RNN cell but reducing the total number of recurrent layers to accomplish a reduction of the total parameters used in the network. Finally, we implemented quantization on DNN by forcing the weights to be 8 bits instead of 32 bits. RESULTS: We found that all methods increased implementation efficiency, including training speed, memory size, and inference speed, without reducing the accuracy of mortality prediction. CONCLUSIONS: Our findings suggest that neural network condensation allows for the implementation of sophisticated neural network algorithms on devices with lower computational resources.
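
The quantization step, forcing weights to 8 bits instead of 32, can be illustrated with a generic sketch. The abstract does not state which quantization scheme was used, so the symmetric linear scheme below is an assumption, and the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale.

    Symmetric linear quantization is assumed here; the study's exact scheme
    is not given in the abstract.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when every weight is zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [q * scale for q in quantized]
```

Each stored weight then needs one byte plus a single shared scale, and the reconstruction error per weight is at most half the scale.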

High-throughput 5' UTR engineering for enhanced protein production in non-viral gene therapies.

Cao, Jicong; Novoa, Eva Maria; Zhang, Zhizhuo; Chen, William C W; Liu, Dianbo; Choi, Gigi C G; Wong, Alan S L; Wehrspaun, Claudia; Kellis, Manolis; Lu, Timothy K.

Nat Commun ; 12(1): 4138, 2021 07 06.

Article in English | MEDLINE | ID: mdl-34230498

ABSTRACT

Despite significant clinical progress in cell and gene therapies, maximizing protein expression in order to enhance potency remains a major technical challenge. Here, we develop a high-throughput strategy to design, screen, and optimize 5' UTRs that enhance protein expression from a strong human cytomegalovirus (CMV) promoter. We first identify naturally occurring 5' UTRs with high translation efficiencies and use this information with in silico genetic algorithms to generate synthetic 5' UTRs. A total of ~12,000 5' UTRs are then screened using a recombinase-mediated integration strategy that greatly enhances the sensitivity of high-throughput screens by eliminating copy number and position effects that limit lentiviral approaches. Using this approach, we identify three synthetic 5' UTRs that outperform commonly used non-viral gene therapy plasmids in expressing protein payloads. In summary, we demonstrate that high-throughput screening of 5' UTR libraries with recombinase-mediated integration can identify genetic elements that enhance protein expression, which should have numerous applications for engineered cell and gene therapies.

Subject(s)

5' Untranslated Regions/genetics; Genetic Engineering; Genetic Therapy; Algorithms; Cell Line; Gene Expression; HEK293 Cells; High-Throughput Screening Assays; Humans; Plasmids; Promoter Regions, Genetic; Recombinases

FeARH: Federated machine learning with anonymous random hybridization on electronic medical records.

Cui, Jianfei; Zhu, He; Deng, Hao; Chen, Ziwei; Liu, Dianbo.

J Biomed Inform ; 117: 103735, 2021 05.

Article in English | MEDLINE | ID: mdl-33711540

ABSTRACT

Electronic medical records are restricted and difficult to centralize for machine learning model training due to privacy and regulatory issues. One solution is to train models in a distributed manner that involves many parties in the process. However, sometimes certain parties are not trustworthy, and in this project, we aim to propose an alternative method to traditional federated learning with a central analyzer in order to conduct training in a situation without a trustworthy central analyzer. The proposed algorithm is called "federated machine learning with anonymous random hybridization (abbreviated as 'FeARH')", mainly using a hybridization algorithm to degenerate the integration of connections between medical record data and models' parameters by adding randomization into the parameter sets shared with other parties. Based on our experiment, our new algorithm has similar AUCROC and AUCPR results compared with machine learning in a centralized manner and original federated machine learning.

Subject(s)

Electronic Health Records; Machine Learning; Algorithms; Privacy; Research Design

Patients dispensed medications with actionable pharmacogenomic biomarkers: rates and characteristics.

Liu, Dianbo; Olson, Karen L; Manzi, Shannon F; Mandl, Kenneth D.

Genet Med ; 23(4): 782-786, 2021 04.

Article in English | MEDLINE | ID: mdl-33420348

ABSTRACT

PURPOSE: Pharmacogenomic biomarkers are increasingly listed on medication labels and authoritative guidelines but pharmacogenomic-guided prescribing is not yet common. Our objective was to assess the potential for incorporating knowledge of patients' genomic characteristics into prescribing practices. METHODS: We performed a retrospective analysis of claims data for 2,096,971 beneficiaries with pharmacy coverage from a national, commercial health insurance plan between January 2017 and December 2019. Children between 0 and 17 years comprised 21% of the cohort. Adults were aged 18 to 64. Medications with actionable pharmacogenomic biomarkers (MAPBs) were identified using public information from the US Food and Drug Administration (FDA), Clinical Pharmacogenomics Implementation Consortium (CPIC), and PharmGKB. RESULTS: MAPBs were dispensed to 63% of the adults and 29% of the children in the cohort. Most frequently dispensed were ibuprofen, ondansetron, codeine, and oxycodone. Most common were medications with CYP2D6, G6PD, or CYP2C19 pharmacogenomic biomarkers. Ten percent of the cohort were codispensed more than one MAPB for at least 30 days. CONCLUSION: The number of people who might benefit from pharmacogenomic-guided prescribing is substantial. Future work should address obstacles to integrating genomic data into prescriber workflows, complex factors contributing to the magnitude of benefit, and the clinical availability of reliable on-demand or pre-emptive pharmacogenomic testing.

Subject(s)

Pharmacogenetics; Pharmacogenomic Testing; Adolescent; Adult; Biomarkers; Child; Drug Labeling; Humans; Middle Aged; Retrospective Studies; Young Adult

Stochastic Channel-Based Federated Learning With Neural Network Pruning for Medical Data Privacy Preservation: Model Development and Experimental Validation.

Shao, Rulin; He, Hongyu; Chen, Ziwei; Liu, Hui; Liu, Dianbo.

JMIR Form Res ; 4(12): e17265, 2020 Dec 22.

Article in English | MEDLINE | ID: mdl-33350391

ABSTRACT

BACKGROUND: Artificial neural networks have achieved unprecedented success in the medical domain. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to take control over their sensitive information during both the training and using processes. OBJECTIVE: To address security and privacy issues, we propose a privacy-preserving method for the analysis of distributed medical data. The proposed method, termed stochastic channel-based federated learning (SCBFL), enables participants to train a high-performance model cooperatively and in a distributed manner without sharing their inputs. METHODS: We designed, implemented, and evaluated a channel-based update algorithm for a central server in a distributed system. The update algorithm will select the channels with regard to the most active features in a training loop, and then upload them as learned information from local datasets. A pruning process, which serves as a model accelerator, was further applied to the algorithm based on the validation set. RESULTS: We constructed a distributed system consisting of 5 clients and 1 server. Our trials showed that the SCBFL method can achieve an area under the receiver operating characteristic curve (AUC-ROC) of 0.9776 and an area under the precision-recall curve (AUC-PR) of 0.9695 with only 10% of channels shared with the server. Compared with the federated averaging algorithm, the proposed SCBFL method achieved a 0.05388 higher AUC-ROC and 0.09695 higher AUC-PR. In addition, our experiment showed that 57% of the time is saved by the pruning process with only a reduction of 0.0047 in AUC-ROC performance and a reduction of 0.0068 in AUC-PR performance. CONCLUSIONS: In this experiment, our model demonstrated better performance and a higher saturating speed than the federated averaging method, which reveals all of the parameters of local models to the server. The saturation rate of performance could be promoted by introducing a pruning process and further improvement could be achieved by tuning the pruning rate.
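
For context on the baseline named in the results, the federated averaging algorithm has the server combine full client parameter vectors, weighted by local sample counts. The sketch below shows that baseline only, not SCBFL's channel selection, and represents each model as a flat list of floats for simplicity; the function name `federated_average` is hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted mean of client parameter vectors.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples held by each client.
    This is the baseline in which every parameter is sent to the server.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

The abstract's point of contrast is that SCBFL uploads only a subset of channels (10% in the reported trials) rather than the complete vectors averaged here.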
