Search | VHL Regional Portal

AlpaPICO: Extraction of PICO frames from clinical trial documents using LLMs.

Ghosh, Madhusudan; Mukherjee, Shrimon; Ganguly, Asmit; Basuchowdhuri, Partha; Naskar, Sudip Kumar; Ganguly, Debasis.

Methods; 226: 78-88, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38643910

ABSTRACT

In recent years, there has been a surge in the publication of clinical trial reports, making it challenging to conduct systematic reviews. Automatically extracting Population, Intervention, Comparator, and Outcome (PICO) from clinical trial studies can alleviate the traditionally time-consuming process of manually scrutinizing them for systematic reviews. Existing approaches to PICO frame extraction are supervised and rely on manually annotated data points in the form of BIO label tagging. Recent approaches, such as In-Context Learning (ICL), which has been shown to be effective for a number of downstream NLP tasks, require the use of labeled examples. In this work, we adopt an ICL strategy that draws on the knowledge a Large Language Model (LLM) gathers during its pretraining phase to automatically extract PICO-related terminology from clinical trial documents in an unsupervised setup, bypassing the need for a large number of annotated data instances. Additionally, to show the highest effectiveness of LLMs in an oracle scenario where a large number of annotated samples is available, we adopt an instruction tuning strategy that employs Low-Rank Adaptation (LoRA) to train a very large model in a low-resource environment for the PICO frame extraction task. More specifically, both proposed frameworks use AlpaCare as the base LLM and employ few-shot in-context learning and instruction tuning techniques to extract PICO-related terms from clinical trial reports. We applied these approaches to widely used coarse-grained datasets such as EBM-NLP and EBM-COMET and to fine-grained datasets such as EBM-NLPrev and EBM-NLPh. Our empirical results show that the proposed ICL-based framework produces comparable results on all versions of the EBM-NLP datasets, and that the proposed instruction-tuned version of our framework produces state-of-the-art results on all the different EBM-NLP datasets.
Our project is available at https://github.com/shrimonmuke0202/AlpaPICO.git.
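
The abstract refers to annotations "in the form of BIO label tagging". As a minimal sketch of what that encoding looks like, the Python below collects labeled spans from a token sequence and its BIO tags. The label names (POP, INT, CMP), the example sentence, and the function name bio_to_spans are illustrative assumptions; they are not taken from the paper or from the EBM-NLP datasets.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from parallel token and BIO tag lists.

    "B-X" opens a span of type X, "I-X" continues an open span of type X,
    and "O" marks a token outside any span. An "I-X" tag with no matching
    open span is ignored in this sketch.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close the one in progress, if any.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open span.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes any open span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    # Flush a span that runs to the end of the sequence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical sentence and tags, made up for illustration only.
tokens = ["Adults", "with", "asthma", "received", "budesonide", "or", "placebo", "."]
tags = ["B-POP", "I-POP", "I-POP", "O", "B-INT", "O", "B-CMP", "O"]

for label, text in bio_to_spans(tokens, tags):
    print(f"{label}: {text}")
```

Supervised extractors are trained to predict one such tag per token, which is why they depend on token-level manual annotation.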

Subject(s)

Clinical Trials as Topic, Natural Language Processing, Humans, Clinical Trials as Topic/methods, Data Mining/methods, Machine Learning

D155Y substitution of SARS-CoV-2 ORF3a weakens binding with Caveolin-1.

Gupta, Suchetana; Mallick, Ditipriya; Banerjee, Kumarjeet; Mukherjee, Shrimon; Sarkar, Soumyadev; Lee, Sonny Tm; Basuchowdhuri, Partha; Jana, Siddhartha S.

Comput Struct Biotechnol J; 20: 766-778, 2022.

Article in English | MEDLINE | ID: mdl-35126886

ABSTRACT

The clinical manifestation of the recent pandemic COVID-19, caused by the novel SARS-CoV-2 virus, varies from mild to severe respiratory illness. Although environmental, demographic and co-morbidity factors have an impact on the severity of the disease, the contribution of mutations in each of the viral genes to the degree of severity needs to be better understood in order to design better therapeutic approaches against COVID-19. The Open Reading Frame-3a (ORF3a) protein has been found to be mutated at several positions. In this work, we have studied the effect of one of the most frequently occurring mutants, D155Y of ORF3a protein, found in Indian COVID-19 patients. Using computational simulations, we demonstrated that the substitution at position 155 changed the amino acids involved in salt bridge formation, hydrogen-bond occupancy, interactome clusters, and the stability of the protein compared with the other substitutions found in Indian patients. Protein-protein docking using HADDOCK revealed that the D155Y substitution weakened the binding affinity of ORF3a for caveolin-1 compared with the other substitutions, suggesting its importance in the overall stability of the ORF3a-caveolin-1 complex, which may modulate the virulence of SARS-CoV-2.
