Results 1-20 of 44
1.
JCO Precis Oncol ; 8: e2300718, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38976829

ABSTRACT

PURPOSE: To use modern machine learning approaches to enhance and automate feature extraction from longitudinal circulating tumor DNA (ctDNA) data and to improve the prediction of survival and disease progression, risk stratification, and treatment strategies for patients with first-line (1L) non-small cell lung cancer (NSCLC). MATERIALS AND METHODS: Using IMpower150 trial data on patients with untreated metastatic NSCLC treated with atezolizumab and chemotherapies, we developed a machine learning algorithm to extract predictive features from ctDNA kinetics, improving survival and progression prediction. We analyzed kinetic data from 17 ctDNA summary markers, including cell-free DNA concentration, allele frequency, tumor molecules in plasma, and mutation counts. RESULTS: Three hundred and ninety-eight patients with ctDNA data (206 in training and 192 in validation) were analyzed. Our models outperformed the existing workflow that uses conventional temporal ctDNA features, raising the overall survival (OS) concordance index to 0.72 and 0.71 from 0.67 and 0.63 for C3D1 and C4D1, respectively, and substantially improving the progression-free survival (PFS) concordance index to approximately 0.65 from the previous 0.54-0.58, a 12%-20% increase. Additionally, they enhanced risk stratification for patients with NSCLC, achieving clear OS and PFS separation. Distinct patterns of ctDNA kinetic characteristics (eg, baseline ctDNA markers, depth of ctDNA responses, and timing of ctDNA clearance) were revealed across the risk groups. Rapid and complete ctDNA clearance appears essential for long-term clinical benefit. CONCLUSION: Our machine learning approach offers a novel tool for analyzing ctDNA kinetics, extracting critical features from longitudinal data, improving our understanding of the link between ctDNA kinetics and progression/mortality risks, and optimizing personalized immunotherapies for 1L NSCLC.
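The concordance index reported in this abstract can be illustrated with a minimal sketch of Harrell's C-index for right-censored data. This is a generic textbook computation and not the authors' pipeline; the function name `concordance_index` and the toy inputs are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    times  -- observed follow-up times
    events -- 1 if the event (death/progression) was observed, 0 if censored
    risks  -- predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.63-0.72 values should be read.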


Subjects
Carcinoma, Non-Small-Cell Lung , Circulating Tumor DNA , Disease Progression , Immunotherapy , Lung Neoplasms , Machine Learning , Humans , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/blood , Carcinoma, Non-Small-Cell Lung/mortality , Carcinoma, Non-Small-Cell Lung/pathology , Circulating Tumor DNA/blood , Lung Neoplasms/genetics , Lung Neoplasms/blood , Lung Neoplasms/drug therapy , Lung Neoplasms/pathology , Lung Neoplasms/mortality , Immunotherapy/methods , Male , Female , Middle Aged , Antibodies, Monoclonal, Humanized/therapeutic use , Aged , Progression-Free Survival
2.
Eur J Cancer ; 207: 114147, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38834016

ABSTRACT

BACKGROUND: We aimed to compare the prognostic value of organ-specific dynamics with that of the sum of the longest diameter (SLD) dynamics in patients with metastatic colorectal cancer (mCRC). METHODS: All datasets are accessible in Project Data Sphere, an open-access platform. Tumor growth inhibition models based on organ-level SLD and overall SLD were used to estimate the organ-specific tumor growth rates (KGs) and the SLD KG. The early tumor shrinkage (ETS) from baseline to the first measurement after treatment was also evaluated. The relationship between organ-specific dynamics, SLD dynamics, and survival outcomes (overall survival, OS; progression-free survival, PFS) was quantified using Kaplan-Meier analysis and Cox regression. RESULTS: This study included 3687 patients from 6 phase III mCRC trials. The liver emerged as the most frequent metastatic site (2901, 78.7%), with variable KGs across different organs in individual patients (liver 0.0243 > lung 0.0202 > lymph node 0.0127 > other 0.0118 [week⁻¹]). Notably, the dynamics for different organs did not contribute equally to predicting survival outcomes. In liver metastasis cases, liver KG proved to be a superior prognostic indicator for OS and surpassed the predictive performance of SLD KG (C-index, liver KG 0.610 vs SLD KG 0.606). A similar result was found for PFS. Moreover, liver ETS also outperformed SLD ETS in predicting survival. Cox regression analysis confirmed that liver KG was the most significant variable in survival prediction. CONCLUSIONS: In mCRC patients with liver metastasis, liver dynamics is the primary prognostic indicator for both PFS and OS. In future drug development for mCRC, greater emphasis should be directed towards understanding the dynamics of liver metastasis development.
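The two quantities used in this abstract, the tumor growth rate (KG) and early tumor shrinkage (ETS), can be sketched as follows. The abstract does not state the functional form of the tumor growth inhibition model, so the biexponential form below is an assumption chosen for illustration, as are the function names and the shrinkage rate `ks`; only the liver KG of 0.0243 per week comes from the abstract.

```python
import math


def tgi_size(t, y0, ks, kg):
    """Tumor size at time t (weeks) under an ASSUMED biexponential model:
    y(t) = y0 * (exp(-ks * t) + exp(kg * t) - 1),
    where ks is the shrinkage rate and kg the growth rate (both per week)."""
    return y0 * (math.exp(-ks * t) + math.exp(kg * t) - 1.0)


def early_tumor_shrinkage(baseline, first_followup):
    """Relative reduction in size from baseline to the first
    post-treatment measurement (positive values mean shrinkage)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - first_followup) / baseline


# Hypothetical liver lesion: 50 mm at baseline, ks = 0.05 per week (assumed),
# kg = 0.0243 per week (liver KG reported in the abstract).
liver_week8 = tgi_size(8, y0=50.0, ks=0.05, kg=0.0243)
liver_ets = early_tumor_shrinkage(50.0, liver_week8)
```

Under this assumed form, a larger KG pulls the curve back up sooner after the initial shrinkage, which is the behavior the organ-level comparison relies on.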


Subjects
Colorectal Neoplasms , Humans , Colorectal Neoplasms/pathology , Colorectal Neoplasms/mortality , Male , Female , Prognosis , Liver Neoplasms/secondary , Liver Neoplasms/mortality , Middle Aged , Aged , Progression-Free Survival , Clinical Trials, Phase III as Topic
3.
JCO Clin Cancer Inform ; 8: e2300154, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38231003

ABSTRACT

PURPOSE: To apply deep learning algorithms to histopathology images, construct image-based subtypes independent of known clinical and molecular classifications for glioblastoma, and produce novel insights into molecular and immune characteristics of the glioblastoma tumor microenvironment. MATERIALS AND METHODS: Using whole-slide hematoxylin and eosin images from 214 patients with glioblastoma in The Cancer Genome Atlas (TCGA), a fine-tuned convolutional neural network model extracted deep learning features. Biclustering was used to identify subtypes and image feature modules. Prognostic value of image subtypes was assessed via Cox regression on survival outcomes and validated with 189 samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data set. Morphological, molecular, and immune characteristics of glioblastoma image subtypes were analyzed. RESULTS: Four distinct subtypes and modules (imClust1-4) were identified for the TCGA patients with glioblastoma on the basis of the image feature data. The glioblastoma image subtypes were significantly associated with overall survival (OS; P = .028) and progression-free survival (P = .003). An apparent association was also observed for disease-specific survival (P = .096). imClust2 had the best prognosis for all three survival end points (eg, after 25 months, imClust2 had >7% more surviving patients than the other subtypes). Examination of OS in the external validation using the unseen CPTAC data set showed consistent patterns. Multivariable Cox analyses confirmed that the image subtypes carry unique prognostic information independent of known clinical and molecular predictors. Molecular and immune profiling revealed distinct immune compositions of the tumor microenvironment in different image subtypes and may provide biologic explanations for the patterns in patients' outcomes. CONCLUSION: Our image-based subtype classification on the basis of deep learning models is a novel tool to refine risk stratification in cancers. The image subtypes detected for glioblastoma represent a promising prognostic biomarker with distinct molecular and immune characteristics and may facilitate developing novel, individualized immunotherapies for glioblastoma.


Subjects
Biological Products , Deep Learning , Glioblastoma , Humans , Glioblastoma/diagnostic imaging , Prognosis , Proteomics , Tumor Microenvironment
4.
Comput Biol Chem ; 109: 108009, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38219419

ABSTRACT

Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, although such algorithms could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm, based on a spectral method, to detect mutually exclusive biclusters. MESBC simultaneously detects subgroups of relevant features (genes) and corresponding conditions (patients) and, therefore, automatically uses the signature features for each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize the treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.


Subjects
Adenocarcinoma of Lung , Carcinoma, Non-Small-Cell Lung , Carcinoma, Squamous Cell , Lung Neoplasms , Humans , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Profiling/methods , Algorithms , Lung Neoplasms/genetics
5.
Clin Pharmacol Ther ; 115(4): 805-814, 2024 04.
Article in English | MEDLINE | ID: mdl-37724436

ABSTRACT

Pretreatment serum lactate dehydrogenase (LDH) levels have been associated with poor prognosis in several types of cancer, including metastatic colorectal cancer (mCRC). However, very few models link survival to longitudinal LDH measured repeatedly over time during treatment. We investigated the prognostic value of on-treatment LDH dynamics in mCRC. Using data from two large phase III studies (2L and 3L+ mCRC settings, n = 824 and 210, respectively), we found that integrating longitudinal LDH data with baseline risk factors significantly improved survival prediction. Current LDH values performed best, enhancing discrimination ability (area under the receiver operating characteristic curve) by 4.5-15.4% and prediction accuracy (Brier score) by 3.9-15.0% compared with baseline variables. Combining all longitudinal LDH markers further improved predictive performance. After controlling for baseline covariates and other longitudinal LDH indicators, current LDH levels remained a significant risk factor in mCRC, increasing mortality risk by over 90% (P < 0.001) in 2L patients and 60-70% (P < 0.01) in 3L+ patients per unit increment in current log(LDH). Machine-learning techniques, such as functional principal component analysis (FPCA), extracted informative features from longitudinal LDH data, capturing over 99% of variability and allowing prediction of survival. Unsupervised clustering based on the extracted FPCA features stratified patients into three groups with distinct LDH dynamics and survival outcomes. Hence, our approaches offer a valuable and cost-effective way to stratify risk and improve survival prediction in mCRC using LDH trajectories.
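The Brier score used above as the measure of prediction accuracy can be sketched in a few lines. This simplified version assumes a binary outcome at a fixed time horizon with no censoring, which is an assumption made for illustration and not the time-dependent, censoring-aware version a survival analysis would normally use; the function names are hypothetical.

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted event probabilities
    and observed binary outcomes (0 = no event, 1 = event). Lower is better."""
    if not outcomes or len(outcomes) != len(probabilities):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - y) ** 2 for y, p in zip(outcomes, probabilities)]
    return sum(squared) / len(squared)


def relative_improvement(baseline_score, new_score):
    """Fractional reduction in Brier score relative to a baseline model."""
    return (baseline_score - new_score) / baseline_score


# Hypothetical example: an uninformative model versus a sharper one.
observed = [1, 0, 1, 0]
uninformative = brier_score(observed, [0.5, 0.5, 0.5, 0.5])
sharper = brier_score(observed, [0.8, 0.2, 0.7, 0.3])
gain = relative_improvement(uninformative, sharper)
```

Read this way, an improvement of 3.9-15.0% means the squared prediction error fell by that fraction once longitudinal LDH was added to the baseline variables.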


Subjects
Colorectal Neoplasms , L-Lactate Dehydrogenase , p-Chloroamphetamine/analogs & derivatives , Humans , Prognosis , Risk Factors , Retrospective Studies
6.
Am J Pathol ; 193(12): 2122-2132, 2023 12.
Article in English | MEDLINE | ID: mdl-37775043

ABSTRACT

In digital pathology tasks, transformers have achieved state-of-the-art results, surpassing convolutional neural networks (CNNs). However, transformers are usually complex and resource intensive. This study developed a novel and efficient digital pathology classifier called DPSeq to predict cancer biomarkers through fine-tuning a sequencer architecture integrating horizontal and vertical bidirectional long short-term memory networks. Using hematoxylin and eosin-stained histopathologic images of colorectal cancer from two international data sets (The Cancer Genome Atlas and Molecular and Cellular Oncology), the predictive performance of DPSeq was evaluated in a series of experiments. DPSeq demonstrated exceptional performance for predicting key biomarkers in colorectal cancer (microsatellite instability status, hypermutation, CpG island methylator phenotype status, BRAF mutation, TP53 mutation, and chromosomal instability), outperforming most published state-of-the-art classifiers in a within-cohort internal validation and a cross-cohort external validation. In addition, under the same experimental conditions using the same set of training and testing data sets, DPSeq surpassed four CNNs (ResNet18, ResNet50, MobileNetV2, and EfficientNet) and two transformer (Vision Transformer and Swin Transformer) models, achieving the highest area under the receiver operating characteristic curve and area under the precision-recall curve values in predicting microsatellite instability status, BRAF mutation, and CpG island methylator phenotype status. Furthermore, DPSeq required less time for both training and prediction because of its simple architecture. Therefore, DPSeq appears to be the preferred choice over transformer and CNN models for predicting cancer biomarkers.


Subjects
Biomarkers, Tumor , Colorectal Neoplasms , Humans , Biomarkers, Tumor/genetics , Proto-Oncogene Proteins B-raf/genetics , Microsatellite Instability , DNA Methylation/genetics , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , CpG Islands/genetics
7.
Nutrients ; 15(9), 2023 May 04.
Article in English | MEDLINE | ID: mdl-37432361

ABSTRACT

Several studies have demonstrated that adhering to the Dietary Approaches to Stop Hypertension (DASH) diet may result in decreased blood pressure levels and hypertension risk. This may be an effect of a reduction in central obesity. In the current study, we explored the mediating role of multiple anthropometric measurements in the association between DASH score and hypertension risk, and we investigated potential common micro/macro nutrients that interact with the obesity-reduction mechanism. Our study used data from the National Health and Nutrition Examination Survey (NHANES). Important demographic variables, such as gender, race, age, marital status, education attainment, and poverty income ratio, and lifestyle habits, such as smoking, alcohol drinking, and physical activity, were collected. Various anthropometric measurements, including weight, waist circumference, body mass index (BMI), and waist-to-height ratio (WHtR), were also obtained from the official website. The nutrient intake of 8224 adults was quantified through a combination of interviews and laboratory tests. We conducted stepwise regression to select the most important anthropometric measurements and performed a multiple mediation analysis to test whether the selected anthropometric measurements had mediation effects on the total effect of the DASH diet on hypertension. Random forest models were fitted to identify nutrient subsets associated with the DASH score and anthropometric measurements. Finally, associations between common nutrients and DASH score, anthropometric measurements, and risk of hypertension were each evaluated by a logistic regression model adjusting for possible confounders. Our study revealed that BMI and WHtR acted as full mediators between DASH score and high blood pressure levels. Together, they accounted for more than 45% of the variation in hypertension. Interestingly, WHtR was found to be the strongest mediator, explaining approximately 80% of the mediating effect. Furthermore, we identified a group of three commonly consumed nutrients (sodium, potassium, and octadecatrienoic acid) that had opposing effects on DASH score and anthropometric measurements. These nutrients were also found to be associated with hypertension in the same way as BMI and WHtR in univariate regression models. The most important among these nutrients was sodium, which was negatively correlated with the DASH score (β = -0.53, 95% CI = -0.56 to -0.50, p < 0.001) and had a positive association with BMI (β = 0.04, 95% CI = 0.01 to 0.07, p = 0.02), WHtR (β = 0.06, 95% CI = 0.03 to 0.09, p < 0.001), and hypertension (OR = 1.09, 95% CI = 1.01 to 1.19, p = 0.037). Our investigation revealed that the WHtR exerts a greater mediating effect than BMI on the correlation between the DASH diet and hypertension. Notably, we identified a plausible nutrient intake pathway involving sodium, potassium, and octadecatrienoic acid. Our findings suggest that lifestyle modifications that emphasize the reduction of central obesity and the attainment of a well-balanced micro/macro nutrient profile, such as the DASH diet, could potentially be efficacious in managing hypertension.
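The idea of a mediating effect can be sketched with the classic product-of-coefficients decomposition. The study above used a multiple mediation analysis with a binary hypertension outcome; the version below is a deliberately simplified linear, single-mediator sketch, and the function names and toy numbers are assumptions made for illustration only.

```python
def _cross(u, v):
    """Centered cross-product sum of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation_effects(x, m, y):
    """Decompose the total effect of x on y into a direct part and an
    indirect part that runs through the mediator m (ordinary least squares)."""
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ValueError("x and m must vary and must not be collinear")
    a = sxm / sxx                          # effect of x on m
    b = (sxx * smy - sxm * sxy) / det      # effect of m on y, adjusting for x
    direct = (smm * sxy - sxm * smy) / det  # effect of x on y, adjusting for m
    total = sxy / sxx                      # effect of x on y without m
    indirect = a * b
    return {"a": a, "b": b, "direct": direct, "indirect": indirect,
            "total": total, "proportion_mediated": indirect / total}


# Hypothetical toy data: x = exposure score, m = mediator, y = outcome.
toy_x = [0, 1, 2, 3, 4, 5]
toy_m = [1, 1, 3, 7, 9, 9]
toy_y = [0.5 * xi + 3.0 * mi for xi, mi in zip(toy_x, toy_m)]
toy_fit = mediation_effects(toy_x, toy_m, toy_y)
```

In this linear setting the total effect equals the direct effect plus the indirect effect, and `proportion_mediated` plays the role of the share of the effect explained by a mediator such as WHtR.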


Subjects
Dietary Approaches To Stop Hypertension , Hypertension , Adult , Humans , Nutrition Surveys , Obesity, Abdominal/epidemiology , Diet , Eating , Hypertension/epidemiology , Obesity/epidemiology , Sodium
8.
Clin Pharmacokinet ; 62(5): 705-713, 2023 05.
Article in English | MEDLINE | ID: mdl-36930421

ABSTRACT

BACKGROUND AND OBJECTIVE: The designs of first-in-human (FIH) studies in oncology (e.g., 3 + 3 dose escalation design) usually do not provide a sufficient sample size to determine the dose-response relationship for efficacy. This study aimed to assess the feasibility of using monoclonal antibody (mAb) clearance as a biomarker for efficacy to facilitate the identification of potentially efficacious doses across cancer types and drug targets. METHODS: We performed electronic searches of the Drugs@FDA website, the European Medicines Agency website, and PubMed to identify reports of FIH trials of approved mAbs in oncology. The clearance, half-life, and overall response rate (ORR) data for the mAbs at different dose levels were extracted. RESULTS: Twenty-five approved mAbs were included in this study. As expected, due to the small sample sizes in FIH studies, there was no clear dose-response for ORR. However, we found a clear negative association between mAb clearance and ORR across tumors/drug targets, and a clear negative dose-clearance relationship, with clearance decreasing with dose and saturating at high dose levels. The approved mAb doses (1-25 mg/kg) are approximately 2-fold the saturation doses (1-10 mg/kg). The associated clearance values at the approved doses vary across different cancers and drug targets (0.17-1.56 L/day) but tend to be similar within a disease/drug target. Anti-CD20 mAbs for B-cell lymphomas show a higher clearance (~1 L/day) than mAbs for other cancers and targets (e.g., ~0.3 L/day for anti-PD-1). CONCLUSIONS: Clearance of mAbs can be a tumor/drug target-agnostic biomarker for potential anti-tumor activity, as clearance decreases with increasing ORR. Our findings provide important insights into target clearance values that may lead to desired efficacy for different cancers and drug targets, which can be used to guide dose selection for the future development of mAbs during FIH oncology studies.


Subjects
Antibodies, Monoclonal , Neoplasms , Humans , Antibodies, Monoclonal/therapeutic use , Neoplasms/drug therapy , Half-Life , Biomarkers, Tumor
9.
Comput Med Imaging Graph ; 105: 102189, 2023 04.
Article in English | MEDLINE | ID: mdl-36739752

ABSTRACT

Self-attention mechanism-based algorithms are attractive in digital pathology due to their interpretability, but suffer from high computational complexity. This paper presents a novel, lightweight Attention-based Multiple Instance Mutation Learning (AMIML) model to allow small-scale attention operations for predicting gene mutations. Compared to the standard self-attention model, AMIML reduces the number of model parameters by approximately 70%. Using data for 24 clinically relevant genes from four cancer cohorts in TCGA studies (UCEC, BRCA, GBM, and KIRC), we compare AMIML with a standard self-attention model, five other deep learning models, and four traditional machine learning models. The results show that AMIML has excellent robustness and outperforms all the baseline algorithms in the vast majority of the tested genes. Conversely, the performance of the reference deep learning and machine learning models varies across different genes and is suboptimal for certain genes. Furthermore, with the flexible and interpretable attention-based pooling mechanism, AMIML can further zero in on and detect predictive image patches.
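The attention-based pooling mentioned at the end of this abstract can be illustrated with a generic sketch of attention pooling for multiple-instance learning. This is not the AMIML architecture, whose details are not given here. In a trained model the attention scores come from a small learned network, whereas below they are passed in directly, and all names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, attention_scores):
    """Combine per-patch feature vectors into one slide-level vector.

    Each patch is weighted by the softmax of its attention score, so
    high-scoring patches dominate the pooled representation. The weights
    are returned as well because they indicate which patches mattered."""
    weights = softmax(attention_scores)
    dim = len(patch_features[0])
    pooled = [sum(w * f[k] for w, f in zip(weights, patch_features))
              for k in range(dim)]
    return pooled, weights


# Hypothetical slide with three patches and 2-dimensional features.
patches = [[1.0, 0.0], [3.0, 2.0], [0.0, 5.0]]
slide_vector, patch_weights = attention_pool(patches, [0.1, 2.0, -1.0])
most_predictive_patch = patch_weights.index(max(patch_weights))
```

Ranking patches by weight is what makes this kind of pooling interpretable: the patches with the largest weights are the candidates for predictive image regions.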


Subjects
Algorithms , Machine Learning
10.
J Pathol Clin Res ; 9(3): 223-235, 2023 05.
Article in English | MEDLINE | ID: mdl-36723384

ABSTRACT

Many artificial intelligence models have been developed to predict clinically relevant biomarkers for colorectal cancer (CRC), including microsatellite instability (MSI). However, existing deep learning networks require large training datasets, which are often hard to obtain. In this study, based on the latest Hierarchical Vision Transformer using Shifted Windows (Swin Transformer [Swin-T]), we developed an efficient workflow to predict biomarkers in CRC (MSI, hypermutation, chromosomal instability, CpG island methylator phenotype, and BRAF and TP53 mutation) that required relatively small datasets. Our Swin-T workflow achieved state-of-the-art (SOTA) predictive performance in an intra-study cross-validation experiment on the Cancer Genome Atlas colon and rectal cancer dataset (TCGA-CRC-DX). It also demonstrated excellent generalizability in cross-study external validation and delivered a SOTA area under the receiver operating characteristic curve (AUROC) of 0.90 for MSI, using the Molecular and Cellular Oncology dataset for training (N = 1,065) and the TCGA-CRC-DX (N = 462) for testing. A similar performance (AUROC = 0.91) was reported in a recent study that used ~8,000 training samples (ResNet18) on the same testing dataset. Swin-T was extremely efficient when using small training datasets and exhibited robust predictive performance with 200-500 training samples. Our findings indicate that Swin-T could be 5-10 times more efficient than existing algorithms for MSI prediction based on ResNet18 and ShuffleNet. Furthermore, the Swin-T models accurately predicted MSI and BRAF mutation status, which could be used to exclude samples and thereby reduce the number that proceed to subsequent standard testing in a cascading diagnostic workflow, in turn reducing turnaround time and costs.
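The AUROC values quoted above can be read as the probability that a randomly chosen positive case (for example, an MSI tumor) receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the function name and toy scores are assumptions for illustration, and the quadratic loop is meant for clarity only.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- 1 for positive cases, 0 for negative cases
    scores -- model scores, higher meaning more likely positive
    Ties between a positive and a negative score count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four slides, two positive and two negative.
toy_auc = auroc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

An AUROC of 0.90 therefore means that in about nine out of ten such positive-negative pairs the positive case is ranked higher.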


Subjects
Colonic Neoplasms , Colorectal Neoplasms , Humans , Microsatellite Instability , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , Proto-Oncogene Proteins B-raf/genetics , Artificial Intelligence , DNA Methylation , Biomarkers , Colonic Neoplasms/genetics
11.
J Pathol Clin Res ; 9(1): 3-17, 2023 01.
Article in English | MEDLINE | ID: mdl-36376239

ABSTRACT

Deep learning models are increasingly being used to interpret whole-slide images (WSIs) in digital pathology and to predict genetic mutations. Currently, it is commonly assumed that tumor regions have most of the predictive power. However, it is reasonable to assume that other tissues from the tumor microenvironment may also provide important predictive information. In this paper, we propose an unsupervised clustering-based multiple-instance deep learning model for the prediction of genetic mutations using WSIs of three cancer types obtained from The Cancer Genome Atlas. Our proposed model facilitates the identification of spatial regions related to specific gene mutations and exclusion of patches that lack predictive information through the use of unsupervised clustering. This results in a more accurate prediction of gene mutations when compared with models using all image patches on WSIs and two recently published algorithms for all three different cancer types evaluated in this study. In addition, our study validates the hypothesis that the prediction of gene mutations solely based on tumor regions on WSI slides may not always provide the best performance. Other tissue types in the tumor microenvironment could provide a better prediction ability than tumor tissues alone. These results highlight the heterogeneity in the tumor microenvironment and the importance of identification of predictive image patches in digital pathology prediction tasks.


Subjects
Deep Learning , Humans , Cluster Analysis , Mutation , Tumor Microenvironment/genetics , Algorithms
12.
J Pathol Inform ; 13: 100115, 2022.
Article in English | MEDLINE | ID: mdl-36268072

ABSTRACT

Background: Due to the lack of annotated pathological images, transfer learning has been the predominant approach in the field of digital pathology. Pre-trained neural networks based on the ImageNet database are often used to extract "off-the-shelf" features, achieving great success in predicting tissue types, molecular features, clinical outcomes, and more. We hypothesize that fine-tuning the pre-trained models using histopathological images could further improve feature extraction and downstream prediction performance. Methods: We used 100,000 annotated H&E image patches for colorectal cancer (CRC) to fine-tune a pre-trained Xception model via a 2-step approach. The features extracted from the fine-tuned Xception (FTX-2048) model and the ImageNet-pretrained (IMGNET-2048) model were compared through: (1) tissue classification for H&E images from CRC, the same image type that was used for fine-tuning; (2) prediction of immune-related gene expression; and (3) prediction of gene mutations for lung adenocarcinoma (LUAD). Five-fold cross validation was used for model performance evaluation. Each experiment was repeated 50 times. Findings: The extracted features from the fine-tuned FTX-2048 exhibited significantly higher accuracy (98.4%) for predicting tissue types of CRC compared to the "off-the-shelf" features directly from Xception based on the ImageNet database (96.4%) (P = 2.2 × 10⁻⁶). In particular, FTX-2048 markedly improved the accuracy for stroma from 87% to 94%. Similarly, features from FTX-2048 boosted the prediction of transcriptomic expression of immune-related genes in LUAD. For the genes that had significant relationships with image features (P < 0.05, n = 171), the features from the fine-tuned model improved the prediction for the majority of the genes (139; 81%). In addition, features from FTX-2048 improved prediction of mutation for 5 out of the 9 most frequently mutated genes (STK11, TP53, LRP1B, NF1, and FAT1) in LUAD. Conclusions: We proved the concept that fine-tuning the pretrained ImageNet neural networks with histopathology images can produce higher quality features and better prediction performance, not only for same-cancer tissue classification, where similar images from the same cancer are used for fine-tuning, but also for cross-cancer prediction of gene expression and mutation at the patient level.

13.
Comput Biol Chem ; 99: 107697, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35636264

ABSTRACT

The naïve empirical Bayes method has been widely used as an ad hoc tool in fitting linear mixed-effect models because it is much more computationally efficient than the maximum likelihood estimation method. However, the shrinkage effect of the empirical Bayes method causes bias in the estimates of the fixed effects. Bias correction has been proposed for the mixed-effects model when only one covariate is present. In this paper, we derive the shrinkage factor of the empirical Bayes predictors of the random effects and the variance-covariance matrix of the corrected estimates when the model has more than one covariate. The empirical Bayes estimates and test statistics are then corrected using the derived factor. Theoretical derivations, simulation studies, and a real data application demonstrate the validity of the proposed method in that the corrected estimates are unbiased and the corrected tests have correct p-values.
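The shrinkage bias described above can be made concrete in the simplest textbook setting. The sketch below assumes a balanced random-intercept model with a single covariate and known variance components; it is not the multi-covariate derivation of the paper, and the symbols $\omega^2$, $\sigma^2$, $\lambda$, and $\gamma$ are notation introduced here for illustration.

```latex
% Random-intercept model, n_i observations for subject i (illustrative assumption)
y_{ij} = \beta_0 + b_i + \varepsilon_{ij}, \qquad
b_i \sim N(0, \omega^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% Empirical Bayes predictor of b_i and its shrinkage factor
\hat{b}_i = \lambda_i \, (\bar{y}_i - \beta_0), \qquad
\lambda_i = \frac{\omega^2}{\omega^2 + \sigma^2 / n_i}

% If the random effect truly depends on a covariate, b_i = \gamma x_i + \eta_i,
% and n_i = n for all i (so \lambda_i = \lambda), then
\hat{b}_i = \lambda \, (\gamma x_i + \eta_i + \bar{\varepsilon}_i)

% so regressing \hat{b}_i on x_i estimates \lambda\gamma instead of \gamma,
% and dividing by the shrinkage factor removes the attenuation:
\hat{\gamma}_{\mathrm{corrected}} = \hat{\gamma}_{\mathrm{naive}} / \lambda
```

Because $0 < \lambda < 1$, the naive slope is pulled toward zero, and the pull is stronger when there are few observations per subject or the residual variance is large.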


Subjects
Bayes Theorem , Computer Simulation , Linear Models
14.
J Cancer Res Clin Oncol ; 148(8): 1955-1963, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35332389

ABSTRACT

PURPOSE: Most patients with Stage II/III colorectal cancer (CRC) can be cured by surgery alone, and only certain CRC patients benefit from adjuvant chemotherapy. Risk stratification based on deep learning from haematoxylin and eosin (H&E) images has been postulated as a potential predictive biomarker for benefit from adjuvant chemotherapy. However, despite recent advances in artificial intelligence, very limited success has been achieved in using biomarkers, including deep-learning-based markers, to facilitate the decision for adjuvant chemotherapy. METHODS: We trained and internally validated CRCNet using 780 Stage II/III CRC patients from Molecular and Cellular Oncology (MCO). Independent external validation of the model was performed using 337 Stage II/III CRC patients from The Cancer Genome Atlas (TCGA). RESULTS: CRCNet stratified the patients into high-, medium-, and low-risk subgroups. Multivariate Cox regression analyses confirmed that CRCNet risk groups are statistically significant after adjusting for existing risk factors. The high-risk subgroup significantly benefits from adjuvant chemotherapy. Hazard ratios (chemo-treated vs untreated) of 0.2 (95% confidence interval [CI], 0.05-0.65; P = 0.009) and 0.6 (95% CI, 0.42-0.98; P = 0.038) were observed in the TCGA and MCO fluorouracil-treated patients, respectively. Conversely, no significant benefit from chemotherapy was observed in the low- and medium-risk groups (P = 0.2-1). CONCLUSION: This retrospective analysis provides further evidence that H&E image-based biomarkers may potentially be of great use in delivering treatments following surgery for Stage II/III CRC, improving patient survival, and avoiding unnecessary treatment and associated toxicity. It warrants further validation on other datasets and prospective confirmation in clinical trials.


Subjects
Colorectal Neoplasms , Deep Learning , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Artificial Intelligence , Biomarkers, Tumor/genetics , Chemotherapy, Adjuvant , Colorectal Neoplasms/pathology , Fluorouracil/therapeutic use , Humans , Neoplasm Staging , Prognosis , Prospective Studies , Retrospective Studies
15.
Genetica ; 149(5-6): 313-325, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34480683

ABSTRACT

Reducing false discoveries caused by population stratification (PS) has always been a challenge in genome-wide association studies (GWAS). The current literature has established several single-marker approaches, including genomic control (GC), EIGENSTRAT, and the generalized linear mixed model association test (GMMAT), and multi-marker methods such as the LASSO mixed model (LASSOMM). However, the single-marker methods require prespecifying an arbitrary p-value threshold in the selection process, likely resulting in suboptimal precision or recall. On the other hand, LASSOMM appears to be extremely computationally intensive and may not be suitable for large-scale GWAS. In this paper, we propose a simple multi-marker approach (PCA-LASSO) combining principal component analysis (PCA) and the least absolute shrinkage and selection operator (LASSO). We utilize PCA to correct for the confounding effects of PS and LASSO with built-in cross-validation for a data-driven selection. Compared to the current single-marker approaches, the proposed PCA-LASSO provides an optimal balance between precision and recall, and consequently superior F1 scores. Similarly, compared to LASSOMM, PCA-LASSO markedly increases the precision while minimizing the loss of recall, and therefore improves the overall F1 score in the presence of PS. More importantly, PCA-LASSO reduces the computational time by a factor of more than 1000 when compared to LASSOMM. We applied PCA-LASSO to a real dataset of Alzheimer's disease and successfully identified SNP rs429358 (gene APOE4), which has been widely reported to be associated with the onset and elevated risk of Alzheimer's disease. In conclusion, PCA-LASSO is a simple, fast, and accurate approach for GWAS in the presence of latent PS.
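The precision, recall, and F1 comparisons in this abstract can be sketched for the case where a method returns a set of selected SNPs and the truly associated SNPs are known, as in a simulation. The function name and the SNP identifiers other than rs429358 are hypothetical, and this shows only the evaluation metric, not the PCA-LASSO procedure itself.

```python
def precision_recall_f1(selected, causal):
    """Score a set of selected markers against the known causal markers.

    precision -- share of selected markers that are truly causal
    recall    -- share of causal markers that were selected
    f1        -- harmonic mean of precision and recall"""
    selected, causal = set(selected), set(causal)
    true_positives = len(selected & causal)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(causal) if causal else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical simulation result: four SNPs selected, two truly causal.
picked = {"rs429358", "rs0000001", "rs0000002", "rs0000003"}
truth = {"rs429358", "rs0000001"}
toy_precision, toy_recall, toy_f1 = precision_recall_f1(picked, truth)
```

A loose p-value threshold tends to raise recall at the cost of precision and a strict one does the opposite, which is why the abstract summarizes the trade-off with F1.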


Subjects
Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Genome-Wide Association Study/standards , Alzheimer Disease/genetics , Datasets as Topic , Genomics , Humans , Principal Component Analysis , Time Factors
16.
Stat Methods Med Res ; 30(1): 233-243, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32838650

ABSTRACT

Nonlinear mixed-effects modeling is one of the most popular tools for analyzing repeated measurement data, particularly for applications in the biomedical fields. Multiple integration and nonlinear optimization are the two major challenges for likelihood-based methods in nonlinear mixed-effects modeling. To address these problems, approaches based on empirical Bayesian estimates have been proposed that break the problem into a nonlinear mixed-effects model with no covariates and a linear regression model without random effects. This approach is time-efficient because it involves no covariates in the nonlinear optimization. However, covariate effects based on empirical Bayesian estimates are underestimated, and the bias depends on the extent of shrinkage. A marginal correction method has been proposed to correct the bias caused by shrinkage to some extent. However, the marginal approach appears to be suboptimal when testing covariate effects on multiple model parameters, a situation that is often encountered in real-world data analysis. In addition, the marginal approach cannot correct the inaccuracy in the associated p-values. In this paper, we propose a simultaneous correction method (nSCEBE), which can handle the situation where covariate analysis is performed on multiple model parameters. Simulation studies and real data analysis showed that nSCEBE is accurate and efficient for both effect-size estimation and p-value calculation compared with the existing methods. Importantly, nSCEBE can be >2000 times faster than standard mixed-effects models, potentially allowing its use for high-dimensional covariate analysis of longitudinal or repeatedly measured outcomes.
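
The statement that covariate effects based on empirical Bayesian estimates are underestimated in proportion to shrinkage can be illustrated in the simplest possible setting. The derivation below is a sketch under strong assumptions that are not from the paper: a linear random-intercept model, a balanced design with n observations per subject, a centered covariate g_i, and variance parameters treated as known. It is not the nSCEBE correction itself.

```latex
\begin{align}
  y_{ij} &= \mu + b_i + \varepsilon_{ij}, \qquad
  b_i = \beta g_i + e_i, \qquad
  \varepsilon_{ij} \sim N(0, \sigma^2), \quad j = 1, \dots, n, \\
  \hat b_i &= \lambda\,(\bar y_i - \mu), \qquad
  \lambda = \frac{\tau^2}{\tau^2 + \sigma^2 / n}, \qquad
  \tau^2 = \operatorname{Var}(b_i), \\
  \hat\beta_{\text{naive}}
  &= \frac{\operatorname{Cov}(\hat b_i, g_i)}{\operatorname{Var}(g_i)}
   = \lambda\,\frac{\operatorname{Cov}(b_i + \bar\varepsilon_i,\, g_i)}{\operatorname{Var}(g_i)}
   = \lambda\,\beta .
\end{align}
```

Because 0 < λ < 1, regressing the estimates on the covariate recovers only the fraction λ of the true effect, and the attenuation grows as shrinkage becomes stronger (small τ², large σ², or few observations per subject).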


Subjects
Statistical Models , Nonlinear Dynamics , Algorithms , Bayes Theorem , Computer Simulation , Likelihood Functions
17.
Curr Genomics ; 22(5): 363-372, 2021 Dec 30.
Article in English | MEDLINE | ID: mdl-35283669

ABSTRACT

Background: In genetic association studies with quantitative trait loci (QTL), the association between a candidate genetic marker and the trait of interest is commonly examined by the omnibus F test or by the t-test corresponding to a given genetic model or mode of inheritance. It is known that the t-test with a correct model specification is more powerful than the F test. However, since the underlying genetic model is rarely known in practice, the use of a model-specific t-test may incur substantial power loss. Robust-efficient tests, such as the Maximin Efficiency Robust Test (MERT) and MAX3, have been proposed in the literature. Methods: In this paper, we propose a novel two-step robust-efficient approach, namely the genetic model selection (GMS) method, for quantitative trait analysis. In the first step, GMS selects a genetic model by testing Hardy-Weinberg disequilibrium (HWD) with extremal samples of the population; in the second step, it applies the corresponding genetic model-specific t-test. Results: Simulations show that GMS is not only more efficient than MERT and MAX3, but also has power comparable to that of the optimal t-test when the genetic model is known. Conclusion: Application to data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort demonstrates that the proposed approach can identify biologically meaningful SNPs on chromosome 19.
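
The first step of GMS relies on testing Hardy-Weinberg disequilibrium. As a minimal sketch, the standard HWD coefficient and its one-degree-of-freedom chi-square statistic can be computed from genotype counts as below. The function names and the counts are hypothetical. The abstract does not state how the extremal samples are defined or how the test result maps to a genetic model, so neither is reproduced here.

```python
def hwd_coefficient(n_aa, n_ab, n_bb):
    """HWD coefficient D = P(AA) - p_A^2 from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    p_a = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return n_aa / n - p_a * p_a


def hwe_chi_square(n_aa, n_ab, n_bb):
    """1-df chi-square statistic for Hardy-Weinberg equilibrium.

    Uses the identity X^2 = n * D^2 / (p^2 * (1 - p)^2); requires 0 < p < 1.
    """
    n = n_aa + n_ab + n_bb
    p_a = (2 * n_aa + n_ab) / (2 * n)
    d = hwd_coefficient(n_aa, n_ab, n_bb)
    return n * d * d / (p_a * p_a * (1 - p_a) ** 2)


# Hypothetical counts: a sample in equilibrium and one with excess homozygotes.
print(hwd_coefficient(25, 50, 25), hwe_chi_square(25, 50, 25))
print(hwd_coefficient(40, 20, 40), hwe_chi_square(40, 20, 40))
```

In practice these functions would be applied to the genotype counts of the subjects in the tails of the trait distribution rather than to the whole sample.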

18.
Brief Bioinform ; 22(3), 2021 May 20.
Article in English | MEDLINE | ID: mdl-32634825

ABSTRACT

Genome-wide association studies (GWAS) using longitudinal phenotypes collected over time are appealing because of the improved power. However, the computational burden has been a challenge because of the complex algorithms needed to model longitudinal data. Approximation methods based on empirical Bayesian estimates (EBEs) from mixed-effects modeling have been developed to expedite the analysis. However, our analysis demonstrated that bias in both the association test and the estimation remains an issue for the existing EBE-based methods. We propose an extremely fast and unbiased method (simultaneous correction for EBE, SCEBE) that corrects the bias in the naive EBE approach and provides unbiased P-values and estimates of effect size. Through application to Alzheimer's Disease Neuroimaging Initiative data with 6 414 695 single nucleotide polymorphisms, we demonstrated that SCEBE can efficiently perform large-scale GWAS with longitudinal outcomes, providing a nearly 10 000-fold improvement in computational efficiency and shortening the computation time from months to minutes. The SCEBE package and the example datasets are available at https://github.com/Myuan2019/SCEBE.
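
The speed of EBE-based methods comes from fitting the mixed-effects model once, without any SNP, and then scanning the SNPs with a cheap closed-form regression of the per-subject estimates on each genotype. The sketch below illustrates only that naive second stage on toy data. All names and values are hypothetical, and the simultaneous bias correction that defines SCEBE is not reproduced; the SCEBE package linked in the abstract is the authoritative implementation.

```python
def ols_slope(x, y):
    """Closed-form least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return None  # monomorphic SNP: slope is undefined
    return sxy / sxx


# Hypothetical per-subject EBEs (e.g. random slopes from a base model fitted once).
ebes = [-1.0, 0.0, 1.0, 2.0]

# Hypothetical genotypes coded as minor-allele counts, one list per SNP.
genotypes = {
    "snp1": [0, 1, 1, 2],
    "snp2": [2, 1, 1, 0],
    "snp3": [0, 0, 1, 1],
    "snp4": [1, 1, 1, 1],
}

# Naive second stage: one cheap regression per SNP, no model refitting.
naive_effects = {snp: ols_slope(g, ebes) for snp, g in genotypes.items()}
print(naive_effects)
```

Each SNP costs a single pass over the subjects, which is why a scan of millions of SNPs becomes feasible; the slopes from this naive stage are the biased quantities that the abstract says SCEBE corrects.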


Subjects
Algorithms , Alzheimer Disease/genetics , Single Nucleotide Polymorphism , Software , Genome-Wide Association Study , Humans
19.
J Hum Genet ; 66(5): 509-518, 2021 May.
Article in English | MEDLINE | ID: mdl-33177701

ABSTRACT

Mutual exclusivity analyses provide an effective tool for distinguishing driver genes from passenger genes in cancer studies. Various algorithms have been developed for the detection of mutual exclusivity, but controlling false positives and improving accuracy remain challenging. In this paper, we propose a forward selection algorithm for the identification of mutually exclusive gene sets (FSME). The method consists of an initial search for a seed pair of mutually exclusive (ME) genes, followed by the stepwise inclusion of more genes in the current ME set. Simulations demonstrated that, compared to recently published approaches (i.e., CoMEt, WExT, and MEGSA), FSME could provide a higher precision or recall rate in identifying ME gene sets and had superior control of false positive rates. Through application to real TCGA data sets for AML, BRCA, and GBM, we confirmed that FSME can be used to discover cancer driver genes.
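
The two-stage search described in the abstract (find a seed pair, then grow the set one gene at a time) can be sketched as a greedy forward selection. The abstract does not give the criterion FSME uses to score a gene set, so this sketch substitutes a Dendrix-style weight, 2 × (samples covered) − (total mutations), which rewards coverage and penalizes overlap. The function names, the stopping rule, and the toy data are all assumptions.

```python
from itertools import combinations


def me_weight(genes, mutated):
    """Dendrix-style weight: 2 * |samples covered| - total mutation count.

    It equals the coverage when no sample is mutated in two genes of the set.
    """
    covered = set().union(*(mutated[g] for g in genes))
    total = sum(len(mutated[g]) for g in genes)
    return 2 * len(covered) - total


def forward_select(mutated, max_size=4):
    """Greedy forward selection of an approximately mutually exclusive set."""
    names = sorted(mutated)
    # Step 1: seed with the best-scoring pair of genes.
    seed = max(combinations(names, 2), key=lambda pair: me_weight(pair, mutated))
    current = list(seed)
    # Step 2: add the gene that improves the weight most; stop when none does.
    while len(current) < max_size:
        candidates = [g for g in names if g not in current]
        if not candidates:
            break
        best = max(candidates, key=lambda g: me_weight(current + [g], mutated))
        if me_weight(current + [best], mutated) <= me_weight(current, mutated):
            break
        current.append(best)
    return current


# Hypothetical toy data: gene -> set of sample IDs carrying a mutation.
toy_mutations = {
    "A": {1, 2, 3},
    "B": {4, 5, 6},
    "C": {7, 8},
    "D": {1, 2, 4, 5},  # overlaps heavily with A and B
}
selected_set = forward_select(toy_mutations)
print(selected_set)
```

A statistical method would replace the weight comparison in step 2 with a significance test, which is where control of the false positive rate comes from.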


Subjects
Algorithms , Computational Biology/methods , Neoplastic Gene Expression Regulation , Neoplasms/genetics , Carcinogenesis/genetics , False Positive Reactions , Humans , Markov Chains , Monte Carlo Method , Mutagenesis/genetics , Oncogenes
20.
Comput Biol Chem ; 88: 107320, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32711355

ABSTRACT

Family-based multi-locus tests integrate information from individual loci by weighted averaging of the marginal statistics, and have been shown to be more efficient and robust than single-locus tests in genetic association studies. Their power depends on how much information the weights can extract from the data. The currently published weighted-sum methods are applicable only to either common or rare variants and may suffer from substantial power loss, especially for rare variants. In this paper, we propose a novel data-driven weight to improve power for both common and rare variants. We use the L1 regularization of Least Absolute Shrinkage and Selection Operator (LASSO) regression to construct the weight, which simultaneously serves as an adaptive marker selection process. Simulations for a dichotomous phenotype demonstrated that our LASSO-based approach outperformed the existing multi-locus methods, providing the highest statistical power while keeping the type I error rate well controlled under different scenarios. We also applied our methods to a real dataset for rheumatoid arthritis (GAW15 Problem 2). Two groups of alleles, in which individual SNPs had only modest and non-significant effects, were detected (P < 0.00001) using our proposed methods, whereas traditional multi-locus methods failed to identify them. In conclusion, the novel LASSO-based approach represents a superior weight-choosing strategy for multi-locus tests.
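
The idea that L1-regularized weights both weight and select markers can be illustrated with soft-thresholding, which is the closed-form LASSO solution under an orthonormal design. In the sketch below, small marginal statistics receive a weight of exactly zero and drop out of the weighted sum. The paper's actual weight construction and test statistic are not given in the abstract, so the function names, the unnormalized weighted sum, and the numbers are assumptions.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_weighted_sum(marginal_stats, lam):
    """Weighted sum of marginal statistics with soft-thresholded weights.

    Returns (weights, statistic). Loci whose |statistic| <= lam get weight 0,
    so the weighting step doubles as marker selection.
    """
    weights = [soft_threshold(z, lam) for z in marginal_stats]
    statistic = sum(w * z for w, z in zip(weights, marginal_stats))
    return weights, statistic


# Hypothetical marginal statistics for four loci and a penalty of 1.0.
marginal = [2.5, -0.5, 1.5, -3.0]
weights, statistic = lasso_weighted_sum(marginal, lam=1.0)
print(weights, statistic)
```

In a real analysis the penalty would be chosen from the data (for example by cross-validation) and the significance of the statistic would be assessed with a procedure that respects the family structure, such as permutation.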


Subjects
Rheumatoid Arthritis/genetics , Logistic Models , Alleles , Humans , Phenotype , Single Nucleotide Polymorphism/genetics