Search | VHL Regional Portal (test)

Deep unsupervised feature selection by discarding nuisance and correlated features.

Shaham, Uri; Lindenbaum, Ofir; Svirsky, Jonathan; Kluger, Yuval.

Neural Netw ; 152: 34-43, 2022 Aug.

Article in English | MEDLINE | ID: mdl-35500458

ABSTRACT

Modern datasets often contain large subsets of correlated features and nuisance features, which are unrelated or only loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the leading eigenvectors of the graph Laplacian. We demonstrate that in the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set. To do this, we propose a fully differentiable approach for unsupervised feature selection, utilizing the Laplacian score criterion to avoid the selection of nuisance features. We employ an autoencoder architecture to cope with correlated features, trained to reconstruct the data from the subset of selected features. We build on the recently proposed concrete layer, which allows controlling the number of selected features via architectural design and thereby simplifies the optimization process. Experimenting on several real-world datasets, we demonstrate that our proposed approach outperforms similar approaches designed to avoid only correlated or nuisance features, but not both. Several state-of-the-art clustering results are reported. Our code is publicly available at https://github.com/jsvir/lscae.

Subjects

Cluster Analysis
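
The Laplacian score mentioned in the abstract above can be illustrated with a small sketch. This is the classic, non-differentiable Laplacian score, not the authors' method from https://github.com/jsvir/lscae. The function name `laplacian_scores`, the dense Gaussian affinity graph, and the bandwidth `sigma` are assumptions made for illustration. Lower scores mark features that vary smoothly over the sample graph.

```python
import math


def laplacian_scores(X, sigma=1.0):
    """Classic Laplacian score per feature (hypothetical helper, lower is better).

    X is a list of samples, each a list of feature values. The graph is a
    dense Gaussian affinity graph over all samples, which is an assumption.
    """
    n = len(X)
    d = len(X[0])
    # Affinity matrix W with zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-sq / (2.0 * sigma ** 2))
    deg = [sum(row) for row in W]  # diagonal of the degree matrix D
    total_deg = sum(deg)
    scores = []
    for r in range(d):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean so constant features score poorly.
        mean = sum(fi * di for fi, di in zip(f, deg)) / total_deg
        g = [fi - mean for fi in f]
        # g^T L g with L = D - W equals 1/2 * sum_ij W_ij (g_i - g_j)^2.
        num = 0.5 * sum(
            W[i][j] * (g[i] - g[j]) ** 2 for i in range(n) for j in range(n)
        )
        den = sum(di * gi ** 2 for di, gi in zip(deg, g))
        scores.append(num / den if den > 0 else float("inf"))
    return scores
```

The sketch builds the graph from every column it is given. The abstract argues that, when nuisance features are numerous, the graph should instead be built from the selected features only, so a caller would pass just those columns when constructing `X`.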

DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network.

Katzman, Jared L; Shaham, Uri; Cloninger, Alexander; Bates, Jonathan; Jiang, Tingting; Kluger, Yuval.

BMC Med Res Methodol ; 18(1): 24, 2018 Feb 26.

Article in English | MEDLINE | ID: mdl-29482517

ABSTRACT

BACKGROUND: Medical practitioners use survival models to explore and understand the relationships between patients' covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems. METHODS: We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient's covariates and treatment effectiveness in order to provide personalized treatment recommendations. RESULTS: We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient's covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient's features and effectiveness of different treatment options to show how DeepSurv can be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients. CONCLUSIONS: The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient's characteristics on their risk of failure.

Subjects

Algorithms , Neural Networks, Computer , Outcome Assessment, Health Care/methods , Proportional Hazards Models , Humans , Kaplan-Meier Estimate , Outcome Assessment, Health Care/statistics & numerical data , Precision Medicine/methods
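
A Cox proportional hazards network, as described in the abstract above, is typically trained by minimizing the negative Cox partial log-likelihood of the risk scores it outputs. The sketch below shows only that loss, under stated assumptions: the function name `neg_log_partial_likelihood` is hypothetical, tied event times are handled naively, and the risk scores are plain numbers standing in for the output of some model. It is not taken from the DeepSurv code.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative Cox partial log-likelihood (hypothetical helper).

    risk_scores: predicted log-risk per subject (higher means earlier failure).
    times: observed follow-up time per subject.
    events: 1 if the event was observed, 0 if the subject was censored.
    """
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: every subject still under observation at time t_i.
        denominator = sum(
            math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i
        )
        total += risk_scores[i] - math.log(denominator)
    return -total / n_events
```

Scores that rank subjects in order of their observed failure times give a lower loss than scores that reverse the ordering, which is the behavior a gradient-based trainer would exploit.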

Gating mass cytometry data by deep learning.

Li, Huamin; Shaham, Uri; Stanton, Kelly P; Yao, Yi; Montgomery, Ruth R; Kluger, Yuval.

Bioinformatics ; 33(21): 3423-3430, 2017 Nov 01.

Article in English | MEDLINE | ID: mdl-29036374

ABSTRACT

MOTIVATION: Mass cytometry or CyTOF is an emerging technology for high-dimensional multiparameter single cell analysis that overcomes many limitations of fluorescence-based flow cytometry. New methods for analyzing CyTOF data attempt to improve automation, scalability, performance and interpretation of data generated in large studies. Assigning individual cells into discrete groups of cell types (gating) involves time-consuming sequential manual steps, untenable for larger studies. RESULTS: We introduce DeepCyTOF, a standardization approach for gating, based on deep learning techniques. DeepCyTOF requires labeled cells from only a single sample. It is based on domain adaptation principles and is a generalization of previous work that allows us to calibrate between a target distribution and a source distribution in an unsupervised manner. We show that DeepCyTOF is highly concordant (98%) with cell classification obtained by individual manual gating of each sample when applied to a collection of 16 biological replicates of primary immune blood cells, even when measured across several instruments. Further, DeepCyTOF achieves very high accuracy on the semi-automated gating challenge of the FlowCAP-I competition as well as two CyTOF datasets generated from primary immune blood cells: (i) 14 subjects with a history of infection with West Nile virus (WNV), (ii) 34 healthy subjects of different ages. We conclude that deep learning in general, and DeepCyTOF specifically, offers a powerful computational approach for semi-automated gating of CyTOF and flow cytometry data. AVAILABILITY AND IMPLEMENTATION: Our codes and data are publicly available at https://github.com/KlugerLab/deepcytof.git. CONTACT: yuval.kluger@yale.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , Flow Cytometry/standards , Machine Learning , Single-Cell Analysis/standards , Blood Cells/classification , Calibration/standards , Cell Separation/standards , Humans , Reference Standards , Reproducibility of Results

Removal of batch effects using distribution-matching residual networks.

Shaham, Uri; Stanton, Kelly P; Zhao, Jun; Li, Huamin; Raddassi, Khadir; Montgomery, Ruth; Kluger, Yuval.

Bioinformatics ; 33(16): 2539-2546, 2017 Aug 15.

Article in English | MEDLINE | ID: mdl-28419223

ABSTRACT

MOTIVATION: Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq (scRNA-seq), are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated. RESULTS: We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and scRNA-seq datasets, and demonstrate that it effectively attenuates batch effects. AVAILABILITY AND IMPLEMENTATION: Our codes and data are publicly available at https://github.com/ushaham/BatchEffectRemoval.git. CONTACT: yuval.kluger@yale.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , Data Accuracy , Machine Learning , Statistics as Topic , Cytophotometry/methods , Humans , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
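
The Maximum Mean Discrepancy (MMD) named in the abstract above measures how far apart two samples are in distribution. The sketch below computes a biased estimate of the squared MMD. The Gaussian kernel, the single bandwidth `sigma`, and the function names are assumptions for illustration. The sketch covers only the statistic, not the residual network that the authors train to minimize it.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def mmd_squared(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X and Y (hypothetical helper)."""

    def mean_kernel(A, B):
        # Average kernel value over all pairs (a, b), including a == b.
        return sum(gaussian_kernel(a, b, sigma) for a in A for b in B) / (
            len(A) * len(B)
        )

    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)
```

The estimate is zero when both samples are identical and grows as one sample is shifted away from the other, so it can serve as a training loss when one batch is mapped toward another.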
