Results 1 - 7 of 7
1.
Big Data ; 10(4): 298-312, 2022 08.
Article in English | MEDLINE | ID: mdl-35475707

ABSTRACT

Advertising platforms have a growing need to improve prediction quality, as missed ad opportunities can hurt their performance. To that end, prediction tasks such as conversion prediction need to be continuously advanced through the inclusion of data from new sources or through algorithmic development that tackles existing challenges. New data sources naturally introduce unwanted noise, while some areas of modeling, such as the temporal information of events in sequences, remain underexplored. In this study, we propose extensions for modeling online user activity trails that address two very important aspects of activities (time and noise) through dedicated layers that can be used in existing deep sequence-learning approaches. Our proposed method improved the area under the receiver operating characteristic curve by up to 3% and 1.75% compared with production and best baseline approaches, respectively, across two major advertiser data sets and several predictive tasks.


Subjects
Machine Learning
2.
J Biomed Inform ; 105: 103409, 2020 05.
Article in English | MEDLINE | ID: mdl-32304869

ABSTRACT

The accurate prediction of progression of Chronic Kidney Disease (CKD) to End Stage Renal Disease (ESRD) is of great importance to clinicians and a challenge to researchers, as there are many causes and even more comorbidities that are ignored by traditional prediction models. We examine whether a novel low-dimensional embedding model, disease2disease (D2D), learned from large-scale electronic health records (EHRs), can cluster the causes of kidney diseases and comorbidities well and further improve prediction of progression of CKD to ESRD compared to traditional risk factors. The study cohort consists of 2,507 hospitalized Stage 3 CKD patients, of whom 1,375 (54.8%) progressed to ESRD within 3 years. We evaluated the proposed unsupervised learning framework by applying a regularized logistic regression model and a Cox proportional hazards model, respectively, and compared the accuracies with those obtained by four alternative models. The results demonstrate that the low-dimensional disease representations learned from EHRs can capture the relationships among a vast array of diseases and can outperform traditional risk factors in a CKD progression prediction model. These results can be used both by clinicians in patient care and by researchers to develop new prediction methods.


Subjects
Chronic Kidney Failure , Chronic Renal Insufficiency , Disease Progression , Glomerular Filtration Rate , Humans , Chronic Kidney Failure/diagnosis , Chronic Kidney Failure/epidemiology , Chronic Renal Insufficiency/diagnosis , Chronic Renal Insufficiency/epidemiology , Risk Factors
3.
J Am Med Inform Assoc ; 26(11): 1195-1202, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31188432

ABSTRACT

OBJECTIVE: Clinical trials, prospective research studies on human participants carried out by a distributed team of clinical investigators, play a crucial role in the development of new treatments in health care. This is a complex and expensive process in which investigators aim to enroll volunteers with predetermined characteristics, administer treatment(s), and collect safety and efficacy data. Therefore, choosing top-enrolling investigators is essential for efficient clinical trial execution and is one of the primary drivers of drug development cost. MATERIALS AND METHODS: To facilitate clinical trial optimization, we propose DeepMatch (DM), a novel approach that builds on advances in deep learning. DM is designed to learn from both investigator and trial-related heterogeneous data sources and to rank investigators based on their expected enrollment performance on new clinical trials. RESULTS: Large-scale evaluation conducted on 2618 studies provides evidence that the proposed ranking-based framework improves on the current state of the art by up to 19% on ranking investigators and up to 10% on detecting top/bottom performers when recruiting investigators for new clinical trials. DISCUSSION: The extensive experimental section suggests that DM can provide substantial improvement over current industry standards in several regards: (1) the enrollment potential of the investigator list, (2) the time it takes to generate the list, and (3) data-informed decisions about new investigators. CONCLUSION: Given the significance of the problem at hand, related research efforts are set to shift the paradigm of how investigators are chosen for clinical trials, thereby optimizing and automating the selection and reducing the cost of new therapies.


Subjects
Clinical Trials as Topic/methods , Data Mining/methods , Deep Learning , Patient Selection , Research Personnel , Factual Databases , Electronic Health Records , Humans , Insurance Claim Reporting
4.
J Biomed Inform ; 93: 103161, 2019 05.
Article in English | MEDLINE | ID: mdl-30940598

ABSTRACT

INTRODUCTION: The objective of this study is to improve the understanding of the spatial spread of complicated influenza cases that required hospitalization by creating heatmaps and social networks. These make it possible to identify critical hubs and routes of influenza spread in specific geographic locations, in order to contain infections and prevent complications that require hospitalization. MATERIAL AND METHODS: Data were downloaded from the Healthcare Cost and Utilization Project (HCUP) - SID, New York State database. Patients hospitalized with flu complications between 2003 and 2012 were included in the research (30,380 cases). A novel approach was designed that constructs heatmaps for specific geographic regions in New York State and power-law networks in order to analyze the distribution of hospitalized flu cases. RESULTS: Heatmaps revealed that the distribution of patients follows urban areas and major roads, indicating that flu spreads along the routes that people use to travel. A scale-free network created from correlations among zip codes revealed that the most populated zip codes did not have the largest number of patients with flu complications. Among the five most affected zip codes, four were in the Bronx. Demographics of the most affected zip codes are presented in the results. Numbers of cases normalized by population revealed that none of the Bronx zip codes were in the top 20. All zip codes with the highest node degrees were in the New York City area. DISCUSSION: Heatmaps identified the geographic distribution of hospitalized flu patients, and network analysis identified hubs of the infection. Our results will enable better estimation of the resources needed for prevention and treatment of hospitalized patients with complications of influenza. CONCLUSION: Analyses of the geographic distribution of hospitalized patients with influenza and of the demographic characteristics of populations support better planning and management of resources for influenza patients who require hospitalization. The results obtained could potentially help to save many lives and improve the health of the population.


Subjects
Human Influenza/epidemiology , Social Networking , Hospitalization , Humans , New York/epidemiology , Travel
5.
Article in English | MEDLINE | ID: mdl-27429443

ABSTRACT

Increased availability of Electronic Health Record (EHR) data provides unique opportunities for improving the quality of health services. In this study, we couple EHRs with advanced machine learning tools to predict three important parameters of healthcare quality. More specifically, we describe how to learn low-dimensional vector representations of patient conditions and clinical procedures in an unsupervised manner, and how to generate feature vectors of hospitalized patients that are useful for predicting their length of stay, total incurred charges, and mortality rates. To learn the vector representations, we propose to employ state-of-the-art language models specifically designed for modeling the co-occurrence of diseases and applied clinical procedures. The proposed model is trained on a large-scale EHR database comprising more than 35 million hospitalizations in California over a period of nine years. We compared the proposed approach to several alternatives and evaluated their effectiveness by measuring the accuracy of the regression and classification models used for the three predictive tasks considered in this study. Our model outperformed the baseline models on all tasks, indicating strong potential of the proposed approach for advancing the quality of the healthcare system.


Subjects
Data Mining/methods , Electronic Health Records/classification , Medical Informatics/methods , Theoretical Models , Health Care Quality Indicators , Hospital Costs , Humans , Machine Learning , Natural Language Processing , Regression Analysis
6.
Sci Rep ; 6: 32404, 2016 08 31.
Article in English | MEDLINE | ID: mdl-27578529

ABSTRACT

Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently drawn benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data are used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements in disease phenotyping over current computational approaches. In addition, the proposed methodology is extended to discover novel disease-gene associations by including valuable domain knowledge from genome-wide association studies. To evaluate the approach, its effectiveness is compared against a held-out set, where it again yielded very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest, reducing costly lab studies.


Subjects
Electronic Health Records , Inborn Genetic Diseases/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Algorithms , Factual Databases , Humans , Phenotype
7.
Methods ; 111: 45-55, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27477211

ABSTRACT

Data-driven phenotype discoveries on Electronic Health Record (EHR) data have recently drawn benefits across many aspects of clinical practice. In the method described in this paper, we map a very large EHR database containing more than a million inpatient cases into a low-dimensional space where diseases with similar phenotypes have similar representations. This embedding allows for an effective segmentation of diseases into more homogeneous categories, an important task in discovering disease types for precision medicine. In particular, many diseases are heterogeneous in nature. For instance, sepsis, a systemic and progressive inflammation, can be caused by many factors and can have multiple manifestations in different human organs. Understanding such heterogeneity can help in addressing many important issues regarding sepsis, including early diagnosis and treatment, which is of great importance because sepsis is one of the main causes of in-hospital deaths in the United States. This study analyzes state-of-the-art embedding models that have had great success in various fields, applying them to disease embedding from EHR databases. Particular attention is given to learning multi-type representations of heterogeneous diseases, which lead to more homogeneous groups. Our results provide evidence that such representations yield phenotypes of higher quality and also provide a benefit when predicting mortality of inpatient visits.


Subjects
Factual Databases , Medical Informatics/methods , Precision Medicine , Sepsis/epidemiology , Algorithms , Electronic Health Records , Humans , Inpatients , Sepsis/physiopathology