Results 1 - 9 of 9
1.
J Biomed Inform; 95: 103208, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31078660

ABSTRACT

The Research Electronic Data Capture (REDCap) data management platform was developed in 2004 to address an institutional need at Vanderbilt University, then shared with a limited number of adopting sites beginning in 2006. Given bi-directional benefit in early sharing experiments, we created a broader consortium sharing and support model for any academic, non-profit, or government partner wishing to adopt the software. Our sharing framework and consortium-based support model have evolved over time along with the size of the consortium (currently more than 3200 REDCap partners across 128 countries). While the "REDCap Consortium" model represents only one example of how to build and disseminate a software platform, lessons learned from our approach may assist other research institutions seeking to build and disseminate innovative technologies.


Subjects
Biomedical Research/organization & administration, Medical Informatics/organization & administration, Software, Humans, Information Dissemination, Internationality
2.
BMC Infect Dis; 16(1): 684, 2016 Nov 17.
Article in English | MEDLINE | ID: mdl-27855652

ABSTRACT

BACKGROUND: Community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we developed and validated an electronic health record (EHR) based CA-MRSA phenotype algorithm utilizing both structured and unstructured data. METHODS: The algorithm was validated at three eMERGE consortium sites, and positive predictive value (PPV), negative predictive value, and sensitivity were calculated. The algorithm was then run and data collected across a total of seven sites. The resulting data were used in a genome-wide association study (GWAS). RESULTS: Across the seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100% for cases and 96 to 100% for controls; sensitivity ranged from 94 to 100% for cases and 75 to 100% for controls. The frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p < 5 × 10⁻⁸) findings. CONCLUSIONS: Differences in EHR data representation and screening patterns across sites may have affected the identification of cases and controls and accounted for the varying frequencies across sites. Future work identifying these patterns is necessary.
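
For readers unfamiliar with the validation metrics reported here, a minimal sketch follows showing how PPV, negative predictive value, and sensitivity are computed from chart-review counts at one site. The counts are placeholders, not the study's actual validation numbers.

```python
# Standard 2x2 confusion-matrix metrics for a phenotype algorithm,
# computed from hypothetical chart-review counts (not the study's data).

def validation_metrics(tp, fp, tn, fn):
    """tp/fp: algorithm-flagged cases that were/weren't true cases;
    tn/fn: algorithm-flagged controls that were/weren't true controls."""
    return {
        "PPV": tp / (tp + fp),           # flagged cases that are truly cases
        "NPV": tn / (tn + fn),           # flagged controls that are truly controls
        "sensitivity": tp / (tp + fn),   # true cases the algorithm caught
        "specificity": tn / (tn + fp),   # true controls the algorithm caught
    }

# Hypothetical chart-review results for one eMERGE site:
print(validation_metrics(tp=47, fp=3, tn=96, fn=2))
```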


Subjects
Algorithms, Electronic Health Records, Genome-Wide Association Study/methods, Methicillin-Resistant Staphylococcus aureus, Phenotype, Staphylococcal Infections/diagnosis, Adult, Case-Control Studies, Community-Acquired Infections/diagnosis, Community-Acquired Infections/genetics, Female, Genetic Predisposition to Disease, Humans, Male, Risk Factors, Sensitivity and Specificity, Staphylococcal Infections/genetics, United States
3.
Appl Clin Inform; 7(3): 693-706, 2016 Jul 20.
Article in English | MEDLINE | ID: mdl-27452794

ABSTRACT

OBJECTIVE: The objective of this study is to develop an algorithm that accurately identifies children with severe early-onset childhood obesity (ages 1-5.99 years) using structured and unstructured data from the electronic health record (EHR). INTRODUCTION: Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high-precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and to validating rare monogenic variants causing severe early-onset obesity. DATA AND METHODS: Rule-based and machine-learning-based algorithms were developed using structured and unstructured data from two EHR databases, from Boston Children's Hospital (BCH) and Cincinnati Children's Hospital Medical Center (CCHMC). Exclusion criteria, including medications and comorbid diagnoses, were defined. Machine learning algorithms were developed using cross-site training and testing, in addition to experimenting with natural language processing features. RESULTS: Precision was emphasized to obtain a high-fidelity cohort. The rule-based algorithm performed best overall, with precision of 0.895 (CCHMC) and 0.770 (BCH). The best feature set for machine learning employed Unified Medical Language System (UMLS) concept unique identifiers (CUIs), ICD-9 codes, and RxNorm codes. CONCLUSIONS: Detecting severe early childhood obesity is essential given the intervention potential in children at the highest long-term risk of developing obesity-related comorbidities, and excluding patients with underlying pathological and non-syndromic causes of obesity assists in developing a high-precision cohort for genetic study. Further, such phenotyping efforts inform future practical applications in health care environments utilizing clinical decision support.
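
As an illustration of the rule-based approach over structured EHR data, here is a minimal sketch of a case filter. The age window matches the abstract; the BMI severity cutoff, patient schema, and exclusion code sets are illustrative assumptions, not the study's actual criteria.

```python
# A hypothetical rule-based filter for severe early-onset obesity cases.
# Code sets and thresholds below are assumptions for illustration only.

EXCLUSION_DX = {"253.8", "759.81"}              # e.g., pituitary disorder, Prader-Willi (ICD-9, assumed)
EXCLUSION_MEDS = {"olanzapine", "prednisone"}   # weight-promoting medications (assumed)

def is_severe_early_onset_case(patient):
    """patient: dict with age_years, bmi_pct95_ratio, dx_codes, meds (assumed schema)."""
    if not (1.0 <= patient["age_years"] < 6.0):          # age window from the abstract
        return False
    # One common severity threshold: BMI >= 120% of the CDC 95th percentile (assumed here).
    if patient["bmi_pct95_ratio"] < 1.2:
        return False
    if patient["dx_codes"] & EXCLUSION_DX:               # pathological causes excluded
        return False
    if patient["meds"] & EXCLUSION_MEDS:                 # medication-induced weight gain excluded
        return False
    return True

example = {"age_years": 3.5, "bmi_pct95_ratio": 1.3,
           "dx_codes": set(), "meds": set()}
print(is_severe_early_onset_case(example))   # True
```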


Subjects
Machine Learning, Pediatric Obesity/diagnosis, Tertiary Healthcare, Child, Child, Preschool, Comorbidity, Early Diagnosis, Female, Humans, Infant, Male, Pediatric Obesity/epidemiology
4.
J Am Med Inform Assoc; 23(6): 1046-1052, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27026615

ABSTRACT

OBJECTIVE: Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems. MATERIALS AND METHODS: We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequently used components in algorithms and their performance at authoring institutions and secondary implementation sites. RESULTS: As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development, spanning a range of traits and diseases. Phenotypes had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Diseases codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%). DISCUSSION: These results demonstrate that a broad range of algorithms for mining electronic health record data from different health systems can be developed with high PPV, and that algorithms developed at one site are generally transportable to others. CONCLUSION: By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.


Subjects
Algorithms, Knowledge Bases, Phenotype, Data Mining/methods, Electronic Health Records, Genomics, Humans, International Classification of Diseases, Natural Language Processing
5.
J Biomed Inform; 61: 97-109, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27020263

ABSTRACT

OBJECTIVE: Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as supporting translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form, and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible in finding and removing potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, this assumption may not hold, and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, that enables an HCO to optimize its investments given the expected capabilities of an adversarial recipient. METHODS: We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize its investment in the de-identification process. We evaluate this model by assessing the overall payoffs to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, the monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). RESULTS: Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the HCO's de-identification are 0.86 and 0.80, respectively. A game-based approach enables a more refined cost-benefit tradeoff, improving both privacy and utility for the HCO. For example, our investigation shows that an HCO can release the data without spending its entire de-identification budget and still deter the attacker, with a de-identification precision of 0.77 and recall of 0.61. There also exist scenarios in which the model indicates an HCO should not release any data because the risk is too great. In addition, we find that the practice of paying fines back to an HCO (an artifact of suing for breach of contract), as opposed to a third party such as a federal regulator, can induce an elevated level of data sharing risk, because the HCO is incentivized to bait the attacker to elicit compensation. CONCLUSIONS: A game-theoretic framework can guide HCOs toward optimized decision making in natural language de-identification investments before sharing EMR data.
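
To make the leader-follower structure concrete, here is a minimal sketch of a Stackelberg de-identification game: the HCO (leader) commits to a spend level, the attacker (follower) best-responds, and the HCO searches its strategy space for the spend that maximizes its payoff. All numbers and functional forms are illustrative assumptions, not the paper's actual parameterization.

```python
# A toy Stackelberg game for de-identification investment (assumed parameters).
import numpy as np

BUDGET = 2000.0        # defender's de-identification budget (from the abstract)
RECORD_VALUE = 5.0     # attacker's gain per re-identified record (assumed)
ATTACK_COST = 400.0    # attacker's fixed cost to mount an attack (assumed)
PENALTY = 1000.0       # fine if the attacker is caught (assumed)
P_CAUGHT = 0.5         # probability of being caught (assumed)
N_RECORDS = 2100       # number of shared clinical notes (from the abstract)

def recall(spend):
    """Assumed diminishing-returns curve: more spend -> fewer leaked identifiers."""
    return 1.0 - np.exp(-3.0 * spend / BUDGET)

def data_utility(spend):
    """Assumed utility lost as more (possibly non-identifying) terms are removed."""
    return 1.0 - 0.3 * recall(spend)

def attacker_payoff(spend):
    leaked = N_RECORDS * (1.0 - recall(spend))
    return leaked * RECORD_VALUE - ATTACK_COST - P_CAUGHT * PENALTY

def defender_payoff(spend):
    attacked = attacker_payoff(spend) > 0          # follower's best response
    privacy_loss = N_RECORDS * (1.0 - recall(spend)) if attacked else 0.0
    return N_RECORDS * data_utility(spend) - spend - privacy_loss

# Leader commits first: search spend levels for the best defender payoff.
spends = np.linspace(0, BUDGET, 201)
best = max(spends, key=defender_payoff)
print(f"optimal spend: ${best:.0f}, recall: {recall(best):.2f}, "
      f"attack deterred: {attacker_payoff(best) <= 0}")
```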


Subjects
Confidentiality, Electronic Health Records, Natural Language Processing, Humans, Language, Risk
6.
Article in English | MEDLINE | ID: mdl-26742152

ABSTRACT

OBJECTIVE: The goal of this study is to devise a machine learning framework that assists care coordination programs in prognostic stratification, so they can design and deliver personalized care plans and allocate financial and medical resources effectively. MATERIALS AND METHODS: This study is based on a de-identified cohort of 2,521 hypertension patients from a chronic care coordination program at the Vanderbilt University Medical Center. Patients were modeled as vectors of features derived from electronic health records (EHRs) over a six-year period. We applied stepwise regression to identify risk factors associated with a decrease in mean arterial pressure of at least 2 mmHg after program enrollment. The resulting features were subsequently validated via a logistic regression classifier. Finally, the risk factors were used to group the patients through model-based clustering. RESULTS: We identified a set of predictive features that consisted of a mix of demographic, medication, and diagnostic concepts. Logistic regression over these features yielded an area under the ROC curve (AUC) of 0.71 (95% CI: [0.67, 0.76]). Based on these features, four clinically meaningful groups were identified through clustering: two represented patients with more severe disease profiles, while the remaining two represented patients with milder disease profiles. DISCUSSION: Patients with hypertension can exhibit significant variation in their blood pressure control status and responsiveness to therapy. Yet this work shows that a clustering analysis can generate more homogeneous patient groups, which may aid clinicians in designing and implementing customized care programs. CONCLUSION: The study shows that predictive modeling and clustering using EHR data can provide a systematic, generalized approach for care providers to tailor their management approach based upon patient-level factors.
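
A minimal sketch of the two-stage pipeline described above follows: validating selected risk factors with a logistic regression classifier, then applying model-based clustering (here a Gaussian mixture with BIC model selection) to form patient groups. The data are simulated stand-ins, not the study's cohort, and the feature count is arbitrary.

```python
# Sketch: logistic regression validation + model-based clustering (assumed data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2521, 12))       # EHR-derived feature vectors (simulated)
y = rng.integers(0, 2, size=2521)     # 1 = MAP dropped >= 2 mmHg after enrollment

# Validate the selected risk factors with a logistic regression classifier.
auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                      cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC: {auc:.2f}")

# Model-based clustering over the same risk factors; BIC picks the number
# of patient groups (the abstract reports four clinically meaningful groups).
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in range(2, 7)}
k_best = min(bics, key=bics.get)
groups = GaussianMixture(n_components=k_best, random_state=0).fit_predict(X)
print(f"selected {k_best} groups; sizes: {np.bincount(groups)}")
```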

7.
Acad Med; 90(8): 1043-50, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25901872

ABSTRACT

Peer-reviewed publications are one measure of scientific productivity. From a project, program, or institutional perspective, publication tracking provides the quantitative data necessary to guide the prudent stewardship of federal, foundation, and institutional investments by identifying the scientific return for the types of support provided. In this article, the authors describe the Vanderbilt Institute for Clinical and Translational Research's (VICTR's) development and implementation of a semiautomated process through which publications are automatically detected in PubMed and adjudicated using a "just-in-time" workflow by a known pool of researchers (from Vanderbilt University School of Medicine and Meharry Medical College) who receive support from Vanderbilt's Clinical and Translational Science Award. Since implementation, the authors have (1) seen a marked increase in the number of publications citing VICTR support, (2) captured at a more granular level the relationship between specific resources/services and scientific output, (3) increased awareness of VICTR's scientific portfolio, and (4) increased efficiency in complying with annual National Institutes of Health progress reports. They present the methodological framework and workflow, measures of impact for the first 30 months, and a set of practical lessons learned to inform others considering a systems-based approach for resource and publication tracking. They learned that contacting multiple authors from a single publication can increase the accuracy of the resource attribution process in the case of multidisciplinary scientific projects. They also found that combining positive (e.g., congratulatory e-mails) and negative (e.g., not allowing future resource requests until adjudication is complete) triggers can increase compliance with publication attribution requests.
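
In the spirit of the automated-detection step described above, here is a minimal sketch of querying PubMed for candidate publications via NCBI's public E-utilities. The grant-number term is a hypothetical example of an attribution query, not VICTR's actual search strategy.

```python
# Sketch: fetch candidate PMIDs for adjudication via NCBI E-utilities.
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def find_candidate_pmids(term, max_results=100):
    """Return PubMed IDs matching a query term, for downstream adjudication."""
    params = {"db": "pubmed", "term": term,
              "retmax": max_results, "retmode": "json"}
    resp = requests.get(ESEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["esearchresult"]["idlist"]

# Hypothetical attribution query: publications citing a CTSA grant number.
pmids = find_candidate_pmids('"UL1 TR000445"[Grant Number]')
print(f"{len(pmids)} candidate publications to adjudicate")
```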


Subjects
Bibliometrics, Publications/statistics & numerical data, Publishing/statistics & numerical data, Research Support as Topic, Translational Research, Biomedical, Government Agencies, Humans, Peer Review, Research, Pilot Projects, Tennessee
8.
J Biomed Inform; 52: 28-35, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24534443

ABSTRACT

The last decade has seen exponential growth in the quantity of clinical data collected nationwide, triggering an increase in opportunities to reuse the data for biomedical research. The Vanderbilt research data warehouse framework consists of identified and de-identified clinical data repositories, fee-for-service custom services, and tools built atop the data layer to assist researchers across the enterprise. Providing resources dedicated to research initiatives benefits not only the research community but also clinicians, patients, and institutional leadership. This work summarizes our approach to the secondary use of clinical data for research, including a description of key components and a list of lessons learned, designed to assist others assembling similar services and infrastructure.


Subjects
Biomedical Research/methods, Database Management Systems, Medical Informatics/methods, Electronic Health Records, Humans
9.
J Am Med Inform Assoc; 21(2): 337-44, 2014.
Article in English | MEDLINE | ID: mdl-24045907

ABSTRACT

OBJECTIVE: Common chronic diseases such as hypertension are costly and difficult to manage. Our ultimate goal is to use data from electronic health records to predict the risk and timing of deterioration in hypertension control. Toward this goal, this work predicts the transition points at which hypertension is brought into, as well as pushed out of, control. METHODS: In a cohort of 1294 patients with hypertension enrolled in a chronic disease management program at the Vanderbilt University Medical Center, patients were modeled as an array of features derived from the clinical domain over time, which were distilled into a core set using an information gain criterion regarding their predictive performance. A model for transition-point prediction was then computed using a random forest classifier. RESULTS: The most predictive features for transitions in hypertension control status included hypertension assessment patterns, comorbid diagnoses, procedures, and medication history. The final random forest model achieved a c-statistic of 0.836 (95% CI 0.830 to 0.842) and an accuracy of 0.773 (95% CI 0.766 to 0.780). CONCLUSIONS: This study achieved accurate prediction of transition points in hypertension control status, an important first step toward the long-term goal of developing personalized hypertension management plans.
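
A minimal sketch of the modeling approach described above follows: an information-gain-style feature distillation (mutual information here) followed by a random forest classifier scored with the c-statistic. The data are simulated placeholders for the paper's EHR-derived features; the feature counts and hyperparameters are assumptions.

```python
# Sketch: information-gain feature selection + random forest (assumed data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1294, 50))      # longitudinal EHR features (simulated)
y = rng.integers(0, 2, size=1294)    # 1 = control-status transition occurred

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Distill the feature array into a core set via a mutual-information criterion,
# then fit the random forest on the selected features.
selector = SelectKBest(mutual_info_classif, k=15).fit(X_tr, y_tr)
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(selector.transform(X_tr), y_tr)

probs = model.predict_proba(selector.transform(X_te))[:, 1]
print(f"c-statistic: {roc_auc_score(y_te, probs):.3f}")
```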


Subjects
Disease Management, Electronic Health Records, Hypertension/therapy, Antihypertensive Agents/therapeutic use, Chronic Disease, Humans, Models, Theoretical, Prognosis