1.
J Atten Disord ; 22(2): 134-142, 2018 Jan.
Article in English | MEDLINE | ID: mdl-26604267

ABSTRACT

OBJECTIVE: To describe the incidence and distribution of ADHD within the United Kingdom, and to examine whether there was any association between ADHD incidence and socioeconomic deprivation. METHOD: The study used data from the Clinical Practice Research Datalink (CPRD). Patients diagnosed with ADHD before the age of 19 between January 1, 2004 and December 31, 2013 were stratified according to the region in which their general practice was based. Practice Index of Multiple Deprivation (IMD) score was used as a surrogate measure of patients' deprivation status. RESULTS: ADHD incidence was relatively stable between 2004 and 2013, but peaked in the last 2 years studied. Statistically significant (p ≤ .05) differences in incidence were observed between U.K. regions. In almost every year studied, incidence rates were highest among the most deprived patients and lowest among the least deprived patients. CONCLUSION: In the United Kingdom, ADHD may be associated with socioeconomic deprivation.


Subject(s)
Attention Deficit Disorder with Hyperactivity/epidemiology , Adolescent , Child , Female , Humans , Incidence , Male , Psychosocial Deprivation , Socioeconomic Factors , United Kingdom/epidemiology
2.
PLoS One ; 12(2): e0171784, 2017.
Article in English | MEDLINE | ID: mdl-28231289

ABSTRACT

Research with structured Electronic Health Records (EHRs) is expanding as data becomes more accessible, analytic methods advance, and the scientific validity of such studies is increasingly accepted. However, data science methodology to enable the rapid searching/extraction, cleaning and analysis of these large, often complex, datasets is less well developed. In addition, commonly used software is inadequate, resulting in bottlenecks in research workflows and in obstacles to increased transparency and reproducibility of the research. Preparing a research-ready dataset from EHRs is a complex and time-consuming task requiring substantial data science skills, even for simple designs. In addition, certain aspects of the workflow are computationally intensive, for example extraction of longitudinal data and matching controls to a large cohort, which may take days or even weeks to run using standard software. The rEHR package simplifies and accelerates the process of extracting ready-for-analysis datasets from EHR databases. It has a simple import function to a database backend that greatly accelerates data access times. A set of generic query functions allows users to extract data efficiently without needing detailed knowledge of SQL queries. Longitudinal data extractions can also be made in a single command, making use of parallel processing. The package also contains functions for cutting data by time-varying covariates, matching controls to cases, unit conversion and construction of clinical code lists. There are also functions to synthesise dummy EHR data. The package has been tested with one of the largest primary care EHRs, the Clinical Practice Research Datalink (CPRD), but allows for a common interface to other EHRs. This simplified and accelerated workflow for EHR data extraction results in simpler, cleaner scripts that are more easily debugged, shared and reproduced.


Subject(s)
Electronic Health Records , Databases, Factual , Electronic Health Records/economics , Humans , Information Storage and Retrieval/economics , Software/economics , Time Factors , Workflow
3.
BMC Res Notes ; 10(1): 41, 2017 Jan 13.
Article in English | MEDLINE | ID: mdl-28086961

ABSTRACT

BACKGROUND: In modern health care systems, the computerization of all aspects of clinical care has led to the development of large data repositories. For example, in the UK, large primary care databases hold millions of electronic medical records, with detailed information on diagnoses, treatments, outcomes and consultations. Careful analyses of these observational datasets of routinely collected data can complement evidence from clinical trials or even answer research questions that cannot be addressed in an experimental setting. However, 'missingness' is a common problem for routinely collected data, especially for biological parameters over time. Absence of complete data for the whole of an individual's study period is a potential bias risk and standard complete-case approaches may lead to biased estimates. However, the structure of the data values makes standard cross-sectional multiple-imputation approaches unsuitable. In this paper we propose and evaluate mibmi, a new command for cleaning and imputing longitudinal body mass index data. RESULTS: The regression-based data cleaning aspects of the algorithm can be useful when researchers analyze messy longitudinal data. Although the multiple imputation algorithm is computationally expensive, it performed similarly to, or even better than, existing alternatives when interpolating observations. CONCLUSION: The mibmi algorithm can be a useful tool for analyzing longitudinal body mass index data, or other longitudinal data with very low individual-level variability.


Subject(s)
Body Mass Index , Algorithms , Humans , Longitudinal Studies , United Kingdom
4.
PLoS One ; 11(2): e0146715, 2016.
Article in English | MEDLINE | ID: mdl-26918439

ABSTRACT

BACKGROUND: The use of Electronic Health Records databases for medical research has become mainstream. In the UK, increasing use of Primary Care Databases is largely driven by almost complete computerisation and uniform standards within the National Health Service. Electronic Health Records research often begins with the development of a list of clinical codes with which to identify cases with a specific condition. We present a methodology and accompanying Stata and R commands (pcdsearch/Rpcdsearch) to help researchers in this task. We present severe mental illness as an example. METHODS: We used the Clinical Practice Research Datalink, a UK Primary Care Database in which clinical information is largely organised using Read codes, a hierarchical clinical coding system. Pcdsearch is used to identify potentially relevant clinical codes and/or product codes from word-stubs and code-stubs suggested by clinicians. The returned code-lists are reviewed and codes relevant to the condition of interest are selected. The final code-list is then used to identify patients. RESULTS: We identified 270 Read codes linked to SMI and used them to identify cases in the database. We observed that our approach identified cases that would have been missed with a simpler approach using SMI registers defined within the UK Quality and Outcomes Framework. CONCLUSION: We described a framework for researchers of Electronic Health Records databases, for identifying patients with a particular condition or matching certain clinical criteria. The method is invariant to coding system or database and can be used with SNOMED CT, ICD or other medical classification code-lists.


Subject(s)
Electronic Health Records , Mental Disorders , Algorithms , Clinical Coding , Databases, Factual , Humans , Models, Statistical , Primary Health Care , United Kingdom
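The stub-based search described in the abstract above can be illustrated with a minimal Python sketch. This is not the pcdsearch/Rpcdsearch implementation; the function name `search_codes`, the matching rules (substring match for word-stubs, prefix match for code-stubs) and the toy dictionary are assumptions made for illustration, and the codes shown are made-up placeholders, not real Read codes.

```python
def search_codes(dictionary, word_stubs=(), code_stubs=()):
    """Return (code, description) pairs that match any stub.

    Assumed matching rules (illustrative only):
    - a word-stub matches if it occurs anywhere in the description,
      ignoring case;
    - a code-stub matches if the code starts with it.
    """
    lowered_stubs = [stub.lower() for stub in word_stubs]
    matches = []
    for code, description in dictionary.items():
        text = description.lower()
        by_word = any(stub in text for stub in lowered_stubs)
        by_code = any(code.startswith(stub) for stub in code_stubs)
        if by_word or by_code:
            matches.append((code, description))
    return sorted(matches)


# Hypothetical toy dictionary: placeholder codes, not real Read codes.
toy_dictionary = {
    "X100": "Schizophrenic disorders",
    "X101": "Simple schizophrenia",
    "Y200": "Bipolar affective disorder",
    "Z300": "Essential hypertension",
}

candidates = search_codes(
    toy_dictionary,
    word_stubs=["schizo", "bipolar"],
    code_stubs=["X1"],
)
for code, description in candidates:
    print(code, description)
```

In the workflow the abstract describes, a candidate list like this would then be reviewed by hand, and only the codes relevant to the condition would be kept in the final code-list.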
5.
BMJ Qual Saf ; 25(9): 657-70, 2016 09.
Article in English | MEDLINE | ID: mdl-26628553

ABSTRACT

OBJECTIVES: The UK's Quality and Outcomes Framework permits practices to exempt patients from financially-incentivised performance targets. To better understand the determinants and consequences of being exempted from the framework, we investigated the associations between exception reporting, patient characteristics and mortality. We also quantified the proportion of exempted patients that met quality targets for a tracer condition (diabetes). DESIGN: Retrospective longitudinal study, using individual patient data from the Clinical Practice Research Datalink. SETTING: 644 general practices, 2006/7 to 2011/12. PARTICIPANTS: Patients registered with study practices for at least one year over the study period, with at least one condition of interest (2 460 341 in total). MAIN OUTCOME MEASURES: Exception reporting rates by reason (clinical contraindication, patient dissent); all-cause mortality in year following exemption. Analyses with logistic and Cox proportional-hazards regressions, respectively. RESULTS: The odds of being exempted increased with age, deprivation and multimorbidity. Men were more likely to be exempted but this was largely attributable to higher prevalence of conditions with high exemption rates. Modest associations remained, with women more likely to be exempted due to clinical contraindication (OR 0.90, 99% CI 0.88 to 0.92) and men more likely to be exempted due to informed dissent (OR 1.08, 99% CI 1.06 to 1.10). More deprived areas (both for practice location and patient residence) were non-linearly associated with higher exception rates, after controlling for comorbidities and other covariates, with stronger associations for clinical contraindication. 
Compared with patients with a single condition, odds ratios for patients with two, three, or four or more conditions were respectively 4.28 (99% CI 4.18 to 4.38), 16.32 (99% CI 15.82 to 16.83) and 68.69 (99% CI 66.12 to 71.37) for contraindication, and 2.68 (99% CI 2.63 to 2.74), 4.02 (99% CI 3.91 to 4.13) and 5.17 (99% CI 5.00 to 5.35) for informed dissent. Exempted patients had a higher adjusted risk of death in the following year than non-exempted patients, regardless of whether this exemption was for contraindication (hazard ratio 1.37, 99% CI 1.33 to 1.40) or for informed dissent (1.20, 99% CI 1.17 to 1.24). On average, quality standards were met for 48% of exempted patients in the diabetes domain, but there was wide variation across indicators (ranging from 8 to 80%). CONCLUSIONS: Older, multimorbid and more deprived patients are more likely to be exempted from the scheme. Exception-reported patients are more likely to die in the following year, whether they are exempted by the practice for a contraindication or by themselves through informed dissent. Further research is needed to understand the relationship between exception reporting and patient outcomes.


Subject(s)
Primary Health Care , Reimbursement, Incentive , Survival Analysis , Aged , Aged, 80 and over , Family Practice , Female , Humans , Male , Middle Aged , Quality Indicators, Health Care , Retrospective Studies , United Kingdom
7.
Stat Med ; 34(20): 2781-93, 2015 Sep 10.
Article in English | MEDLINE | ID: mdl-25988604

ABSTRACT

We used a Bayesian hierarchical selection model to study publication bias in 1106 meta-analyses from the Cochrane Database of Systematic Reviews comparing treatment with either placebo or no treatment. For meta-analyses of efficacy, we estimated the ratio of the probability of including statistically significant outcomes favoring treatment to the probability of including other outcomes. For meta-analyses of safety, we estimated the ratio of the probability of including results showing no evidence of adverse effects to the probability of including results demonstrating the presence of adverse effects. RESULTS: In the meta-analyses of efficacy, outcomes favoring treatment had on average a 27% (95% Credible Interval (CI): 18% to 36%) higher probability of being included than other outcomes. In the meta-analyses of safety, results showing no evidence of adverse effects were on average 78% (95% CI: 51% to 113%) more likely to be included than results demonstrating that adverse effects existed. In general, the amount of over-representation of findings favorable to treatment was larger in meta-analyses including older studies. CONCLUSIONS: In the largest study on publication bias in meta-analyses to date, we found evidence of publication bias in Cochrane systematic reviews. In general, publication bias is smaller in meta-analyses of more recent studies, indicating their better reliability and supporting the effectiveness of the measures used to reduce publication bias in clinical trials. Our results indicate the need to apply currently underutilized meta-analysis tools handling publication bias based on statistical significance, especially when studies included in a meta-analysis are not recent.


Subject(s)
Databases, Factual , Meta-Analysis as Topic , Publication Bias , Review Literature as Topic , Bayes Theorem , Clinical Trials as Topic , Placebo Effect
9.
BMJ Open ; 5(4): e007299, 2015 Apr 13.
Article in English | MEDLINE | ID: mdl-25869690

ABSTRACT

OBJECTIVES: To conduct a fully independent, external validation of a research study based on one electronic health record database using a different database sampling from the same population. DESIGN: Retrospective cohort analysis of β-blocker therapy and all-cause mortality in patients with cancer. SETTING: Two UK national primary care databases (PCDs): the Clinical Practice Research Datalink (CPRD) and Doctors' Independent Network (DIN). PARTICIPANTS: CPRD data for 11,302 patients with cancer compared with published results from DIN for 3462 patients; study period January 1997 to December 2006. PRIMARY AND SECONDARY OUTCOME MEASURES: All-cause mortality: overall; by treatment subgroup (β-blockers only, β-blockers plus other blood pressure lowering medicines (BPLM), other BPLMs only); and by cancer site. RESULTS: Using CPRD, β-blocker use was not associated with mortality (HR=1.03, 95% CI 0.93 to 1.14, vs patients prescribed other BPLMs only), but DIN β-blocker users had significantly higher mortality (HR=1.18, 95% CI 1.04 to 1.33). However, these HRs were not statistically different (p=0.063), but did differ for patients on β-blockers alone (CPRD=0.94, 95% CI 0.82 to 1.07; DIN=1.37, 95% CI 1.16 to 1.61; p<0.001). Results for individual cancer sites differed by study, but only significantly for prostate and pancreas cancers. Results were robust under sensitivity analyses, but we could not be certain that mortality was identically defined in both databases. CONCLUSIONS: We found a complex pattern of similarities and differences between databases. Overall treatment effect estimates were not statistically different, adding to a growing body of evidence that different UK PCDs produce comparable effect estimates. However, individually the two studies lead to different conclusions regarding the safety of β-blockers and some subgroup effects differed significantly.
Single studies using even internally well-validated databases do not guarantee generalisable results, especially for subgroups, and confirmatory studies using at least one other independent data source are strongly recommended.


Subject(s)
Adrenergic beta-Antagonists/administration & dosage , Antihypertensive Agents/administration & dosage , Electronic Health Records , Neoplasms/mortality , Primary Health Care/statistics & numerical data , Databases, Factual , Humans , Neoplasms/drug therapy , Retrospective Studies , United Kingdom/epidemiology
10.
BMJ ; 350: h904, 2015 Mar 02.
Article in English | MEDLINE | ID: mdl-25733592

ABSTRACT

OBJECTIVES: To quantify the relationship between a national primary care pay-for-performance programme, the UK's Quality and Outcomes Framework (QOF), and all-cause and cause-specific premature mortality linked closely with conditions included in the framework. DESIGN: Longitudinal spatial study, at the level of the "lower layer super output area" (LSOA). SETTING: 32482 LSOAs (neighbourhoods of 1500 people on average), covering the whole population of England (approximately 53.5 million), from 2007 to 2012. PARTICIPANTS: 8647 English general practices participating in the QOF for at least one year of the study period, including over 99% of patients registered with primary care. INTERVENTION: National pay-for-performance programme incentivising performance on over 100 quality-of-care indicators. MAIN OUTCOME MEASURES: All-cause and cause-specific mortality rates for six chronic conditions: diabetes, heart failure, hypertension, ischaemic heart disease, stroke, and chronic kidney disease. We used multiple linear regressions to investigate the relationship between spatially estimated recorded quality of care and mortality. RESULTS: All-cause and cause-specific mortality rates declined over the study period. Higher mortality was associated with greater area deprivation, urban location, and higher proportion of a non-white population. In general, there was no significant relationship between practice performance on quality indicators included in the QOF and all-cause or cause-specific mortality rates in the practice locality. CONCLUSIONS: Higher reported achievement of activities incentivised under a major, nationwide pay-for-performance programme did not seem to result in reduced incidence of premature death in the population.


Subject(s)
Chronic Disease/therapy , Mortality, Premature , Primary Health Care/standards , Cause of Death , Chronic Disease/mortality , England/epidemiology , General Practice/standards , General Practice/statistics & numerical data , Humans , Longitudinal Studies , Primary Health Care/statistics & numerical data , Professional Practice/standards , Professional Practice/statistics & numerical data , Quality Indicators, Health Care , Reimbursement, Incentive/standards , Reimbursement, Incentive/statistics & numerical data , Residence Characteristics/statistics & numerical data
12.
Diabetologia ; 58(3): 505-18, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25512005

ABSTRACT

AIMS/HYPOTHESIS: We aimed to describe the shape of observed relationships between risk factor levels and clinically important outcomes in type 2 diabetes after adjusting for multiple confounders. METHODS: We used retrospective longitudinal data on 246,544 adults with type 2 diabetes from 600 practices in the Clinical Practice Research Datalink, 2006-2012. Proportional hazards regression models quantified the risks of mortality and microvascular or macrovascular events associated with four modifiable biological variables (HbA1c, systolic BP, diastolic BP and total cholesterol), while controlling for important patient and practice covariates. RESULTS: U-shaped relationships were observed between all-cause mortality and levels of the four biometric risk factors. Lowest risks were associated with HbA1c 7.25-7.75% (56-61 mmol/mol), total cholesterol 3.5-4.5 mmol/l, systolic BP 135-145 mmHg and diastolic BP 82.5-87.5 mmHg. Coronary and stroke mortality related to the four risk factors in a positive, curvilinear way, with the exception of systolic BP, which related to deaths in a U-shape. Macrovascular events showed a positive and curvilinear relationship with HbA1c but a U-shaped relationship with total cholesterol and systolic BP. Microvascular events related to the four risk factors in a curvilinear way: positive for HbA1c and systolic BP but negative for cholesterol and diastolic BP. CONCLUSIONS/INTERPRETATION: We identified several relationships that support a call for major changes to clinical practice. Most importantly, our results support trial data indicating that normalisation of glucose and BP can lead to poorer outcomes. This makes a strong case for target ranges for these risk factors rather than target levels.


Subject(s)
Blood Glucose/physiology , Blood Pressure/physiology , Cholesterol/blood , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/pathology , Adult , Aged , Diabetes Mellitus, Type 2/physiopathology , Female , Humans , Male , Middle Aged , Retrospective Studies
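The abstract above reports the lowest-risk HbA1c range in two units, 7.25-7.75% and 56-61 mmol/mol. As a worked check, the sketch below converts between them using the NGSP/IFCC master equation, NGSP(%) = 0.09148 × IFCC(mmol/mol) + 2.152. The function name is hypothetical, and the abstract does not state which conversion the authors used, so treating the master equation as their method is an assumption.

```python
def hba1c_percent_to_mmol_per_mol(percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Inverts the master equation NGSP = 0.09148 * IFCC + 2.152.
    """
    return (percent - 2.152) / 0.09148


# The two bounds of the lowest-risk range quoted in the abstract.
for percent in (7.25, 7.75):
    mmol_per_mol = hba1c_percent_to_mmol_per_mol(percent)
    print(f"{percent}% -> {round(mmol_per_mol)} mmol/mol")
```

Rounded to whole numbers, the two bounds come out at 56 and 61 mmol/mol, in line with the figures in the abstract.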
13.
PLoS One ; 9(6): e99825, 2014.
Article in English | MEDLINE | ID: mdl-24941260

ABSTRACT

Lists of clinical codes are the foundation for research undertaken using electronic medical records (EMRs). If clinical code lists are not available, reviewers are unable to determine the validity of research, full study replication is impossible, researchers are unable to make effective comparisons between studies, and the construction of new code lists is subject to much duplication of effort. Despite this, the publication of clinical codes is rarely if ever a requirement for obtaining grants, validating protocols, or publishing research. In a representative sample of 450 EMR primary research articles indexed on PubMed, we found that only 19 (5.1%) were accompanied by a full set of published clinical codes and 32 (8.6%) stated that code lists were available on request. To help address these problems, we have built an online repository where researchers using EMRs can upload and download lists of clinical codes. The repository will enable clinical researchers to better validate EMR studies, build on previous code lists and compare disease definitions across studies. It will also assist health informaticians in replicating database studies, tracking changes in disease definitions or clinical coding practice through time and sharing clinical code information across platforms and data sources as research objects.


Subject(s)
Clinical Coding , Databases, Factual , Electronic Health Records , Internet , Research , Humans , Primary Health Care , Publications , Reproducibility of Results , Research Report
14.
BMJ Open ; 4(4): e004952, 2014 Apr 23.
Article in English | MEDLINE | ID: mdl-24760353

ABSTRACT

OBJECTIVE: To conduct a fully independent and external validation of a research study based on one electronic health record database, using a different electronic database sampling the same population. DESIGN: Using the Clinical Practice Research Datalink (CPRD), we replicated a published investigation into the effects of statins in patients with ischaemic heart disease (IHD) by a different research team using QResearch. We replicated the original methods and analysed all-cause mortality using: (1) a cohort analysis and (2) a case-control analysis nested within the full cohort. SETTING: Electronic health record databases containing longitudinal patient consultation data from large numbers of general practices distributed throughout the UK. PARTICIPANTS: CPRD data for 34 925 patients with IHD from 224 general practices, compared to previously published results from QResearch for 13 029 patients from 89 general practices. The study period was from January 1996 to December 2003. RESULTS: We successfully replicated the methods of the original study very closely. In a cohort analysis, risk of death was lower by 55% for patients on statins, compared with 53% for QResearch (adjusted HR 0.45, 95% CI 0.40 to 0.50; vs 0.47, 95% CI 0.41 to 0.53). In case-control analyses, patients on statins had a 31% lower odds of death, compared with 39% for QResearch (adjusted OR 0.69, 95% CI 0.63 to 0.75; vs OR 0.61, 95% CI 0.52 to 0.72). Results were also close for individual statins. CONCLUSIONS: Database differences in population characteristics and in data definitions, recording, quality and completeness had a minimal impact on key statistical outputs. The results uphold the validity of research using CPRD and QResearch by providing independent evidence that both datasets produce very similar estimates of treatment effect, leading to the same clinical and policy decisions. Together with other non-independent replication studies, there is a nascent body of evidence for wider validity.


Subject(s)
Electronic Health Records , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Mortality , Myocardial Ischemia/drug therapy , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Case-Control Studies , Child , Child, Preschool , Databases, Factual , Diabetes Complications , Female , Humans , Infant , Infant, Newborn , Kaplan-Meier Estimate , Male , Middle Aged , Myocardial Ischemia/mortality , Risk Factors , Sex Factors , United Kingdom , Young Adult
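The percentages quoted in the abstract above follow from the reported ratios: a hazard ratio or odds ratio r below 1 corresponds to a reduction of (1 − r) × 100 percent. The short Python sketch below reproduces that arithmetic; the function name and the dictionary labels are hypothetical, chosen only for this illustration.

```python
def percent_reduction(ratio):
    """Percentage reduction implied by a ratio below 1 (HR or OR)."""
    return round((1 - ratio) * 100)


# Point estimates as reported in the abstract.
reported = {
    "CPRD cohort HR": 0.45,
    "QResearch cohort HR": 0.47,
    "CPRD case-control OR": 0.69,
    "QResearch case-control OR": 0.61,
}

for label, ratio in reported.items():
    print(f"{label}: {ratio} -> {percent_reduction(ratio)}% lower")
```

Note that the hazard ratios describe a lower hazard of death and the odds ratios a lower odds of death; the two are not the same quantity.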
15.
Glob Chang Biol ; 20(2): 456-65, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24130095

ABSTRACT

Significant changes in plant phenology have been observed in response to increases in mean global temperatures. There are concerns that accelerated phenologies can negatively impact plant populations. However, the fitness consequence of changes in phenology in response to elevated temperature is not well understood, particularly under field conditions. We address this issue by exposing a set of recombinant inbred lines of Arabidopsis thaliana to a simulated global warming treatment in the field. We find that plants exposed to elevated temperatures flower earlier, as predicted by photothermal models. However, contrary to life-history trade-off expectations, they also flower at a larger vegetative size, suggesting that warming probably causes acceleration in vegetative development. Although warming increases mean fitness (fruit production) by ca. 25%, there is a significant genotype-by-environment interaction. Changes in fitness rank indicate that imminent climate change can cause populations to be maladapted in their new environment, if adaptive evolution is limited. Thus, changes in the genetic composition of populations are likely, depending on the species' generation time and the speed of temperature change. Interestingly, genotypes that show stronger phenological responses have higher fitness under elevated temperatures, suggesting that phenological sensitivity might be a good indicator of success under elevated temperature at the genotypic level as well as at the species level.


Subject(s)
Arabidopsis/physiology , Climate Change , Genetic Fitness , Global Warming , Arabidopsis/genetics , Arabidopsis/growth & development , Hot Temperature , Species Specificity , Time Factors
16.
PLoS One ; 8(7): e69930, 2013.
Article in English | MEDLINE | ID: mdl-23922860

ABSTRACT

BACKGROUND: Heterogeneity has a key role in meta-analysis methods and can greatly affect conclusions. However, true levels of heterogeneity are unknown and often researchers assume homogeneity. We aim to: a) investigate the prevalence of unobserved heterogeneity and the validity of the assumption of homogeneity; b) assess the performance of various meta-analysis methods; c) apply the findings to published meta-analyses. METHODS AND FINDINGS: We accessed 57,397 meta-analyses, available in the Cochrane Library in August 2012. Using simulated data we assessed the performance of various meta-analysis methods in different scenarios. The prevalence of a zero heterogeneity estimate in the simulated scenarios was compared with that in the Cochrane data, to estimate the degree of unobserved heterogeneity in the latter. We re-analysed all meta-analyses using all methods and assessed the sensitivity of the statistical conclusions. Levels of unobserved heterogeneity in the Cochrane data appeared to be high, especially for small meta-analyses. A bootstrapped version of the DerSimonian-Laird approach performed best in both detecting heterogeneity and in returning more accurate overall effect estimates. Re-analysing all meta-analyses with this new method we found that in cases where heterogeneity had originally been detected but ignored, 17-20% of the statistical conclusions changed. Rates were much lower where the original analysis did not detect heterogeneity or took it into account, between 1% and 3%. CONCLUSIONS: When evidence for heterogeneity is lacking, standard practice is to assume homogeneity and apply a simpler fixed-effect meta-analysis. We find that assuming homogeneity often results in a misleading analysis, since heterogeneity is very likely present but undetected. Our new method represents a small improvement but the problem largely remains, especially for very small meta-analyses. 
One solution is to test the sensitivity of the meta-analysis conclusions to assumed moderate and large degrees of heterogeneity. Equally, whenever heterogeneity is detected, it should not be ignored.


Subject(s)
Databases as Topic , Library Materials , Meta-Analysis as Topic , Computer Simulation , Models, Theoretical
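For readers unfamiliar with the DerSimonian-Laird approach named in the abstract above, the sketch below shows the standard (non-bootstrapped) moment estimator of the between-study variance and the resulting random-effects pooled estimate. It is a minimal illustration, not the authors' code: the bootstrapped variant they evaluated is not shown, and the function name and input values are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Standard DerSimonian-Laird random-effects meta-analysis.

    effects   -- per-study effect estimates (e.g. log odds ratios)
    variances -- per-study within-study variances
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)

    # Fixed-effect pooled estimate and Cochran's Q statistic.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))

    # Moment estimator of between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = (1.0 / sum(re_weights)) ** 0.5
    return {"fixed": fixed, "Q": q, "tau2": tau2, "pooled": pooled, "se": se}


# Hypothetical inputs: three studies with equal within-study variance.
result = dersimonian_laird([0.0, 0.2, 0.9], [0.04, 0.04, 0.04])
print(result)
```

The truncation at zero is why a zero heterogeneity estimate is common: whenever Q falls below its degrees of freedom, k − 1, the estimate is set to zero and the random-effects result coincides with the fixed-effect one.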
17.
J R Soc Interface ; 10(82): 20121032, 2013 May 06.
Article in English | MEDLINE | ID: mdl-23427095

ABSTRACT

Many biological characteristics of evolutionary interest are not scalar variables but continuous functions. Given a dataset of function-valued traits generated by evolution, we develop a practical, statistical approach to infer ancestral function-valued traits, and estimate the generative evolutionary process. We do this by combining dimension reduction and phylogenetic Gaussian process regression, a non-parametric procedure that explicitly accounts for known phylogenetic relationships. We test the performance of methods on simulated, function-valued data generated from a stochastic evolutionary model. The methods are applied assuming that only the phylogeny, and the function-valued traits of taxa at its tips are known. Our method is robust and applicable to a wide range of function-valued data, and also offers a phylogenetically aware method for estimating the autocorrelation of function-valued traits.


Subject(s)
Evolution, Molecular , Models, Genetic , Phylogeny , Quantitative Trait Loci/physiology , Animals , Humans , Normal Distribution , Stochastic Processes