1.
HGG Adv ; 5(2): 100273, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38297832

ABSTRACT

Heterozygous missense variants and in-frame indels in SMC3 are a cause of Cornelia de Lange syndrome (CdLS), marked by intellectual disability, growth deficiency, and dysmorphism, via an apparent dominant-negative mechanism. However, the spectrum of manifestations associated with SMC3 loss-of-function variants has not been reported, leading to hypotheses of alternative phenotypes or even developmental lethality. We used matchmaking servers, patient registries, and other resources to identify individuals with heterozygous, predicted loss-of-function (pLoF) variants in SMC3, and analyzed population databases to characterize mutational intolerance in this gene. Here, we show that SMC3 behaves as an archetypal haploinsufficient gene: it is highly constrained against pLoF variants, strongly depleted for missense variants, and pLoF variants are associated with a range of developmental phenotypes. Among 14 individuals with SMC3 pLoF variants, phenotypes were variable but coalesced on low growth parameters, developmental delay/intellectual disability, and dysmorphism, reminiscent of atypical CdLS. Comparisons to individuals with SMC3 missense/in-frame indel variants demonstrated an overall milder presentation in pLoF carriers. Furthermore, several individuals harboring pLoF variants in SMC3 were nonpenetrant for growth, developmental, and/or dysmorphic features, and some had alternative symptomatologies with rational biological links to SMC3. Analyses of tumor and model system transcriptomic data and epigenetic data in a subset of cases suggest that SMC3 pLoF variants reduce SMC3 expression but do not strongly support clustering with functional genomic signatures of typical CdLS. 
Our finding of substantial population-scale LoF intolerance in concert with variable growth and developmental features in subjects with SMC3 pLoF variants expands the scope of cohesinopathies, informs on their allelic architecture, and suggests the existence of additional clearly LoF-constrained genes whose disease links will be confirmed only by multilayered genomic data paired with careful phenotyping.


Subject(s)
De Lange Syndrome , Intellectual Disability , Humans , Cell Cycle Proteins/genetics , Chondroitin Sulfate Proteoglycans/genetics , Chromosomal Proteins, Non-Histone/genetics , De Lange Syndrome/genetics , Heterozygote , Intellectual Disability/genetics , Mutation , Phenotype
2.
J Am Med Inform Assoc ; 31(2): 472-478, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-37665746

ABSTRACT

OBJECTIVE: We implemented a chatbot consent tool to shift the time burden from study staff in support of a national genomics research study. MATERIALS AND METHODS: We created an Institutional Review Board-approved script for automated chat-based consent. We compared data from prospective participants who used the tool or had traditional consent conversations with study staff. RESULTS: Chat-based consent, completed on a user's schedule, was shorter than the traditional conversation. This did not lead to a significant change in affirmative consents. Within affirmative consents and declines, more prospective participants completed the chat-based process. A quiz to assess chat-based consent user understanding had a high pass rate with no reported negative experiences. CONCLUSION: Our report shows that a structured script can convey important information while realizing the benefits of automation and burden shifting. Analysis suggests that it may be advantageous to use chatbots to scale this rate-limiting step in large research projects.


Subject(s)
Genomics , Informed Consent , Humans , Prospective Studies , Software , Communication
3.
Article in English | MEDLINE | ID: mdl-38112918

ABSTRACT

BACKGROUND: Black and Hispanic households are at elevated risk of food insecurity and insufficiency (correlates of adverse outcomes in areas such as health and mental health) relative to White households in the USA. The COVID-19 pandemic and its economic shock threatened to further exacerbate these issues. Research has identified a number of risk and protective factors for food insecurity and insufficiency. These could relate to racial and ethnic disparities in two ways: through aggregate differences in the distribution of characteristics such as educational attainment and employment, or through differences in the degree of risk or protection associated with a factor. We examined the relationship between four factors (household head age, educational attainment, single mother household composition, and employment) and disparities in food insufficiency between White, Black, and Hispanic households with children during the COVID-19 pandemic to consider these pathways. METHODS: We analyzed data from the Census Bureau's Household Pulse Survey using bivariate statistics, multivariable regression, and decomposition methods to understand differences in the prevalence and consequences of underlying risk and protective factors for food insufficiency in households with children. RESULTS: Consistent with prior literature, we documented higher rates of food insufficiency among Black and Hispanic households compared to White households. Differences in the distributions of education and employment accounted for a substantial fraction of the disparities in risk. Both the distribution and degree of risk associated with single mother household composition also related to disparities, but these differences were muted after accounting for economic resources. Much, though not all, of the relationship between the distributions of education and disparate risk of food insufficiency was also captured by differences in economic resources. 
CONCLUSION: This study provides insight into the structure underlying racial and ethnic disparities in food insufficiency during the COVID-19 pandemic, highlighting the importance of human capital, income, and assets.

4.
medRxiv ; 2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37808847

ABSTRACT

Heterozygous missense variants and in-frame indels in SMC3 are a cause of Cornelia de Lange syndrome (CdLS), marked by intellectual disability, growth deficiency, and dysmorphism, via an apparent dominant-negative mechanism. However, the spectrum of manifestations associated with SMC3 loss-of-function variants has not been reported, leading to hypotheses of alternative phenotypes or even developmental lethality. We used matchmaking servers, patient registries, and other resources to identify individuals with heterozygous, predicted loss-of-function (pLoF) variants in SMC3, and analyzed population databases to characterize mutational intolerance in this gene. Here, we show that SMC3 behaves as an archetypal haploinsufficient gene: it is highly constrained against pLoF variants, strongly depleted for missense variants, and pLoF variants are associated with a range of developmental phenotypes. Among 13 individuals with SMC3 pLoF variants, phenotypes were variable but coalesced on low growth parameters, developmental delay/intellectual disability, and dysmorphism reminiscent of atypical CdLS. Comparisons to individuals with SMC3 missense/in-frame indel variants demonstrated a milder presentation in pLoF carriers. Furthermore, several individuals harboring pLoF variants in SMC3 were nonpenetrant for growth, developmental, and/or dysmorphic features, some instead having intriguing symptomatologies with rational biological links to SMC3 including bone marrow failure, acute myeloid leukemia, and Coats retinal vasculopathy. Analyses of transcriptomic and epigenetic data suggest that SMC3 pLoF variants reduce SMC3 expression but do not result in a blood DNA methylation signature clustering with that of CdLS, and that the global transcriptional signature of SMC3 loss is model-dependent. 
Our finding of substantial population-scale LoF intolerance in concert with variable penetrance in subjects with SMC3 pLoF variants expands the scope of cohesinopathies, informs on their allelic architecture, and suggests the existence of additional clearly LoF-constrained genes whose disease links will be confirmed only by multi-layered genomic data paired with careful phenotyping.

5.
Clin Genet ; 104(3): 377-383, 2023 09.
Article in English | MEDLINE | ID: mdl-37194472

ABSTRACT

We evaluated the diagnostic yield of genome-slice panel reanalysis in the clinical setting using an automated phenotype/gene ranking system. We analyzed whole genome sequencing (WGS) data produced from clinically ordered panels built as bioinformatic slices for 16 clinically diverse, undiagnosed cases referred to the Pediatric Mendelian Genomics Research Center, an NHGRI-funded GREGoR Consortium site. Genome-wide reanalysis was performed using Moon™, a machine-learning-based tool for variant prioritization. In five out of 16 cases, we discovered a potentially clinically significant variant. In four of these cases, the variant was found in a gene not included in the original panel due to phenotypic expansion of a disorder or incomplete initial phenotyping of the patient. In the fifth case, the gene containing the variant was included in the original panel, but because the variant was a complex structural rearrangement with intronic breakpoints outside the clinically analyzed regions, it was not initially identified. Automated genome-wide reanalysis of clinical WGS data generated during targeted panel testing yielded a 25% increase in diagnostic findings and a possibly clinically relevant finding in one additional case, underscoring the added value of genome-wide analyses over those routinely performed in the clinical setting.


Subject(s)
Computational Biology , Genomics , Humans , Whole Genome Sequencing , Phenotype , Introns
6.
bioRxiv ; 2023 Jan 24.
Article in English | MEDLINE | ID: mdl-36747692

ABSTRACT

Objective: To conduct a retrospective analysis comparing traditional human-based consenting to an automated chat-based consenting process. Materials and Methods: We developed a new chat-based consent using our IRB-approved consent forms. We leveraged a previously developed platform (Gia®, or "Genetic Information Assistant") to deliver the chat content to candidate participants. The content included information about the study, educational information, and a quiz to assess understanding. We analyzed 144 families referred to our study during a 6-month time period. A total of 37 families completed consent using the traditional process, while 35 families completed consent using Gia. Results: Engagement rates were similar between both consenting methods. The median length of the consent conversation was shorter for Gia users compared to traditional (44 vs. 76 minutes). Additionally, the total time from referral to consent completion was shorter with Gia (5 vs. 16 days). Within Gia, understanding was assessed with a 10-question quiz that most participants (96%) passed. Feedback about the chat consent indicated that 86% of participants had a positive experience. Discussion: Using Gia resulted in time savings for both the participant and study staff. The chatbot enables studies to reach more potential candidates. We identified five key features related to human-centered design for developing a consent chat. Conclusion: This analysis suggests that it is feasible to use an automated chatbot to scale obtaining informed consent for a genomics research study. We further identify a number of advantages when using a chatbot.

8.
Soc Work ; 66(2): 157-166, 2021 May 13.
Article in English | MEDLINE | ID: mdl-33864085

ABSTRACT

The Temporary Assistance for Needy Families (TANF) program is a federal block grant to the states, with a required state contribution. Although often viewed as a cash assistance program with work requirements and services targeted at extremely low-income families with children, only about one-quarter of all state and federal TANF funds are now used for traditional cash aid. Uses of funds vary widely by state, and alternatives range from refundable tax credits to support of state child welfare systems. In this article, the author examines the relationship between state categorical TANF spending and key social, political, and economic characteristics using data from 2015 to 2017 and multilevel linear models. Racial and ethnic demographics of the cash assistance caseload are associated with differences in spending, with states with larger proportions of the caseload composed of people of color devoting a lower percentage of effort to traditional benefits and more to alternative cash transfers. Changes in unemployment rate within states are associated with greater spending on basic assistance and reduced spending on alternative transfers. These findings indicate that, although TANF cash benefits spending may be economically responsive within the program's overall flexible structure, spending patterns raise issues of equity for disadvantaged families.


Subject(s)
Health Expenditures , Social Work , Child , Child Welfare , Humans , Poverty , Public Policy , Social Welfare , United States
9.
Cell Rep ; 34(13): 108926, 2021 03 30.
Article in English | MEDLINE | ID: mdl-33789101

ABSTRACT

Prior studies of the renal cell carcinoma (RCC) germline landscape investigated predominantly patients of European ancestry. We examine the frequency of germline pathogenic and likely pathogenic (P/LP) variants in 1,829 patients with RCC from various ancestries. Overall, P/LP variants are found in 17% of patients, among whom 10.3% harbor one or more clinically actionable variants with potential preventive or therapeutic utility. Patients of African ancestry with RCC harbor significantly more P/LP variants in FH compared to patients of non-African ancestry with RCC and African controls from the Genome Aggregation Database (gnomAD). Patients of non-African ancestry have significantly more P/LP variants in CHEK2 compared to patients of African ancestry with RCC and non-Finnish European controls. Non-Africans with RCC have more actionable variants compared to Africans with RCC. This work helps understand the underlying biological differences in RCC between Africans and non-Africans and paves the way to more comprehensive genomic characterization of underrepresented populations.


Subject(s)
Carcinoma, Renal Cell/genetics , Ethnicity/genetics , Germ-Line Mutation/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Checkpoint Kinase 2/genetics , Child , Child, Preschool , Female , Genealogy and Heraldry , Genes, Neoplasm , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Kidney Neoplasms/genetics , Male , Middle Aged , Penetrance , Young Adult
10.
BMC Public Health ; 20(1): 175, 2020 Feb 04.
Article in English | MEDLINE | ID: mdl-32019537

ABSTRACT

BACKGROUND: Food insecurity is widely prevalent in certain sections of society in low- and middle-income countries. The United Nations has challenged all member countries to eliminate hunger for all people by 2030. This study examines the prevalence and correlates of household food insecurity among women, especially Dalit women of reproductive age in Nepal. METHODS: Data came from the 2016 Nepal Demographic Health Survey, a cross-sectional, nationally representative survey that included 12,862 women between 15 and 49 years of age, of whom 12% were Dalit. Descriptive analysis was used to assess the prevalence of household food insecurity while logistic regression examined the relationship between women's ethnicity and the risk of food insecurity after accounting for demographic, economic, cultural, and geo-ecological characteristics. RESULTS: About 56% of all women and 76% of Dalit women had experienced food insecurity. Ethnicity is strongly related to food insecurity. Dalit women were most likely to be food insecure, even after accounting for factors such as education and wealth. They were 82, 85, 89 and 92% more vulnerable to food insecurity than Muslims, Brahmin/Chhetri, Terai Indigenous, and Hill Indigenous populations, respectively. Education was a protective factor: women with secondary education (6th to 10th grade) were 39% less likely to be food insecure compared to their counterparts without education. With a more than 10th grade education, women were 2.27 times more likely to be food secure compared to their counterparts without education. Marriage was also protective. Economically, household wealth is inversely correlated with food insecurity. Finally, residence in the Mid-Western, Far-Western and Central Development regions was correlated with food insecurity. CONCLUSION: To reduce food insecurity in Nepal, interventions should focus on improving women's education and wealth, especially among Dalit and those residing in the Far- and Mid-Western regions.


Subject(s)
Food Supply/statistics & numerical data , Adolescent , Adult , Cross-Sectional Studies , Female , Humans , Middle Aged , Nepal/epidemiology , Prevalence , Risk Factors , Young Adult
11.
Demography ; 55(6): 2119-2128, 2018 12.
Article in English | MEDLINE | ID: mdl-30242661

ABSTRACT

Homelessness in the United States is often examined using cross-sectional, point-in-time samples. Any experience of homelessness is a risk factor for adverse outcomes, so it is also useful to understand the incidence of homelessness over longer periods. We estimate the lifetime prevalence of homelessness among members of the Baby Boom cohort (n = 6,545) using the 2012 and 2014 waves of the Health and Retirement Study (HRS), a nationally representative survey of older Americans. Our analysis indicates that 6.2% of respondents had a period of homelessness at some point in their lives. We also identify dramatic disparities in lifetime incidence of homelessness by racial and ethnic subgroups. Rates of homelessness were higher for non-Hispanic blacks (16.8%) or Hispanics of any race (8.1%) than for non-Hispanic whites (4.8%; all differences significant with p < .05). The black-white gap, but not the Hispanic-white gap, remained significant after adjustment for covariates such as education, veteran status, and geographic region.


Subject(s)
Ethnicity , Ill-Housed Persons , Cross-Sectional Studies , Demography/statistics & numerical data , Female , Ill-Housed Persons/statistics & numerical data , Humans , Male , Middle Aged , Prevalence , Socioeconomic Factors , Surveys and Questionnaires , United States
12.
J Am Med Inform Assoc ; 24(1): 74-80, 2017 01.
Article in English | MEDLINE | ID: mdl-27301749

ABSTRACT

OBJECTIVE: This paper outlines the implementation of a comprehensive clinical pharmacogenomics (PGx) service within a pediatric teaching hospital and the integration of clinical decision support in the electronic health record (EHR). MATERIALS AND METHODS: An approach to clinical decision support for medication ordering and dispensing driven by documented PGx variant status in an EHR is described. A web-based platform was created to automatically generate a clinical report from either raw assay results or specified diplotypes; it parses and combines haplotypes into an interpretation for each individual, which was compared to the reference lab call for accuracy. RESULTS: Clinical decision support rules built within an EHR provided guidance to providers for 31 patients (100%) who had actionable PGx variants and were written for interacting medications. A breakdown of the PGx alerts by practitioner service, and alert response for the initial cohort of patients tested is described. In 90% (355/394) of the cases, thiopurine methyltransferase genotyping was ordered pre-emptively. DISCUSSION: This paper outlines one approach to implementing a clinical PGx service in a pediatric teaching hospital that cares for a heterogeneous patient population. There is a focus on incorporation of PGx clinical decision support rules and a program to standardize report text within the electronic health record with subsequent exploration of clinician behavior in response to the alerts. CONCLUSION: The incorporation of PGx data at the time of prescribing and dispensing, if done correctly, has the potential to impact the incidence of adverse drug events, a significant cause of morbidity and mortality.


Subject(s)
Drug Therapy, Computer-Assisted , Electronic Health Records/organization & administration , Hospitals, Pediatric , Pharmacogenetics/organization & administration , Adolescent , Child , Child, Preschool , Decision Support Systems, Clinical , Female , Health Information Interoperability , Humans , Infant , Male , Pharmacogenetics/methods , Tertiary Care Centers
13.
Bioinformatics ; 31(2): 268-70, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25273102

ABSTRACT

Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available, comprehensive programming libraries exist. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. AVAILABILITY AND IMPLEMENTATION: The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
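
To make the parse, validate, and format cycle described above concrete, the following is a minimal standard-library sketch. It is not the library reported in the abstract and does not reproduce its API: the names `SimpleVariant`, `parse_substitution`, and `format_variant` are hypothetical, and the pattern assumes only the simplest case, a single-nucleotide substitution on a versioned RefSeq-style accession such as `NM_001637.3:c.1582G>A`.

```python
import re
from typing import NamedTuple


class SimpleVariant(NamedTuple):
    """Hypothetical container for one parsed substitution (illustration only)."""
    accession: str  # versioned accession, e.g. "NM_001637.3"
    kind: str       # coordinate type: "c", "g", "m", or "n"
    position: int   # 1-based position in that coordinate system
    ref: str        # reference base
    alt: str        # alternate base


# Assumption: only simple DNA substitutions are handled; real HGVS syntax
# also covers deletions, insertions, duplications, intronic offsets, etc.
_SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):"
    r"(?P<kind>[cgmn])\."
    r"(?P<position>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(text: str) -> SimpleVariant:
    """Parse a string like 'NM_001637.3:c.1582G>A' or raise ValueError."""
    match = _SUBSTITUTION.match(text)
    if match is None:
        raise ValueError(f"not a simple substitution: {text!r}")
    if match["ref"] == match["alt"]:
        # Minimal validation step: a substitution must change the base.
        raise ValueError(f"reference and alternate bases are identical: {text!r}")
    return SimpleVariant(
        accession=match["accession"],
        kind=match["kind"],
        position=int(match["position"]),
        ref=match["ref"],
        alt=match["alt"],
    )


def format_variant(variant: SimpleVariant) -> str:
    """Format a parsed variant back into its string form."""
    return (
        f"{variant.accession}:{variant.kind}."
        f"{variant.position}{variant.ref}>{variant.alt}"
    )
```

Parsing and then formatting a valid string should round-trip unchanged, which is a convenient sanity check for any parser of this kind.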


Subject(s)
Computational Biology/methods , Databases, Factual , Genetic Variation/genetics , Genome, Human , Software , Terminology as Topic , Humans , Molecular Sequence Annotation
14.
PLoS One ; 9(4): e93533, 2014.
Article in English | MEDLINE | ID: mdl-24740236

ABSTRACT

Autism is on the rise, with 1 in 88 children receiving a diagnosis in the United States, yet the process for diagnosis remains cumbersome and time consuming. Research has shown that home videos of children can help increase the accuracy of diagnosis. However, the use of videos in the diagnostic process is uncommon. In the present study, we assessed the feasibility of applying a gold-standard diagnostic instrument to brief and unstructured home videos and tested whether video analysis can enable more rapid detection of the core features of autism outside of clinical environments. We collected 100 public videos from YouTube of children ages 1-15 with either a self-reported diagnosis of an ASD (N = 45) or not (N = 55). Four non-clinical raters independently scored all videos using one of the most widely adopted tools for behavioral diagnosis of autism, the Autism Diagnostic Observation Schedule-Generic (ADOS). The classification accuracy was 96.8%, with 94.1% sensitivity and 100% specificity; the inter-rater correlation for the behavioral domains on the ADOS was 0.88; and the diagnoses matched a trained clinician in all but 3 of 22 randomly selected video cases. Despite the diversity of videos and non-clinical raters, our results indicate that it is possible to achieve high classification accuracy, sensitivity, and specificity as well as clinically acceptable inter-rater reliability with nonclinical personnel. Our results also demonstrate the potential for video-based detection of autism in short, unstructured home videos and further suggest that at least a percentage of the effort associated with detection and monitoring of autism may be mobilized and moved outside of traditional clinical environments.
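
For readers less familiar with the three performance measures quoted above, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from a confusion matrix. The function name `classification_metrics` is hypothetical, and the counts in the usage note are invented for illustration; they are not the study's data.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute standard metrics from confusion-matrix counts.

    tp: cases correctly classified as positive
    fn: cases incorrectly classified as negative
    tn: non-cases correctly classified as negative
    fp: non-cases incorrectly classified as positive
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        # Share of all classifications that were correct.
        "accuracy": (tp + tn) / (positives + negatives),
        # Share of true cases that were detected.
        "sensitivity": tp / positives,
        # Share of non-cases that were correctly ruled out.
        "specificity": tn / negatives,
    }
```

With hypothetical counts of 18 true positives, 2 false negatives, 20 true negatives, and 0 false positives, `classification_metrics(18, 2, 20, 0)` gives a sensitivity of 0.9, a specificity of 1.0, and an accuracy of 0.95.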


Subject(s)
Autistic Disorder/diagnosis , Social Media , Video Recording , Adolescent , Child , Child, Preschool , Early Diagnosis , Humans , Infant , United States
15.
Article in English | MEDLINE | ID: mdl-24303299

ABSTRACT

Advances in sequencing technology are making genomic data more accessible within the healthcare environment. Published pharmacogenetic guidelines attempt to provide a clinical context for specific genomic variants; however, the actual implementation to convert genomic data into a clinical report integrated within an electronic medical record system is a major challenge for any hospital. We created a two-part solution that integrates with the medical record system and converts genetic variant results into an interpreted clinical report based on published guidelines. We successfully developed a scalable infrastructure to support TPMT genetic testing and are currently testing approximately two individuals per week in our production version. We plan to release an online variant to clinical interpretation reporting system in order to facilitate translation of pharmacogenetic information into clinical practice.

16.
PLoS One ; 8(11): e79611, 2013.
Article in English | MEDLINE | ID: mdl-24223977

ABSTRACT

BACKGROUND: Medication nonadherence costs $300 billion annually in the US. Medicare Advantage plans have a financial incentive to increase medication adherence among members because the Centers for Medicare and Medicaid Services (CMS) now awards substantive bonus payments to such plans, based in part on population adherence to chronic medications. We sought to build an individualized surveillance model that detects early which beneficiaries will fall below the CMS adherence threshold. METHODS: This was a retrospective study of over 210,000 beneficiaries initiating statins, in a database of private insurance claims, from 2008 to 2011. A logistic regression model was constructed to use statin adherence from initiation to day 90 to predict beneficiaries who would not meet the CMS measure of a proportion of days covered of 0.8 or above, from day 91 to 365. The model controlled for 15 additional characteristics. In a sensitivity analysis, we varied the number of days of adherence data used for prediction. RESULTS: Lower adherence in the first 90 days was the strongest predictor of one-year nonadherence, with an odds ratio of 25.0 (95% confidence interval 23.7-26.5) for poor adherence at one year. The model had an area under the receiver operating characteristic curve of 0.80. Sensitivity analysis revealed that predictions of comparable accuracy could be made only 40 days after statin initiation. When members with 30-day supplies for their first statin fill had predictions made at 40 days, and members with 90-day supplies for their first fill had predictions made at 100 days, poor adherence could be predicted with 86% positive predictive value. CONCLUSIONS: To preserve their Medicare Star ratings, plan managers should identify or develop effective programs to improve adherence. An individualized surveillance approach can be used to target members who would most benefit, recognizing the tradeoff between improved model performance over time and the advantage of earlier detection.
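
The adherence measure named above, proportion of days covered, can be illustrated with a short sketch. This is an assumption-laden simplification rather than the study's method: the function name and the `(fill_day, days_supply)` representation are hypothetical, the window is half-open, and overlapping fills are counted once per day rather than shifted forward as some definitions do.

```python
from typing import Iterable, Set, Tuple


def proportion_of_days_covered(
    fills: Iterable[Tuple[int, int]],
    window_start: int,
    window_end: int,
) -> float:
    """Return the share of days in [window_start, window_end) with medication on hand.

    Each fill is (fill_day, days_supply), with days counted from statin initiation.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be greater than window_start")
    covered: Set[int] = set()
    for fill_day, days_supply in fills:
        # Clip each fill's supply to the measurement window.
        first = max(fill_day, window_start)
        last = min(fill_day + days_supply, window_end)
        covered.update(range(first, last))  # empty when the fill misses the window
    return len(covered) / (window_end - window_start)


def meets_threshold(pdc: float, threshold: float = 0.8) -> bool:
    """Apply the 0.8-or-above adherence cut-off mentioned in the abstract."""
    return pdc >= threshold
```

For example, three hypothetical 30-day fills on days 0, 35, and 70 cover 80 of the first 90 days (about 0.89, above the threshold), whereas two fills on days 0 and 60 cover only 60 of 90 days (about 0.67, below it).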


Subject(s)
Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Medicare Part C/economics , Medication Adherence/statistics & numerical data , Models, Statistical , Reimbursement, Incentive , Female , Humans , Male , Middle Aged , Retrospective Studies , Time Factors , United States
17.
Circulation ; 127(4): 517-26, 2013 Jan 29.
Article in English | MEDLINE | ID: mdl-23261867

ABSTRACT

BACKGROUND: Pharmacogenetics in warfarin clinical trials has failed to show a significant benefit in comparison with standard clinical therapy. This study demonstrates a computational framework to systematically evaluate preclinical trial design of target population, pharmacogenetic algorithms, and dosing protocols to optimize primary outcomes. METHODS AND RESULTS: We programmatically created an end-to-end framework that systematically evaluates warfarin clinical trial designs. The framework includes options to create a patient population, multiple dosing strategies including genetic-based and nongenetic clinical-based, multiple-dose adjustment protocols, pharmacokinetic/pharmacodynamic modeling and international normalized ratio prediction, and various types of outcome measures. We validated the framework by conducting 1000 simulations of the primary end points of the Applying Pharmacogenetic Algorithms to Individualize Dosing of Warfarin (CoumaGen) clinical trial. The simulation predicted a mean time in therapeutic range of 70.6% and 72.2% (P=0.47) in the standard and pharmacogenetic arms, respectively. Then, we evaluated another dosing protocol under the same original conditions and found a significant difference in the time in therapeutic range between the pharmacogenetic and standard arm (78.8% versus 73.8%; P=0.0065), respectively. CONCLUSIONS: We demonstrate that this simulation framework is useful in the preclinical assessment phase to study and evaluate design options and provide evidence to optimize the clinical trial for patient efficacy and reduced risk.


Subject(s)
Drug Evaluation, Preclinical/methods , Pharmacogenetics/methods , Randomized Controlled Trials as Topic/methods , Systems Theory , Thrombosis/drug therapy , Warfarin/therapeutic use , Animals , Anticoagulants/therapeutic use , Computer Simulation , Humans , Models, Theoretical , Thrombosis/genetics
18.
Article in English | MEDLINE | ID: mdl-22779042

ABSTRACT

Although a protocol aims to guide treatment management and optimize overall outcomes, the benefits and harms for each individual vary due to heterogeneity. Some protocols integrate clinical and genetic variation to provide treatment recommendations; it is not clear whether such integration is sufficient. If not, treatment outcomes may be sub-optimal for certain patient sub-populations. Unfortunately, running a clinical trial to examine such outcome responses is cost prohibitive and requires a significant amount of time to conduct the study. We propose a simulation approach to discover this knowledge from electronic medical records, a rapid method to reach this goal. We use the well-known drug warfarin as an example to examine whether patient characteristics, including race and the genes CYP2C9 and VKORC1, have been fully integrated into dosing protocols. The two genes mentioned above have been shown to be important in patient response to warfarin.

19.
PLoS Comput Biol ; 7(8): e1002147, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21901085

ABSTRACT

In this overview of biomedical computing in the cloud, we discussed two primary ways to use the cloud (a single instance or cluster), provided a detailed example using NGS mapping, and highlighted the associated costs. While many users new to the cloud may assume that entry is as straightforward as uploading an application and selecting an instance type and storage options, we illustrated that there is substantial up-front effort required before an application can make full use of the cloud's vast resources. Our intention was to provide a set of best practices and to illustrate how those apply to a typical application pipeline for biomedical informatics, while remaining general enough for extrapolation to other types of computational problems. Our mapping example was intended to illustrate how to develop a scalable project and not to compare and contrast alignment algorithms for read mapping and genome assembly. Indeed, with a newer aligner such as Bowtie, it is possible to map the entire African genome using one m2.2xlarge instance in 48 hours for a total cost of approximately $48 in computation time. In our example, we were not concerned with data transfer rates, which are heavily influenced by the amount of available bandwidth, connection latency, and network availability. When transferring large amounts of data to the cloud, bandwidth limitations can be a major bottleneck, and in some cases it is more efficient to simply mail a storage device containing the data to AWS (http://aws.amazon.com/importexport/). More information about cloud computing, detailed cost analysis, and security can be found in references.
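
The cost figure quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the hourly rate is merely the one implied by the abstract's own numbers (about $48 over 48 hours on one instance), not a quoted AWS price. Storage and data-transfer charges are deliberately left out.

```python
def compute_cost_usd(instance_hours: float, hourly_rate_usd: float, instance_count: int = 1) -> float:
    """Compute-time cost only: hours x rate x number of instances."""
    if instance_hours < 0 or hourly_rate_usd < 0 or instance_count < 1:
        raise ValueError("hours and rate must be non-negative; need at least one instance")
    return instance_hours * hourly_rate_usd * instance_count


def implied_hourly_rate_usd(total_cost_usd: float, instance_hours: float, instance_count: int = 1) -> float:
    """Back out the per-instance hourly rate implied by a total compute cost."""
    if instance_hours <= 0 or instance_count < 1:
        raise ValueError("need positive hours and at least one instance")
    return total_cost_usd / (instance_hours * instance_count)


# Figures taken from the abstract: one instance, 48 hours, approximately $48.
rate = implied_hourly_rate_usd(total_cost_usd=48.0, instance_hours=48.0)
single_instance_cost = compute_cost_usd(instance_hours=48.0, hourly_rate_usd=rate)
```

Under the simplifying assumption that the work divides evenly, spreading the same 48 instance-hours across a cluster shortens the wall-clock time but leaves the compute cost unchanged, which is one reason the choice between a single instance and a cluster is mainly about turnaround.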


Subject(s)
Information Storage and Retrieval/methods , Internet , Software , Computational Biology , Computer Security , Information Storage and Retrieval/economics
20.
BMC Med Genomics ; 3: 50, 2010 Oct 29.
Article in English | MEDLINE | ID: mdl-21034472

ABSTRACT

BACKGROUND: Disease-specific genetic information has been increasing at rapid rates as a consequence of recent improvements and massive cost reductions in sequencing technologies. Numerous systems designed to capture and organize this mounting sea of genetic data have emerged, but these resources differ dramatically in their disease coverage and genetic depth. With few exceptions, researchers must manually search a variety of sites to assemble a complete set of genetic evidence for a particular disease of interest, a process that is both time-consuming and error-prone. METHODS: We designed a real-time aggregation tool that provides both comprehensive coverage and reliable gene-to-disease rankings for any disease. Our tool, called Genotator, automatically integrates data from 11 externally accessible clinical genetics resources and uses these data in a straightforward formula to rank genes in order of disease relevance. We tested the accuracy of coverage of Genotator in three separate diseases for which there exist specialty curated databases, Autism Spectrum Disorder, Parkinson's Disease, and Alzheimer Disease. Genotator is freely available at http://genotator.hms.harvard.edu. RESULTS: Genotator demonstrated that most of the 11 selected databases contain unique information about the genetic composition of disease, with 2514 genes found in only one of the 11 databases. These findings confirm that the integration of these databases provides a more complete picture than would be possible from any one database alone. Genotator successfully identified at least 75% of the top ranked genes for all three of our use cases, including a 90% concordance with the top 40 ranked candidates for Alzheimer Disease. CONCLUSIONS: As a meta-query engine, Genotator provides high coverage of both historical genetic research as well as recent advances in the genetic understanding of specific diseases. 
As such, Genotator provides a real-time aggregation of ranked data that remains current with the pace of research in the disease fields. Genotator's algorithm appropriately transforms query terms to match the input requirements of each targeted database and accurately resolves named synonyms to ensure full coverage of the genetic results with official nomenclature. Genotator generates an Excel-style output that is consistent across disease queries and readily importable to other applications.
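
The aggregation idea described in this abstract, pooling gene lists from several resources, ranking genes, and counting those reported by only one resource, can be sketched as follows. This is not Genotator's algorithm, whose ranking formula is not given here: ranking by the number of supporting sources is an assumed stand-in, and the function names, database names, and gene memberships are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def count_sources_per_gene(sources: Dict[str, Set[str]]) -> Counter:
    """Count how many source databases list each gene."""
    counts: Counter = Counter()
    for genes in sources.values():
        counts.update(genes)  # each source is a set, so it counts a gene once
    return counts


def rank_genes(counts: Counter) -> List[str]:
    """Rank genes by supporting-source count (descending), then alphabetically."""
    return sorted(counts, key=lambda gene: (-counts[gene], gene))


def single_source_genes(counts: Counter) -> Set[str]:
    """Genes reported by exactly one source."""
    return {gene for gene, n in counts.items() if n == 1}


# Hypothetical inputs: three made-up resources with invented memberships.
example_sources = {
    "db_a": {"APP", "PSEN1", "APOE"},
    "db_b": {"APOE", "PSEN1"},
    "db_c": {"APOE", "SORL1"},
}
example_counts = count_sources_per_gene(example_sources)
example_ranking = rank_genes(example_counts)
example_unique = single_source_genes(example_counts)
```

In this toy input, the gene listed by all three resources ranks first and two genes appear in only one resource each, mirroring on a small scale the abstract's observation that many genes are found in a single database.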


Subject(s)
Algorithms , Disease/genetics , Molecular Sequence Annotation/methods , Alzheimer Disease/genetics , Child , Child Development Disorders, Pervasive/genetics , Humans , Internet , Male , Parkinson Disease/genetics , Reproducibility of Results