Results 1 - 20 of 21
1.
Database (Oxford) ; 2019; 2019 01 01.
Article in English | MEDLINE | ID: mdl-30942863

ABSTRACT

Timely, consistent and integrated access to clinical trial data remains one of the pharmaceutical industry's most pressing needs. As part of a comprehensive clinical data repository, we have developed a data warehouse that can integrate operational data from any source, conform it to a canonical data model and make it accessible to study teams in a timely, secure and contextualized manner to support operational oversight, proactive risk management and other analytic and reporting needs. Our solution consists of a dimensional relational data warehouse, a set of extraction, transformation and loading processes to coordinate data ingestion and mapping, a generalizable metrics engine to enable the computation of operational metrics and key performance, quality and risk indicators, and a set of graphical user interfaces to facilitate configuration, management and administration. When combined with the appropriate data visualization tools, the warehouse enables convenient access to raw operational data and derived metrics to help track study conduct and performance, identify and mitigate risks, monitor and improve operational processes, manage resource allocation, strengthen investigator and sponsor relationships, and serve other purposes.
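The metrics engine described above can be illustrated with a minimal sketch: roll operational fact rows up along a dimension and derive a key performance indicator. The site keys, measure names, and KPI choice below are invented for illustration and are not the paper's schema.

```python
from collections import defaultdict

# Invented fact rows: each joins a site dimension key to operational measures.
FACT_QUERIES = [
    {"site": "S01", "opened": 40, "closed": 36},
    {"site": "S01", "opened": 10, "closed": 10},
    {"site": "S02", "opened": 25, "closed": 15},
]

def query_resolution_rate(facts):
    """Roll the fact table up by site and derive a KPI: fraction of queries closed."""
    opened, closed = defaultdict(int), defaultdict(int)
    for row in facts:
        opened[row["site"]] += row["opened"]
        closed[row["site"]] += row["closed"]
    return {site: closed[site] / opened[site] for site in opened}

rates = query_resolution_rate(FACT_QUERIES)  # e.g. feeds a visualization layer
```

In a real warehouse the roll-up would be a SQL aggregation over fact and dimension tables; the shape of the computation is the same.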


Subject(s)
Clinical Trials as Topic , Data Warehousing , Database Management Systems , Humans , Research Report
2.
Database (Oxford) ; 2019; 2019 01 01.
Article in English | MEDLINE | ID: mdl-30854563

ABSTRACT

Clinical trial data are typically collected through multiple systems developed by different vendors using different technologies and data standards. That data need to be integrated, standardized and transformed for a variety of monitoring and reporting purposes. The need to process large volumes of often inconsistent data in the presence of ever-changing requirements poses a significant technical challenge. As part of a comprehensive clinical data repository, we have developed a data warehouse that integrates patient data from any source, standardizes it and makes it accessible to study teams in a timely manner to support a wide range of analytic tasks for both in-flight and completed studies. Our solution combines Apache HBase, a NoSQL column store, Apache Phoenix, a massively parallel relational query engine and a user-friendly interface to facilitate efficient loading of large volumes of data under incomplete or ambiguous specifications, utilizing an extract-load-transform design pattern that defers data mapping until query time. This approach allows us to maintain a single copy of the data and transform it dynamically into any desirable format without requiring additional storage. Changes to the mapping specifications can be easily introduced and multiple representations of the data can be made available concurrently. Further, by versioning the data and the transformations separately, we can apply historical maps to current data or current maps to historical data, which simplifies the maintenance of data cuts and facilitates interim analyses for adaptive trials. The result is a highly scalable, secure and redundant solution that combines the flexibility of a NoSQL store with the robustness of a relational query engine to support a broad range of applications, including clinical data management, medical review, risk-based monitoring, safety signal detection, post hoc analysis of completed studies and many others.
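The extract-load-transform pattern described above (a single raw copy of the data, mapping deferred to query time, maps versioned separately from the data) can be sketched in a few lines. The record layout, map versions, and target standard below are illustrative assumptions, not the repository's actual model.

```python
# A single raw copy of the data, loaded as-is from a source system.
RAW = [
    {"subjid": "1001", "sex": "M"},
    {"subjid": "1002", "sex": "F"},
]

# Versioned mapping specifications, kept separately from the data; a new
# version can be introduced without touching or copying the raw records.
MAPS = {
    "v1": lambda r: {"USUBJID": r["subjid"], "SEX": r["sex"]},
    "v2": lambda r: {"USUBJID": r["subjid"],
                     "SEX": {"M": "Male", "F": "Female"}[r["sex"]]},
}

def query(raw, map_version):
    """Transform at read time: any map version applied over the same raw copy."""
    return [MAPS[map_version](r) for r in raw]

v2_rows = query(RAW, "v2")  # current map; query(RAW, "v1") replays the old one
```

Because maps are applied at read time, historical maps can be run against current data (and vice versa) exactly as the abstract describes.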


Subject(s)
Clinical Trials as Topic , Data Warehousing , Database Management Systems , Humans , Machine Learning , User-Computer Interface
3.
Database (Oxford) ; 2019; 2019 01 01.
Article in English | MEDLINE | ID: mdl-30773591

ABSTRACT

Assembly of complete and error-free clinical trial data sets for statistical analysis and regulatory submission requires extensive effort and communication among investigational sites, central laboratories, pharmaceutical sponsors, contract research organizations and other entities. Traditionally, this data is captured, cleaned and reconciled through multiple disjointed systems and processes, which is resource intensive and error prone. Here, we introduce a new system for clinical data review that helps data managers identify missing, erroneous and inconsistent data and manage queries in a unified, system-agnostic and efficient way. Our solution enables timely and integrated access to all study data regardless of source, facilitates the review of validation and discrepancy checks and the management of the resulting queries, tracks the status of page review, verification and locking activities, monitors subject data cleanliness and readiness for database lock and provides extensive configuration options to meet any study's needs, automation for regular updates and fit-for-purpose user interfaces for global oversight and problem detection.
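The validation and discrepancy checks described above amount to running a library of predicates over subject records and turning each failure into a query for the data managers. A minimal sketch, with invented field names and checks:

```python
# Each check is a named predicate over a record; failures become open queries.
CHECKS = [
    ("missing_dob", lambda r: r.get("dob") is None),
    ("nonpositive_weight", lambda r: r.get("weight") is not None and r["weight"] <= 0),
]

def run_checks(records, checks):
    """Return one open query per (record, failed check) pair."""
    return [{"subject": rec["id"], "check": name}
            for rec in records
            for name, failed in checks if failed(rec)]

records = [
    {"id": "1001", "dob": "1950-03-02", "weight": 72.5},
    {"id": "1002", "dob": None, "weight": -1.0},
]
open_queries = run_checks(records, CHECKS)
```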


Subject(s)
Clinical Trials as Topic , Databases as Topic , Data Warehousing
4.
JAMIA Open ; 2(2): 216-221, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31984356

ABSTRACT

OBJECTIVE: We present a new system to track, manage, and report on all risks and issues encountered during a clinical trial. MATERIALS AND METHODS: Our solution utilizes JIRA, a popular issue and project tracking tool for software development, augmented by third-party and custom-built plugins to provide the additional functionality missing from the core product. RESULTS: The new system integrates all issue types under a single tracking tool and offers a range of capabilities, including configurable issue management workflows, seamless integration with other clinical systems, extensive history, reporting, and trending, and an intuitive web interface. DISCUSSION AND CONCLUSION: By preserving the linkage between risks, issues, actions, decisions, and outcomes, the system allows study teams to assess the impact and effectiveness of their risk management strategies and present a coherent account of how the trial was conducted. Since the tool was put in production, we have observed an increase in the number of reported issues and a decrease in the median issue resolution time which, along with the positive user feedback, point to marked improvements in quality, transparency, productivity, and teamwork.
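One of the outcome measures cited above, median issue resolution time, is straightforward to compute from issue records. A sketch using hypothetical issue fields, not JIRA's actual schema:

```python
from datetime import date
from statistics import median

# Hypothetical issue records; field names are illustrative.
issues = [
    {"created": date(2019, 1, 1), "resolved": date(2019, 1, 11)},
    {"created": date(2019, 1, 5), "resolved": date(2019, 1, 8)},
    {"created": date(2019, 2, 1), "resolved": date(2019, 2, 21)},
]

def median_resolution_days(issues):
    """Median number of days from issue creation to resolution."""
    return median((i["resolved"] - i["created"]).days for i in issues)

m_days = median_resolution_days(issues)
```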

5.
Clin Ther ; 40(7): 1204-1212, 2018 07.
Article in English | MEDLINE | ID: mdl-30100201

ABSTRACT

PURPOSE: Clinical trial monitoring is an essential component of drug development aimed at safeguarding subject safety, data quality, and protocol compliance by focusing sponsor oversight on the most important aspects of study conduct. In recent years, regulatory agencies, industry consortia, and nonprofit collaborations between industry and regulators, such as TransCelerate and the International Council for Harmonisation, have been advocating a new, risk-based approach to monitoring clinical trials that places increased emphasis on critical data and processes and encourages greater use of centralized monitoring. However, how best to implement risk-based monitoring (RBM) remains unclear and subject to wide variations in tools and methodologies. The nonprescriptive nature of the regulatory guidelines, coupled with limitations in software technology, challenges in operationalization, and lack of robust evidence of superior outcomes, has hindered its widespread adoption. METHODS: We describe a holistic solution that combines convenient access to data, advanced analytics, and seamless integration with established technology infrastructure to enable comprehensive assessment and mitigation of risk at the study, site, and subject level. FINDINGS: Using data from completed RBM studies carried out in the last 4 years, we demonstrate that our implementation of RBM improves the efficiency and effectiveness of the clinical oversight process as measured on various quality, timeline, and cost dimensions. IMPLICATIONS: These results provide strong evidence that our RBM methodology can significantly improve the clinical oversight process and do so at a lower cost through more intelligent deployment of monitoring resources to the sites that need the most attention.
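Centralized monitoring of the kind described above commonly standardizes key risk indicators (KRIs) across sites and flags outliers for follow-up. The following sketch, with invented site data and an illustrative z-score threshold, shows the general idea rather than the paper's actual analytics:

```python
from statistics import mean, pstdev

# Invented per-site values of a single KRI (e.g. an adverse-event rate).
site_kri = {"S01": 0.02, "S02": 0.03, "S03": 0.025, "S04": 0.12}

def flag_sites(kri, z_cut=1.5):
    """Flag sites whose KRI deviates from the cross-site mean by > z_cut SDs."""
    mu, sd = mean(kri.values()), pstdev(kri.values())
    return sorted(site for site, v in kri.items() if abs(v - mu) / sd > z_cut)

flagged = flag_sites(site_kri)  # sites needing the most monitoring attention
```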


Subject(s)
Clinical Trials as Topic , Data Accuracy , Guideline Adherence , Humans , Patient Safety , Risk
6.
Alzheimers Dement (Amst) ; 1(3): 339-348, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-26693175

ABSTRACT

INTRODUCTION: The dynamic range of cerebrospinal fluid (CSF) amyloid β (Aβ1-42) measurement does not parallel cognitive changes in Alzheimer's disease (AD) and cognitively normal (CN) subjects across different studies. Therefore, identifying novel proteins to characterize symptomatic AD samples is important. METHODS: Proteins were profiled using a multianalyte platform by Rules Based Medicine (MAP-RBM). Due to underlying heterogeneity and unbalanced sample size, we combined subjects (344 AD and 325 CN) from three cohorts: Alzheimer's Disease Neuroimaging Initiative, Penn Center for Neurodegenerative Disease Research of the University of Pennsylvania, and Knight Alzheimer's Disease Research Center at Washington University in St. Louis. We focused on samples whose cognitive and amyloid status was consistent. We performed linear regression (accounting for age, gender, number of APOE ε4 alleles, and cohort) to identify amyloid-related proteins for symptomatic AD subjects in the largest CSF-based MAP-RBM study to date. ANOVA and Tukey's test were used to evaluate whether these proteins were related to cognitive impairment changes as measured by the Mini-Mental State Examination (MMSE). RESULTS: Seven proteins were significantly associated with Aβ1-42 levels in the combined cohort (false discovery rate adjusted P < .05), of which lipoprotein a (Lp(a)), prolactin (PRL), resistin, and vascular endothelial growth factor (VEGF) had a consistent direction of association across every individual cohort. VEGF was strongly associated with MMSE scores, followed by pancreatic polypeptide and immunoglobulin A (IgA), suggesting they may be related to staging of AD. DISCUSSION: Lp(a), PRL, IgA, and tissue factor/thromboplastin have never been reported for AD diagnosis in previous individual CSF-based MAP-RBM studies. Although some of our reported analytes are related to AD pathophysiology, the roles of the others in symptomatic AD warrant further exploration.
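The "false discovery rate adjusted P < .05" criterion above typically refers to the Benjamini-Hochberg procedure, which can be implemented directly on the raw p-values (the p-values below are made up for illustration):

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):  # walk from the largest p to the smallest
        i = order[rank]
        running_min = min(running_min, pvals[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted

# Made-up p-values: the 3rd and 4th pass raw alpha = .05 but not the BH cutoff.
adjusted = bh_adjust([0.001, 0.008, 0.039, 0.041, 0.60])
```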

7.
Pac Symp Biocomput ; : 364-75, 2014.
Article in English | MEDLINE | ID: mdl-24297562

ABSTRACT

Complex diseases such as major depression affect people over time in complicated patterns. Longitudinal data analysis is thus crucial for understanding and prognosis of such diseases and has received considerable attention in the biomedical research community. Traditional classification and regression methods have been commonly applied in a simple (controlled) clinical setting with a small number of time points. However, these methods cannot be easily extended to the more general setting for longitudinal analysis, as they are not inherently built for time-dependent data. Functional regression, in contrast, is capable of identifying the relationship between features and outcomes along with time information by treating features and/or outcomes as random functions over time rather than independent random variables. In this paper, we propose a novel sparse generalized functional linear model for predicting the treatment remission status of participants with depression from longitudinal features. Compared to traditional functional regression models, our model enables high-dimensional learning, smoothness of functional coefficients, longitudinal feature selection and interpretable estimation of functional coefficients. Extensive experiments have been conducted on the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) data set and the results show that the proposed sparse functional regression method achieves significantly higher prediction power than existing approaches.
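The functional-regression idea above (treating longitudinal features as functions of time rather than independent variables) can be sketched by summarizing each subject's series with coefficients on a small time basis and passing those through a logistic link. The two-function basis (level and slope) and fixed weights below are drastic simplifications of the paper's sparse generalized functional linear model, whose penalized fitting step is omitted:

```python
import math

def trajectory_features(times, values):
    """Least-squares level and slope of one subject's longitudinal series;
    a two-function basis standing in for a richer functional basis."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    slope = (sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
             / sum((t - t_bar) ** 2 for t in times))
    return [v_bar, slope]

def predict_remission(feats, w, b):
    """Logistic link applied to the basis coefficients."""
    z = b + sum(wi * xi for wi, xi in zip(w, feats))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical subject: symptom scores falling over 12 weeks of treatment.
feats = trajectory_features([0, 4, 8, 12], [20, 16, 12, 8])
p = predict_remission(feats, w=[0.0, -2.0], b=0.0)  # falling scores -> high p
```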


Subject(s)
Depression , Clinical Trials as Topic/statistics & numerical data , Computational Biology , Data Mining/statistics & numerical data , Databases, Factual/statistics & numerical data , Depression/therapy , Depressive Disorder, Major/therapy , Humans , Linear Models , Longitudinal Studies , Models, Psychological , Prognosis
8.
Br J Clin Pharmacol ; 75(1): 146-61, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22534009

ABSTRACT

AIM: The objective is to develop a semi-mechanistic disease progression model for mild cognitive impairment (MCI) subjects. The model aims to describe the longitudinal progression of ADAS-cog scores from the Alzheimer's disease neuroimaging initiative trial that had data from 198 MCI subjects with cerebrospinal fluid (CSF) information who were followed for 3 years. METHOD: Various covariates were tested on disease progression parameters and these variables fell into six categories: imaging volumetrics, biochemical, genetic, demographic, cognitive tests and CSF biomarkers. RESULTS: CSF biomarkers were associated with both baseline disease score and disease progression rate in subjects with MCI. Baseline disease score was also correlated with atrophy measured using hippocampal volume. Progression rate was also predicted by executive functioning as measured by the Trail B-test. CONCLUSION: CSF biomarkers have the ability to discriminate MCI subjects into sub-populations that exhibit markedly different rates of disease progression on the ADAS-cog scale. These biomarkers can therefore be utilized for designing clinical trials enriched with subjects that carry the underlying disease pathology.


Subject(s)
Biomarkers/cerebrospinal fluid , Cognitive Dysfunction/cerebrospinal fluid , Aged , Aged, 80 and over , Alzheimer Disease/cerebrospinal fluid , Apolipoproteins E/genetics , Cholesterol/blood , Disease Progression , Female , Humans , Male , Middle Aged , Neuroimaging
9.
Alzheimers Dement ; 9(1 Suppl): S21-31, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23127469

ABSTRACT

BACKGROUND: The Alzheimer's Disease Assessment Scale-Cognitive (ADAS-Cog) has been used widely as a cognitive end point in Alzheimer's disease (AD) clinical trials. Efforts to treat AD pathology at earlier stages have also used ADAS-Cog, but failure in these trials can be difficult to interpret because the scale has well-known ceiling effects that limit its use in mild cognitive impairment (MCI) and early AD. A wealth of data exists in ADAS-Cog from both historical trials and contemporary longitudinal natural history studies that can provide insights about parts of the scale that may be better suited for MCI and early AD trials. METHODS: Using Alzheimer's Disease Neuroimaging Initiative study data, we identified the most informative cognitive measures from the ADAS-Cog and other available scales. We used cross-sectional analyses to characterize trajectories of ADAS-Cog and its individual subscales, as well as other cognitive, functional, or global measures across disease stages. Informative measures were identified based on the standardized mean of 2-year change from baseline and were combined into novel composite endpoints. We assessed performance of the novel endpoints based on sample size requirements for a 2-year clinical trial. A bootstrap validation procedure was also undertaken to assess the reproducibility of the standardized mean changes of the selected measures and the corresponding composites. RESULTS: All proposed novel endpoints have improved standardized mean changes and thus improved statistical power compared with the ADAS-Cog 11. Further improvements were achieved by using cognitive-functional composites. Combining the novel composites with an enrichment strategy based on cerebrospinal fluid β-amyloid (Aβ(1-42)) in a 2-year trial yielded gains in power of 20% to 40% over ADAS-Cog 11, regardless of the novel measure considered. CONCLUSION: An empirical, data-driven approach with existing instruments was used to derive novel composite scales based on ADAS-Cog 11 with improved performance characteristics for MCI and early AD clinical trials. Together with patient enrichment based on Aβ(1-42) pathology, these modified endpoints may allow more efficient clinical trials in these populations and can be assessed without modifying current test administration procedures in ongoing trials.
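The link between standardized mean change and statistical power quoted above follows from the standard two-sample normal-approximation sample-size formula; the z-values below correspond to a two-sided α of 0.05 and 80% power, and the effect sizes are illustrative, not the paper's estimates:

```python
import math

def per_arm_n(effect_size, z_alpha=1.96, z_beta=0.8416):
    """Subjects per arm needed to detect a standardized mean change
    `effect_size` in a two-arm comparison (normal approximation)."""
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

# Illustrative (invented) standardized 2-year changes: an endpoint with a
# larger standardized change needs far fewer subjects per arm.
n_scale = per_arm_n(0.35)      # e.g. a full-scale endpoint
n_composite = per_arm_n(0.50)  # e.g. a more sensitive composite
```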


Subject(s)
Alzheimer Disease/diagnosis , Cognitive Dysfunction/diagnosis , Early Diagnosis , Aged , Alzheimer Disease/psychology , Clinical Trials as Topic , Female , Humans , Male , Neuropsychological Tests
10.
BMC Neurol ; 12: 46, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22731740

ABSTRACT

BACKGROUND: Patients with Mild Cognitive Impairment (MCI) are at high risk of progression to Alzheimer's dementia. Identifying MCI individuals with high likelihood of conversion to dementia and the associated biosignatures has recently received increasing attention in AD research. Different biosignatures for AD (neuroimaging, demographic, genetic and cognitive measures) may contain complementary information for diagnosis and prognosis of AD. METHODS: We have conducted a comprehensive study using a large number of samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI) to test the power of integrating various baseline data for predicting the conversion from MCI to probable AD, to identify a small subset of biosignatures for the prediction, and to assess the relative importance of different modalities in predicting MCI-to-AD conversion. We have employed sparse logistic regression with stability selection for the integration and selection of potential predictors. Our study differs from many of the other ones in three important respects: (1) we use a large cohort of MCI samples that are unbiased with respect to age or education status between cases and controls; (2) we integrate and test various types of baseline data available in ADNI, including MRI, demographic, genetic and cognitive measures; and (3) we apply sparse logistic regression with stability selection to ADNI data for robust feature selection. RESULTS: We have used 319 MCI subjects from ADNI that had MRI measurements at the baseline and passed quality control, including 177 MCI Non-converters and 142 MCI Converters. Conversion was considered over the course of a 4-year follow-up period. A combination of 15 features (predictors) including those from MRI scans, APOE genotyping, and cognitive measures achieves the best prediction with an AUC score of 0.8587. CONCLUSIONS: Our results demonstrate the power of integrating various baseline data for prediction of the conversion from MCI to probable AD. Our results also demonstrate the effectiveness of stability selection for feature selection in the context of sparse logistic regression.
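Stability selection, as used above, refits a sparse selector on many random subsamples and keeps only the features chosen in a large fraction of fits. In this sketch a toy mean-difference selector stands in for sparse logistic regression, and the data are synthetic:

```python
import random
from collections import Counter

def select_features(X, y, idx):
    """Toy base selector: keep features whose class means differ by > 0.5.
    It stands in for the sparse logistic regression used in the paper."""
    chosen = []
    for j in range(len(X[0])):
        pos = [X[i][j] for i in idx if y[i] == 1]
        neg = [X[i][j] for i in idx if y[i] == 0]
        if abs(sum(pos) / len(pos) - sum(neg) / len(neg)) > 0.5:
            chosen.append(j)
    return chosen

def stability_selection(X, y, n_sub=50, frac=0.7, threshold=0.8, seed=0):
    """Refit on random subsamples; keep features selected in >= threshold of fits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sub):
        idx = rng.sample(range(len(X)), int(frac * len(X)))
        counts.update(select_features(X, y, idx))
    return sorted(j for j, c in counts.items() if c / n_sub >= threshold)

# Synthetic data: feature 0 separates the classes, feature 1 is noise.
X = [[2.0 if i % 2 else 0.0, 0.1 * (i % 3)] for i in range(20)]
y = [i % 2 for i in range(20)]
stable = stability_selection(X, y)
```

The selection frequency across subsamples, rather than a single fit, is what makes the chosen feature set robust.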


Subject(s)
Alzheimer Disease/diagnosis , Alzheimer Disease/etiology , Cognitive Dysfunction/complications , Cognitive Dysfunction/diagnosis , Decision Support Systems, Clinical , Diagnosis, Computer-Assisted/methods , Aged , Algorithms , Artificial Intelligence , Female , Humans , Male , Proportional Hazards Models , Reproducibility of Results , Sensitivity and Specificity
11.
J Alzheimers Dis ; 31(3): 507-16, 2012.
Article in English | MEDLINE | ID: mdl-22614878

ABSTRACT

One of the challenges in developing a viable therapy for Alzheimer's disease has been demonstrating efficacy within a clinical trial. Using this as motivation, we sought to re-examine conventional clinical trial practices in order to determine whether efficacy can be better shown through alternative trial designs and novel analysis methods. In this work, we hypothesize that the confounding factors which hamper the ability to discern a treatment signal are the variability in observations as well as the insidious nature of the disease. We demonstrate that a two-phase trial design in which drug dosing is administered after a certain level of disease severity has been reached, coupled with a method to account more accurately for the progression of the disease, may allow us to compensate for these factors, and thus enable us to make treatment effects more apparent. Utilizing data from two previously failed trials which involved the evaluation of galantamine for indication in mild cognitive impairment, we were able to demonstrate that a clear treatment effect can be realized through both visual and statistical means, and propose that future trials may be more likely to show success if similar methods are utilized.


Subject(s)
Alzheimer Disease/drug therapy , Alzheimer Disease/pathology , Clinical Trials as Topic/methods , Galantamine/therapeutic use , Nootropic Agents/therapeutic use , Research Design , Alzheimer Disease/psychology , Clinical Trials as Topic/standards , Disease Progression , Humans , Research Design/standards
12.
J Clin Pharmacol ; 52(5): 629-44, 2012 May.
Article in English | MEDLINE | ID: mdl-21659625

ABSTRACT

The objective of this analysis was to develop a semi-mechanistic nonlinear disease progression model using an expanded set of covariates that captures the longitudinal change of Alzheimer's Disease Assessment Scale (ADAS-cog) scores from the Alzheimer's Disease Neuroimaging Initiative study that consisted of 191 Alzheimer disease patients who were followed for 2 years. The model describes the rate of progression and baseline disease severity as a function of influential covariates. The covariates that were tested fell into 4 categories: (1) imaging volumetric measures, (2) serum biomarkers, (3) demographic and genetic factors, and (4) baseline cognitive tests. Covariates found to affect baseline disease status were years since disease onset, hippocampal volume, and ventricular volume. Disease progression rate in the model was influenced by age, total cholesterol, APOE ε4 genotype, Trail Making Test (part B) score, and current levels of impairment as measured by ADAS-cog. Rate of progression was slower for mild and severe Alzheimer patients compared with moderate Alzheimer patients who exhibited faster rates of deterioration. In conclusion, this model describes disease progression in Alzheimer patients using novel covariates that are important for understanding the worsening of ADAS-cog scores over time and may be useful in the future for optimizing study designs through clinical trial simulations.


Subject(s)
Alzheimer Disease/diagnosis , Brain/pathology , Models, Biological , Neuroimaging , Age Factors , Aged , Aged, 80 and over , Alzheimer Disease/blood , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Alzheimer Disease/physiopathology , Alzheimer Disease/psychology , Apolipoprotein E4/genetics , Biomarkers/blood , Brain/physiopathology , Cholesterol/blood , Cognition , Databases, Factual , Disease Progression , Female , Genetic Predisposition to Disease , Humans , Linear Models , Male , Middle Aged , Neuropsychological Tests , Nonlinear Dynamics , Phenotype , Predictive Value of Tests , Reproducibility of Results , Risk Assessment , Risk Factors , Severity of Illness Index , Time Factors
13.
J Chem Inf Model ; 51(12): 3113-30, 2011 Dec 27.
Article in English | MEDLINE | ID: mdl-22035187

ABSTRACT

Efficient substructure searching is a key requirement for any chemical information management system. In this paper, we describe the substructure search capabilities of ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. The solution consists of several algorithmic components: 1) a pattern mapping algorithm for solving the subgraph isomorphism problem, 2) an indexing scheme that enables very fast substructure searches on large structure files, 3) the incorporation of that indexing scheme into an Oracle cartridge to enable querying large relational databases through SQL, and 4) a cost estimation scheme that allows the Oracle cost-based optimizer to generate a good execution plan when a substructure search is combined with additional constraints in a single SQL query. The algorithm was tested on a public database comprising nearly 1 million molecules using 4,629 substructure queries, the vast majority of which were submitted by discovery scientists over the last 2.5 years of user acceptance testing of ABCD. 80.7% of these queries were completed in less than a second and 96.8% in less than ten seconds on a single CPU, while on eight processing cores these numbers increased to 93.2% and 99.7%, respectively. The slower queries involved extremely generic patterns that returned the entire database as screening hits and required extensive atom-by-atom verification.
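The indexing scheme above follows the screen-then-verify pattern common to substructure search engines: a cheap fingerprint containment test prunes the database so that only a few candidates need exact atom-by-atom (subgraph isomorphism) verification. The toy feature-set fingerprints below are illustrative, not ABCD's actual index:

```python
def fingerprint(features):
    """Toy structural-feature fingerprint (a real system uses hashed bit vectors)."""
    return frozenset(features)

DB = {
    "aspirin": fingerprint({"ring6", "C=O", "O-H", "ester"}),
    "benzene": fingerprint({"ring6"}),
    "ethanol": fingerprint({"O-H"}),
}

def screen(query_fp, db):
    """A molecule can contain the query substructure only if every query
    feature is present; survivors still need exact atom-by-atom verification,
    since screening is a necessary condition and can pass false positives."""
    return sorted(name for name, fp in db.items() if query_fp <= fp)

hits = screen(fingerprint({"ring6", "O-H"}), DB)
```

The "extremely generic patterns" mentioned in the abstract are exactly the queries whose fingerprints prune almost nothing, forcing verification over most of the database.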


Subject(s)
Algorithms , Drug Discovery , Informatics/methods , Small Molecule Libraries/chemistry , Databases, Factual , Drug Discovery/economics , Informatics/economics , Time Factors
14.
J Alzheimers Dis ; 26(4): 745-53, 2011.
Article in English | MEDLINE | ID: mdl-21694449

ABSTRACT

Hypothetical models of AD progression typically relate clinical stages of AD to sequential changes in CSF biomarkers, imaging, and cognition. However, quantifying the continuous trajectories proposed by these models is difficult, because the dynamics of different biomarkers must be related within a clinical trial that is significantly shorter than the duration of the disease. We seek to show that through proper synchronization, it is possible to de-convolve these trends and quantify the periods of time associated with different pathophysiological changes associated with Alzheimer's disease (AD). We developed a model that replicated the observed progression of ADAS-Cog 13 scores and used this as a more precise estimate of disease duration and thus pathologic stage. We then synchronized cerebrospinal fluid (CSF) and imaging biomarkers according to our new disease timeline. By de-convolving disease progression via ADAS-Cog 13, we were able to confirm the predictions of previous hypothetical models of disease progression as well as establish concrete timelines for different pathobiological events. Specifically, our work supports a sequential pattern of biomarker changes in AD in which reduction in CSF Aβ(42) and brain atrophy precede the increases in CSF tau and phospho-tau.


Subject(s)
Alzheimer Disease/physiopathology , Aged , Aged, 80 and over , Algorithms , Alzheimer Disease/metabolism , Alzheimer Disease/psychology , Amyloid beta-Peptides/cerebrospinal fluid , Apolipoproteins E/genetics , Atrophy , Biomarkers , Cognition , Databases, Factual , Demography , Disease Progression , Female , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Neuropsychological Tests , Peptide Fragments/cerebrospinal fluid , Positron-Emission Tomography , tau Proteins/cerebrospinal fluid
16.
J Chem Inf Model ; 47(6): 1999-2014, 2007.
Article in English | MEDLINE | ID: mdl-17973472

ABSTRACT

We present ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. ABCD is an attempt to bridge multiple continents, data systems, and cultures using modern information technology and to provide scientists with tools that allow them to analyze multifactorial SAR and make informed, data-driven decisions. The system consists of three major components: (1) a data warehouse, which combines data from multiple chemical and pharmacological transactional databases, designed for supreme query performance; (2) a state-of-the-art application suite, which facilitates data upload, retrieval, mining, and reporting, and (3) a workspace, which facilitates collaboration and data sharing by allowing users to share queries, templates, results, and reports across project teams, campuses, and other organizational units. Chemical intelligence, performance, and analytical sophistication lie at the heart of the new system, which was developed entirely in-house. ABCD is used routinely by more than 1000 scientists around the world and is rapidly expanding into other functional areas within the J&J organization.


Subject(s)
Biology , Computational Biology , Computers , Imaging, Three-Dimensional
17.
J Med Chem ; 50(24): 5926-37, 2007 Nov 29.
Article in English | MEDLINE | ID: mdl-17958407

ABSTRACT

We present structure-activity relationship (SAR) maps, a new, intuitive method for visualizing SARs targeted specifically at medicinal chemists. The method renders an R-group decomposition of a chemical series as a rectangular matrix of cells, each representing a unique combination of R-groups and thus a unique compound. Color-coding the cells by chemical property or biological activity allows patterns to be easily identified and exploited. SAR maps allow the medicinal chemist to interactively analyze complicated datasets with multiple R-group dimensions, rapidly correlate substituent structure and biological activity, assess additivity of substituent effects, identify missing analogs and screening data, and create compelling graphical representations for presentation and publication. We believe that this method fills a long-standing gap in the medicinal chemist's toolset for understanding and rationalizing SAR.
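The R-group decomposition underlying a SAR map is essentially a pivot of (R1, R2, activity) records into a matrix, where empty cells reveal missing analogs. A minimal sketch with invented substituents and activities:

```python
def sar_map(records):
    """Pivot (R1, R2, activity) records into a matrix; None marks missing analogs."""
    r1s = sorted({r1 for r1, _, _ in records})
    r2s = sorted({r2 for _, r2, _ in records})
    cell = {(r1, r2): act for r1, r2, act in records}
    return r1s, r2s, [[cell.get((r1, r2)) for r2 in r2s] for r1 in r1s]

# Invented pIC50 values for a toy series with two R-group positions.
rows, cols, matrix = sar_map([
    ("Me", "H", 6.2), ("Me", "Cl", 7.1), ("Et", "H", 5.8),
])
```

Color-coding the resulting cells by activity is what turns this matrix into the visual SAR map the paper describes.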


Subject(s)
Drug Design , Structure-Activity Relationship , CDC2 Protein Kinase/antagonists & inhibitors , Chemistry, Pharmaceutical , Models, Molecular , Molecular Conformation , Piperazines/chemistry , Piperidines/chemistry , Stereoisomerism , Triazoles/chemistry , Vascular Endothelial Growth Factor Receptor-2/antagonists & inhibitors
18.
J Chem Inf Model ; 47(1): 69-75, 2007.
Article in English | MEDLINE | ID: mdl-17238250

ABSTRACT

A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and allocates sufficient screen real estate to each node to allow effective visualization of cluster properties through color-coding. Radial clustergrams combine the most appealing features of other cluster visualization techniques but avoid their pitfalls. Compared to classical dendrograms and hyperbolic trees, they make much more efficient use of space; compared to treemaps, they are more effective in conveying hierarchical structure and displaying properties of nodes higher in the tree. A fisheye lens is used to focus on areas of interest, without losing sight of the global context. The utility of the method is demonstrated using examples from the fields of molecular diversity and conformational analysis.
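The layer-and-wedge geometry of a radial clustergram can be computed by giving each node an angular span proportional to its leaf count, at a radius given by its depth, with children subdividing the parent's wedge so that adjacency (rather than links) encodes the parent-child relationship. A sketch of that layout computation on a toy tree:

```python
def count_leaves(node):
    """node = (label, [children])."""
    return 1 if not node[1] else sum(count_leaves(c) for c in node[1])

def layout(node, start=0.0, span=360.0, depth=0, out=None):
    """Assign each node a wedge (depth, start angle, angular span); children
    split the parent's wedge in proportion to their leaf counts."""
    if out is None:
        out = []
    out.append((node[0], depth, start, span))
    angle = start
    for child in node[1]:
        child_span = span * count_leaves(child) / count_leaves(node)
        layout(child, angle, child_span, depth + 1, out)
        angle += child_span
    return out

tree = ("root", [("A", [("a1", []), ("a2", []), ("a3", [])]), ("B", [("b1", [])])])
wedges = {name: (d, s, sp) for name, d, s, sp in layout(tree)}
```

Rendering these wedges as annular sectors, color-coded by a cluster property, reproduces the space-filling layers the abstract describes.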


Subject(s)
Computer Graphics , Molecular Conformation , Classification/methods , Cluster Analysis
19.
J Biomol Screen ; 11(7): 854-63, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16943390

ABSTRACT

The genomics revolution has unveiled a wealth of poorly characterized proteins. Scientists are often able to produce milligram quantities of proteins for which function is unknown or hypothetical, based only on very distant sequence homology. Broadly applicable tools for functional characterization are essential to the illumination of these orphan proteins. An additional challenge is the direct detection of inhibitors of protein-protein interactions (and allosteric effectors). Both of these research problems are relevant to, among other things, the challenge of finding and validating new protein targets for drug action. Screening collections of small molecules has long been used in the pharmaceutical industry as 1 method of discovering drug leads. Screening in this context typically involves a function-based assay. Given a sufficient quantity of a protein of interest, significant effort may still be required for functional characterization, assay development, and assay configuration for screening. Increasingly, techniques are being reported that facilitate screening for specific ligands for a protein of unknown function. Such techniques also allow for function-independent screening with better characterized proteins. ThermoFluor, a screening instrument based on monitoring ligand effects on temperature-dependent protein unfolding, can be applied when protein function is unknown. This technology has proven useful in the decryption of an essential bacterial enzyme and in the discovery of a series of inhibitors of a cancer-related, protein-protein interaction. The authors review some of the tools relevant to these research problems in drug discovery, and describe our experiences with 2 different proteins.
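The ThermoFluor readout described above can be sketched as follows: the melting temperature Tm is estimated from where the fluorescence-vs-temperature curve rises fastest, and a stabilizing ligand shows up as a positive Tm shift. The sigmoidal curves below are synthetic:

```python
import math

def tm_from_curve(temps, fluor):
    """Tm ~ temperature at the steepest fluorescence increase (max finite diff)."""
    diffs = [(fluor[i + 1] - fluor[i], temps[i]) for i in range(len(temps) - 1)]
    return max(diffs)[1]

temps = list(range(40, 61))  # degrees C
# Synthetic sigmoidal unfolding curves: ligand binding shifts Tm from ~50 to ~54.
apo = [1 / (1 + math.exp(-(t - 50))) for t in temps]
bound = [1 / (1 + math.exp(-(t - 54))) for t in temps]
delta_tm = tm_from_curve(temps, bound) - tm_from_curve(temps, apo)  # stabilization
```

Screening then reduces to computing this Tm shift for each library compound against the apo-protein curve, no knowledge of protein function required.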


Subject(s)
Bacterial Proteins/analysis , Drug Evaluation, Preclinical/instrumentation , Drug Evaluation, Preclinical/methods , Fluorescent Dyes/analysis , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Humans , Ligands , Molecular Sequence Data , Protein Binding
20.
Proteins ; 57(4): 711-24, 2004 Dec 01.
Article in English | MEDLINE | ID: mdl-15476211

ABSTRACT

The problem of assigning a biochemical function to newly discovered proteins has been traditionally approached by expert enzymological analysis, sequence analysis, and structural modeling. In recent years, the appearance of databases containing protein-ligand interaction data for large numbers of protein classes and chemical compounds have provided new ways of investigating proteins for which the biochemical function is not completely understood. In this work, we introduce a method that utilizes ligand-binding data for functional classification of enzymes. The method makes use of the existing Enzyme Commission (EC) classification scheme and the data on interactions of small molecules with enzymes from the BRENDA database. A set of ligands that binds to an enzyme with unknown biochemical function serves as a query to search a protein-ligand interaction database for enzyme classes that are known to interact with a similar set of ligands. These classes provide hypotheses of the query enzyme's function and complement other computational annotations that take advantage of sequence and structural information. Similarity between sets of ligands is computed using point set similarity measures based upon similarity between individual compounds. We present the statistics of classification of the enzymes in the database by a cross-validation procedure and illustrate the application of the method on several examples.
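The point-set similarity described above can be sketched by lifting a per-compound measure (Tanimoto similarity on feature sets here) to sets of ligands, for example by averaging each query ligand's best match against a candidate class. The compound feature sets below are toy stand-ins for real fingerprints, and the averaging rule is one illustrative choice among the point-set measures the paper considers:

```python
def tanimoto(a, b):
    """Tanimoto similarity of two feature sets (stand-ins for fingerprints)."""
    return len(a & b) / len(a | b)

def set_similarity(query_ligands, class_ligands):
    """Average best-match Tanimoto of each query ligand against a class's ligands."""
    return sum(max(tanimoto(q, c) for c in class_ligands)
               for q in query_ligands) / len(query_ligands)

# Toy ligands described by structural features (invented for illustration).
atp = frozenset({"adenine", "ribose", "triphosphate"})
adp = frozenset({"adenine", "ribose", "diphosphate"})
nad = frozenset({"adenine", "ribose", "nicotinamide"})
score = set_similarity([nad], [atp, adp])  # query set vs. a known enzyme class
```

Ranking EC classes by this score against the query enzyme's ligand set yields the function hypotheses the abstract describes.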


Subject(s)
Enzymes/classification , Enzymes/metabolism , 5'-Nucleotidase , Aliivibrio fischeri/enzymology , Ligands , Protein Binding