Search | VHL Regional Portal

Parameterization and conformational sampling effects in pharmacophore multiplet searching.

Fox, Peter C; Wolohan, Philippa R N; Abrahamian, Edmond; Clark, Robert D.

J Chem Inf Model; 48(12): 2326-34, 2008 Dec.

Article in English | MEDLINE | ID: mdl-19053520

ABSTRACT

Pharmacophore patterns in ligands can be effectively characterized in terms of their constituent pharmacophore multiplets. Bitsets (fingerprints) encoding which particular multiplets are found in a given ligand have been and continue to be used as molecular descriptors in a range of molecular modeling applications, from ligand alignment and diversity analysis to pharmacophore-based flexible searching. Being able to create, store, and manipulate multiplets in compressed form - as bitmaps - has made it possible to integrate them into high-throughput technologies. A number of key parameters affect how well multiplets perform, including the granularity of edge length binning; how different multiplets are weighted in creating hypotheses from multiple ligands; and the number of bits that should be included in a pharmacophore hypothesis. The similarity metric employed for bitmap comparisons also affects search performance, as does the conformational sampling regime used for characterizing flexible molecules. In this report we explore the effect of parameter variation on within- and between-class similarity across seven different pharmacological classes and introduce a new measure of molecular similarity - the asymmetric stochastic cosine - uniquely suited to searching a database for matches to query hypotheses deduced from multiple ligands. Surprisingly, it turns out that the most discriminating bitmaps are obtained using relatively few conformers. The extreme discrimination power seen for single conformers, however, seems to reflect consistent effects of 2D connectivity on the 3D structure obtained. Conformational sampling by systematic search reinforces such circumstantial discrimination and should be avoided. The potential for systematic bias becomes clear when the behavior of otherwise similar conformational ensembles created by local energy minimization or by random sampling is considered. 
Consolidating information from multiple known actives or establishing single "bioactive" conformations a priori are safer ways to improve discrimination in pharmacophoric multiplet searching.
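The abstract names hypothesis size and multiplet weighting as key parameters when a query is built from several ligands. A minimal sketch of that idea follows, assuming each ligand fingerprint is a set of on-bit indices and that a bit's score is the number of ligands setting it times an optional weight. The function `build_hypothesis`, its parameters, and the scoring rule are hypothetical illustrations, not the authors' procedure.

```python
from collections import Counter


def build_hypothesis(ligand_bits, n_bits, weights=None):
    """Keep the n_bits highest-scoring bits across several ligand fingerprints.

    ligand_bits: list of sets of on-bit indices, one set per ligand.
    weights: optional dict mapping bit index -> weight (assumed default 1.0).
    """
    weights = weights or {}
    # Count how many ligands set each bit.
    counts = Counter(bit for bits in ligand_bits for bit in bits)
    # Score = ligand count times the (assumed) per-bit weight.
    score = {bit: n * weights.get(bit, 1.0) for bit, n in counts.items()}
    # Rank by descending score; break ties by bit index for determinism.
    ranked = sorted(score, key=lambda bit: (-score[bit], bit))
    return set(ranked[:n_bits])


ligands = [{3, 8, 21}, {3, 8, 40}, {3, 21, 57}]
unweighted = build_hypothesis(ligands, n_bits=2)
weighted = build_hypothesis(ligands, n_bits=2, weights={21: 2.0})
```

Raising `n_bits` admits bits shared by fewer ligands, which is one way to see why the abstract treats hypothesis size as a parameter that affects discrimination.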

Subject(s)

Drug Design; Binding Sites; Computer Simulation; Databases, Factual; Ligands; Molecular Conformation; Molecular Structure

Statistical variation in progressive scrambling.

Clark, Robert D; Fox, Peter C.

J Comput Aided Mol Des; 18(7-9): 563-76, 2004.

Article in English | MEDLINE | ID: mdl-15729855

ABSTRACT

The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, nonparametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity ($Q_s^{*2}$) and standard error of prediction ($\mathrm{SDEP}_s^{*}$) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values ($Q_0^{*2}$ and $\mathrm{SDEP}_0^{*}$) and the sensitivity to perturbation ($dq^2/dr_{yy'}^2$). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
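The abstract describes perturbing the responses in graded amounts and tracking how far the perturbed responses drift from the originals. The sketch below is only a rough illustration of that idea: it assumes the perturbation is a shuffle within blocks of rank-adjacent responses, with larger blocks giving stronger perturbation, and it measures the squared correlation between original and scrambled responses. The blocking scheme and the names `block_scramble` and `r_squared` are assumptions, not the published algorithm, and the step of refitting a PLS model to each scrambled response is omitted.

```python
import random


def block_scramble(responses, block_size, rng):
    """Shuffle response values only within blocks of rank-adjacent observations."""
    # Indices of the observations, ordered by response value.
    order = sorted(range(len(responses)), key=lambda i: responses[i])
    scrambled = list(responses)
    for start in range(0, len(order), block_size):
        block = order[start:start + block_size]
        values = [responses[i] for i in block]
        rng.shuffle(values)  # permute values inside this block only
        for i, v in zip(block, values):
            scrambled[i] = v
    return scrambled


def r_squared(y, y_perturbed):
    """Squared Pearson correlation between original and perturbed responses."""
    n = len(y)
    my, mp = sum(y) / n, sum(y_perturbed) / n
    sxy = sum((a - my) * (b - mp) for a, b in zip(y, y_perturbed))
    sxx = sum((a - my) ** 2 for a in y)
    syy = sum((b - mp) ** 2 for b in y_perturbed)
    return sxy * sxy / (sxx * syy)


rng = random.Random(0)
y = [5.1, 6.3, 4.8, 7.9, 6.0, 5.5, 8.2, 7.1]
for block_size in (1, 2, 4, 8):
    y_s = block_scramble(y, block_size, rng)
    print(block_size, round(r_squared(y, y_s), 3))
```

With a block size of 1 nothing moves, so the squared correlation is 1; a single block spanning the whole data set amounts to full response randomization.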

Subject(s)

Data Interpretation, Statistical; Quantitative Structure-Activity Relationship; Least-Squares Analysis

Efficient generation, storage, and manipulation of fully flexible pharmacophore multiplets and their use in 3-D similarity searching.

Abrahamian, Edmond; Fox, Peter C; Naerum, Lars; Christensen, Inge Thøger; Thøgersen, Henning; Clark, Robert D.

J Chem Inf Comput Sci; 43(2): 458-68, 2003.

Article in English | MEDLINE | ID: mdl-12653509

ABSTRACT

Pharmacophore triplets and quartets have been used by many groups in recent years, primarily as a tool for molecular diversity analysis. In most cases, slow processing speeds and the very large size of the bitsets generated have forced researchers to compromise in terms of how such multiplets were stored, manipulated, and compared, e.g., by using simple unions to represent multiplets for sets of molecules. Here we report using bitmaps in place of bitsets to reduce storage demands and to improve processing speed. A bitset is taken here to mean a fully enumerated string of zeros and ones, from which a compressed bitmap is obtained by replacing uniform blocks ("runs") of digits in the bitset with a pair of values identifying the content and length of the block (run-length encoding compression). High-resolution multiplets involving four features are enabled by using 64-bit executables to create and manipulate bitmaps, which "connect" to the 32-bit executables used for database access and feature identification via an Extensible Markup Language (XML) data stream. The encoding system used supports simple pairs, triplets, and quartets; multiplets in which a privileged substructure is used as an anchor point; and augmented multiplets in which an additional vertex is added to represent a contingent feature such as a hydrogen bond extension point linked to a complementary feature (e.g., a donor or an acceptor atom) in a base pair or triplet. It can readily be extended to larger, more complex multiplets as well. Database searching is one particular potential application for this technology. Consensus bitmaps built up from active ligands identified in preliminary screening can be used to generate hypothesis bitmaps, a process which includes allowance for differential weighting to allow greater emphasis to be placed on bits arising from multiplets expected to be particularly discriminating.
Such hypothesis bitmaps are shown to be useful queries for database searching, successfully retrieving active compounds across a range of structural classes from a corporate database. The current implementation allows multiconformer bitmaps to be obtained from pregenerated conformations or by random perturbation on the fly. The latter application involves random sampling of the full range of conformations not precluded by steric clashes, which limits the usefulness of classical fingerprint similarity measures. A new measure of similarity, the stochastic cosine, is introduced here to address this need. This new similarity measure uses the average number of bits common to independently drawn conformer sets to normalize the cosine coefficient. Its use frees the user from having to ensure strict comparability of starting conformations and having to use fixed torsional increments, thereby allowing fully flexible characterization of pharmacophoric patterns.
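Two ideas from this abstract lend themselves to a short sketch: run-length encoding of a bitset, and a cosine coefficient normalized by the bits that independent conformer samplings of the same molecule have in common. The code below is one plausible reading, not the paper's implementation. It assumes the numerator is the mean overlap between draws of the two molecules and the denominator is the geometric mean of each molecule's mean overlap between its own independent draws. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def run_length_encode(bits):
    """Compress a string of '0'/'1' digits into (digit, run_length) pairs."""
    runs = []
    for digit in bits:
        if runs and runs[-1][0] == digit:
            runs[-1] = (digit, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((digit, 1))  # start a new run
    return runs


def run_length_decode(runs):
    """Expand (digit, run_length) pairs back into the full bit string."""
    return "".join(digit * length for digit, length in runs)


def mean_self_overlap(draws):
    """Average number of bits shared by pairs of independent draws of one molecule."""
    return mean(len(a & b) for a, b in combinations(draws, 2))


def stochastic_cosine(draws_a, draws_b):
    """Cosine-like similarity normalized by self-overlap (assumed form).

    draws_a, draws_b: lists of at least two sets of on-bit indices, each set
    coming from an independently drawn conformer ensemble of one molecule.
    """
    common = mean(len(a & b) for a in draws_a for b in draws_b)
    denom = sqrt(mean_self_overlap(draws_a) * mean_self_overlap(draws_b))
    return common / denom if denom else 0.0


encoded = run_length_encode("0001100000")
draws_a = [{1, 2, 3}, {2, 3, 4}]
draws_b = [{2, 3, 5}, {2, 3, 6}]
similarity = stochastic_cosine(draws_a, draws_b)
```

In this toy example the two molecules share exactly the bits that each reproduces across its own draws, so the normalized score is 1.0, whereas the plain cosine coefficient on the first draws alone would be 2/3. This reflects the abstract's point that classical measures are penalized by sampling noise.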

Subject(s)

Information Storage and Retrieval; Pharmacology/methods; Quantitative Structure-Activity Relationship; Adrenergic beta-Antagonists/chemistry; Adrenergic beta-Antagonists/pharmacology; Anti-Arrhythmia Agents/chemistry; Anti-Arrhythmia Agents/pharmacology; Benzamides/chemistry; Benzamides/pharmacology; Molecular Conformation; Phenothiazines/chemistry; Phenothiazines/pharmacology; Receptors, Estrogen/antagonists & inhibitors; Software; Stochastic Processes
