Results 1 - 20 of 20
1.
Mol Pharm ; 10(11): 4378-90, 2013 Nov 04.
Article in English | MEDLINE | ID: mdl-24094040

ABSTRACT

BCS classification is a vital tool in the development of both generic and innovative drug products. The purpose of this work was to provisionally classify the world's top selling oral drugs according to the BCS, using in silico methods. Three different in silico methods were examined: the well-established group contribution (CLogP) and atom contribution (ALogP) methods, and a new method based solely on the molecular formula and element contribution (KLogP). Metoprolol was used as the benchmark for the low/high permeability class boundary. Solubility was estimated in silico using a thermodynamic equation that relies on the partition coefficient and melting point. The validity of each method was affirmed by comparison to reference data and literature. We then used each method to provisionally classify the orally administered, IR drug products found in the WHO Model list of Essential Medicines, and the top-selling oral drug products in the United States (US), Great Britain (GB), Spain (ES), Israel (IL), Japan (JP), and South Korea (KR). A combined list of 363 drugs was compiled from the various lists, and 257 drugs were classified using the different in silico permeability methods and literature solubility data, as well as BDDCS classification. Lastly, we calculated the solubility values for 185 drugs from the combined set using an in silico approach. Permeability classification with the different in silico methods was correct for 69-72.4% of the 29 reference drugs with known human jejunal permeability, and for 84.6-92.9% of the 14 FDA reference drugs in the set. The correlations (r²) between experimental log P values of 154 drugs and their CLogP, ALogP and KLogP were 0.97, 0.82 and 0.71, respectively. The different in silico permeability methods produced comparable results: 30-34% of the US, GB, ES and IL top selling drugs were class 1, 27-36.4% were class 2, 22-25.5% were class 3, and 5.46-14% were class 4 drugs, while ∼8% could not be classified.
The WHO list included significantly fewer class 1 and more class 3 drugs in comparison to the countries' lists, probably due to differences in commonly used drugs in developing vs industrial countries. BDDCS classified more drugs as class 1 compared to in silico BCS, likely due to the more lax benchmark for metabolism (70%), in comparison to the strict permeability benchmark (metoprolol). For 185 out of the 363 drugs, in silico solubility values were calculated, and successfully matched the literature solubility data. In conclusion, relatively simple in silico methods can be used to estimate both permeability and solubility. While CLogP produced the best correlation to experimental values, even KLogP, the most simplified in silico method that is based on molecular formula with no knowledge of molecular structure, produced comparable BCS classification to the sophisticated methods. This KLogP, when combined with a mean melting point and estimated dose, can be used to provisionally classify potential drugs from just molecular formula, even before synthesis. 49-59% of the world's top-selling drugs are highly soluble (class 1 and class 3), and are therefore candidates for waivers of in vivo bioequivalence studies. For these drugs, the replacement of expensive human studies with affordable in vitro dissolution tests would ensure their bioequivalence, and encourage the development and availability of generic drug products in both industrial and developing countries.
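The provisional classification described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code. The abstract does not name its solubility equation, so the sketch assumes the General Solubility Equation (log S = 0.5 - 0.01(MP - 25) - log P, with S in mol/L) as one equation of that kind. It also assumes the usual BCS convention that a drug is highly soluble when its highest dose dissolves in 250 mL (dose number of at most 1). All function names are hypothetical, and the benchmark log P must be supplied by the caller.

```python
def gse_log_solubility(log_p, melting_point_c):
    """Assumed General Solubility Equation: log10 of molar solubility (mol/L)."""
    # Compounds melting below 25 C get no crystal-lattice penalty.
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def molar_to_mg_per_ml(log_s, mol_weight):
    """Convert log10(mol/L) to mg/mL (mol/L * g/mol = g/L = mg/mL)."""
    return (10.0 ** log_s) * mol_weight


def dose_number(dose_mg, solubility_mg_per_ml, volume_ml=250.0):
    """Dose number: dose concentration in 250 mL divided by solubility."""
    return (dose_mg / volume_ml) / solubility_mg_per_ml


def bcs_class(log_p, benchmark_log_p, dose_mg, solubility_mg_per_ml):
    """Provisional BCS class (1-4) from log P and solubility.

    High permeability: log P at or above the benchmark drug's log P
    (metoprolol in the abstract). High solubility: dose number <= 1.
    """
    high_perm = log_p >= benchmark_log_p
    high_sol = dose_number(dose_mg, solubility_mg_per_ml) <= 1.0
    if high_sol:
        return 1 if high_perm else 3
    return 2 if high_perm else 4


# Illustrative inputs only; these are not data for any real drug.
log_s = gse_log_solubility(log_p=2.0, melting_point_c=125.0)
solubility = molar_to_mg_per_ml(log_s, mol_weight=300.0)
print(bcs_class(2.0, 1.5, dose_mg=100.0, solubility_mg_per_ml=solubility))
```

Passing the benchmark log P as a parameter keeps the sketch independent of which log P method (CLogP, ALogP or KLogP) produced the values, as long as the drug and the benchmark use the same one.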


Subject(s)
Solubility , Israel , Japan , Metoprolol/chemistry , Republic of Korea , Spain , Thermodynamics , United Kingdom , United States
2.
J Biomol NMR ; 46(4): 281-98, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20232231

ABSTRACT

Here we describe a new algorithm for automatically determining the mainchain sequential assignment of NMR spectra for proteins. Using only the customary triple resonance experiments, assignments can be quickly found for not only small proteins having rather complete data, but also for large proteins, even when only half the residues can be assigned. The result of the calculation is not the single best assignment according to some criterion, but rather a large number of satisfactory assignments that are summarized in such a way as to help the user identify portions of the sequence that are assigned with confidence, vs. other portions where the assignment has some correlated alternatives. Thus very imperfect initial data can be used to suggest future experiments.


Subject(s)
Algorithms , Electronic Data Processing , Magnetic Resonance Spectroscopy/methods , Proteins/chemistry , Sequence Analysis, Protein , Amino Acid Sequence , Animals , Computer Simulation , Humans , Models, Chemical , Molecular Sequence Data , Molecular Weight , Peptide Mapping , Sensitivity and Specificity , Ubiquitin/chemistry
3.
J Chem Inf Model ; 49(9): 2013-33, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19702243

ABSTRACT

Among the most important physicochemical properties of small molecules and macromolecules are the dissociation constants for any weakly acidic or basic groups, generally expressed as the pKa of each group. This is a major factor in the pharmacokinetics of drugs and in the interactions of proteins with other molecules. For both the protein and small molecule cases, we survey the sources of experimental pKa values and then focus on current methods for predicting them. Of particular concern is an analysis of the scope, statistical validity, and predictive power of methods as well as their accuracy.


Subject(s)
Chemical Phenomena , Animals , Humans , Hydrogen-Ion Concentration , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Quantum Theory , Static Electricity
4.
Comput Biol Chem ; 33(5): 357-60, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19699687

ABSTRACT

A simple, easily calculated, nonparametric statistic is described that can detect the presence of a functional relationship in bivariate data. Given a sample of data points (x,y), the statistic's value is nearly 1 if y is a linear function of x with little noise; it is greater than 1 if y is a nonlinear function of x; and it is close to 2 if x and y are uniformly and independently distributed. The statistic can be used to rapidly screen through large data sets to identify the most functionally related variable pairs. As an illustration, the statistic is used to detect relations between polypeptide conformational energy and functions of a series expansion for chain conformations.


Subject(s)
Models, Statistical , Peptides/chemistry , Algorithms , Computer Simulation , Protein Conformation , Thermodynamics
5.
J Chem Inf Model ; 48(10): 2042-53, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18826209

ABSTRACT

Realizing favorable absorption, distribution, metabolism, elimination, and toxicity profiles is a necessity due to the high attrition rate of lead compounds in drug development today. The ability to accurately predict bioavailability can help save time and money during the screening and optimization processes. As several robust programs already exist for predicting logP, we have turned our attention to the fast and robust prediction of pKa for small molecules. Using curated data from the Beilstein Database and Lange's Handbook of Chemistry, we have created a decision tree based on a novel set of SMARTS strings that can accurately predict the pKa for monoprotic compounds with R² of 0.94 and root mean squared error of 0.68. Leave-some-out (10%) cross-validation achieved Q² of 0.91 and root mean squared error of 0.80.


Subject(s)
Pharmaceutical Preparations/chemistry , Small Molecule Libraries , Software , Algorithms , Cluster Analysis , Data Interpretation, Statistical , Decision Trees , Drug Evaluation, Preclinical , Forecasting , Kinetics , Models, Molecular , Peptide Mapping , Principal Component Analysis , Reproducibility of Results , Structure-Activity Relationship , Subject Headings
6.
J Chem Inf Model ; 48(7): 1379-88, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18588283

ABSTRACT

Elimination of cytotoxic compounds in the early and later stages of drug discovery can help reduce the costs of research and development. Through the application of principal components analysis (PCA), we were able to data mine and prove that approximately 89% of the total log GI50 variance is due to the nonspecific cytotoxic nature of substances. Furthermore, PCA led to the identification of groups of structurally unrelated substances showing very specific toxicity profiles, such as a set of 45 substances toxic only to the Leukemia_SR cancer cell line. In an effort to predict nonspecific cytotoxicity on the basis of the mean log GI50, we created a decision tree using MACCS keys that can correctly classify over 83% of the substances as cytotoxic/noncytotoxic in silico, on the basis of the cutoff of mean log GI50 = -5.0. Finally, we have established a linear model using least-squares in which nine of the 59 available NCI60 cancer cell lines can be used to predict the mean log GI50. The model has R² = 0.99 and a root-mean-square deviation between the observed and calculated mean log GI50 (RMSE) = 0.09. Our predictive models can be applied to flag generally cytotoxic molecules in virtual and real chemical libraries, thus saving time and effort.


Subject(s)
Drug Screening Assays, Antitumor , Information Storage and Retrieval , Cell Line, Tumor , Decision Trees , Humans
7.
J Chem Inf Model ; 47(6): 2063-76, 2007.
Article in English | MEDLINE | ID: mdl-17915856

ABSTRACT

The NCI Developmental Therapeutics Program Human Tumor cell line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resource particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper we describe a formal knowledge discovery approach to characterizing and data mining this set and report the results of some of our initial experiments in mining the set from a chemoinformatics perspective.


Subject(s)
Databases, Factual , Antineoplastic Agents/chemistry , Antineoplastic Agents/pharmacology , Cell Line, Tumor , Cell Proliferation/drug effects , Drug Evaluation, Preclinical , Gene Expression Regulation, Neoplastic/drug effects , Humans , Molecular Structure , National Cancer Institute (U.S.) , Neoplasms/pathology , United States
8.
J Comput Biol ; 13(9): 1565-73, 2006 Nov.
Article in English | MEDLINE | ID: mdl-17147479

ABSTRACT

Recently, we developed a pairwise structural alignment algorithm using realistic structural and environmental information (SAUCE). In this paper, we first present an automatic hierarchical fold classification based on SAUCE alignments. This classification enables us to build a fold tree containing different levels of multiple structural profiles. Then a tree-based fold search algorithm is described. We applied this method to a group of structures with sequence identity less than 35% and performed a series of leave-one-out tests. These tests are approximately comparable to fold recognition tests at the superfamily level. Results show that fold recognition via a fold tree can be faster and better at detecting distant homologues than classic fold recognition methods.


Subject(s)
Models, Statistical , Protein Folding , Proteins/chemistry , Sequence Alignment/statistics & numerical data , Algorithms , Artificial Intelligence , Biometry , Databases, Protein , Models, Molecular , Proteins/genetics , Proteomics/statistics & numerical data
9.
Bioinformatics ; 22(17): 2087-93, 2006 Sep 01.
Article in English | MEDLINE | ID: mdl-16809393

ABSTRACT

MOTIVATION: Multiple STructural Alignment (MSTA) provides valuable information for solving problems such as fold recognition. The consistency-based approach tries to find conflict-free subsets of alignments from a pre-computed all-to-all Pairwise Alignment Library (PAL). If large proportions of conflicts exist in the library, consistency can be hard to get. On the other hand, multiple structural superposition has been used in many MSTA methods to refine alignments. However, multiple structural superposition is dependent on alignments, and a superposition generated based on erroneous alignments is not guaranteed to be the optimal superposition. Correcting errors after making errors is not as good as avoiding errors from the beginning. Hence it is important to refine the pairwise library to reduce the number of conflicts before any consistency-based assembly. RESULTS: We present an algorithm, Iterative Refinement of Induced Structural alignment (IRIS), to refine the PAL. A new measurement for the consistency of a library is also proposed. Experiments show that our algorithm can greatly improve T-COFFEE performance for less consistent pairwise alignment libraries. The final multiple alignment outperforms most state-of-the-art MSTA algorithms at assembling 15 transglycosidases. Results on three other benchmarks showed that the algorithm consistently improves multiple alignment performance. AVAILABILITY: The C++ code of the algorithm is available upon request.


Subject(s)
Algorithms , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Artificial Intelligence , Molecular Sequence Data , Pattern Recognition, Automated/methods , Proteins/classification
10.
J Biomol NMR ; 33(4): 261-79, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16341754

ABSTRACT

Rapid analysis of protein structure, interaction, and dynamics requires fast and automated assignments of 3D protein backbone triple-resonance NMR spectra. We introduce a new depth-first ordered tree search method of automated assignment, CASA, which uses hand-edited peak-pick lists of a flexible number of triple resonance experiments. The computer program was tested on 13 artificially simulated peak lists for proteins up to 723 residues, as well as on the experimental data for four proteins. Under reasonable tolerances, it generated assignments that correspond to the ones reported in the literature within a few minutes of CPU time. The program was also tested on the proteins analyzed by other methods, with both simulated and experimental peaklists, and it could generate good assignments in all relevant cases. The robustness was further tested under various situations.
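The idea of a depth-first search over sequential links between spin systems can be illustrated with a toy example. This is a sketch under loud assumptions, not the CASA algorithm: the abstract does not describe CASA's ordering heuristics or scoring. Here each spin system is reduced to a hypothetical pair (own CA shift, preceding-residue CA shift), a single fixed tolerance is used, and the search simply returns the longest chain it can link.

```python
def longest_chain(spin_systems, tol=0.2):
    """Depth-first search for the longest sequentially linked chain.

    spin_systems: list of (own_ca, prev_ca) chemical shifts in ppm
    (a hypothetical, simplified data model). System b may follow system a
    when b's preceding-residue shift matches a's own shift within tol.
    Returns a list of indices into spin_systems, in sequence order.
    """
    n = len(spin_systems)
    successors = {
        a: [
            b for b in range(n)
            if b != a and abs(spin_systems[b][1] - spin_systems[a][0]) <= tol
        ]
        for a in range(n)
    }
    best = []

    def dfs(path, used):
        nonlocal best
        if len(path) > len(best):
            best = path[:]
        for b in successors[path[-1]]:
            if b not in used:
                used.add(b)
                path.append(b)
                dfs(path, used)
                # Backtrack so that alternative branches can reuse b.
                path.pop()
                used.remove(b)

    for start in range(n):
        dfs([start], {start})
    return best


# Illustrative shifts only; not taken from any measured spectrum.
systems = [(58.0, 55.1), (62.3, 58.1), (55.0, 45.0), (50.0, 62.2)]
print(longest_chain(systems))
```

A practical program would match several nuclei at once and compare chains against the known amino acid sequence; the sketch omits both to keep the backtracking structure visible.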


Subject(s)
Algorithms , Nuclear Magnetic Resonance, Biomolecular/methods , Proteins/chemistry , Amino Acid Sequence , Computer Simulation , Electron Spin Resonance Spectroscopy , Molecular Sequence Data , Software
11.
Protein Sci ; 14(12): 2935-46, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16260755

ABSTRACT

In the era of structural genomics, it is necessary to generate accurate structural alignments in order to build good templates for homology modeling. Although a great number of structural alignment algorithms have been developed, most of them ignore intermolecular interactions during the alignment procedure. Therefore, structures in different oligomeric states are barely distinguishable, and it is very challenging to find the correct alignment in coil regions. Here we present a novel approach to structural alignment using a clique finding algorithm and environmental information (SAUCE). In this approach, we build the alignment based on not only structural coordinate information but also realistic environmental information extracted from biological unit files provided by the Protein Data Bank (PDB). First, we eliminate all environmentally unfavorable pairings of residues. Then we identify alignments in core regions via a maximal clique finding algorithm. Two extreme value distribution (EVD) form statistics have been developed to evaluate core region alignments. With an optional extension step, global alignment can be derived based on environment-based dynamic programming linking. We show that our method is able to differentiate three-dimensional structures in different oligomeric states, and is able to find flexible alignments between multidomain structures without predetermined hinge regions. The overall performance is also evaluated on a large scale by comparisons to current structural classification databases as well as to other alignment methods.
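Clique-based structural alignment in general can be sketched as follows; this is not the SAUCE implementation, whose details the abstract does not give. Each node of a correspondence graph pairs residue i of structure A with residue j of structure B. Two nodes are joined when the intra-structure distances agree within a tolerance. The largest clique is then the largest mutually consistent set of pairings. The `compatible` predicate is a hypothetical stand-in for the environmental filter; the tolerance and the Bron-Kerbosch search with pivoting are likewise assumptions.

```python
import math
from itertools import product


def correspondence_graph(coords_a, coords_b, tol=1.0, compatible=None):
    """Build the graph of residue pairings with consistent distances."""
    nodes = [
        (i, j)
        for i, j in product(range(len(coords_a)), range(len(coords_b)))
        if compatible is None or compatible(i, j)  # hypothetical filter
    ]
    adj = {node: set() for node in nodes}
    for idx, (i, j) in enumerate(nodes):
        for (k, l) in nodes[idx + 1:]:
            if i == k or j == l:
                continue  # a residue may be paired only once
            d_a = math.dist(coords_a[i], coords_a[k])
            d_b = math.dist(coords_b[j], coords_b[l])
            if abs(d_a - d_b) <= tol:
                adj[(i, j)].add((k, l))
                adj[(k, l)].add((i, j))
    return adj


def largest_clique(adj):
    """Bron-Kerbosch with pivoting; returns the largest clique, sorted."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand([], set(adj), set())
    return sorted(best)


# Toy coordinates: B is A translated, plus one unrelated distant point.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
b = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 7.0, 5.0), (50.0, 50.0, 50.0)]
print(largest_clique(correspondence_graph(a, b, tol=0.1)))
```

Removing unfavorable pairings before the clique search shrinks the graph, which matters because maximum clique search is exponential in the worst case.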


Subject(s)
Computational Biology/methods , Proteins/chemistry , Structural Homology, Protein , Algorithms , Databases, Protein , Models, Molecular , Protein Structure, Tertiary
12.
Proteins ; 60(1): 82-9, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15861390

ABSTRACT

Cluster distance geometry is a recent generalization of distance geometry whereby protein structures can be described at even lower levels of detail than one point per residue. With improvements in the clustering technique, protein conformations can be summarized in terms of alternative contact patterns between clusters, where each cluster contains four sequentially adjacent amino acid residues. A very simple potential function involving 210 adjustable parameters can be determined that favors the native contacts of 31 small, monomeric proteins over their respective sets of nonnative contacts. This potential then favors the native contacts for 174 small, monomeric proteins that have low sequence identity with any of the training set. A broader search finds 698 small protein chains from the Protein Data Bank where the native contacts are preferred over all alternatives, even though they have low sequence identity with the training set. This amounts to a highly predictive method for ab initio protein folding at low spatial resolution.


Subject(s)
Models, Statistical , Protein Folding , Proteins/chemistry , Algorithms , Animals , Humans
13.
Biopolymers ; 75(3): 278-89, 2004 Oct 15.
Article in English | MEDLINE | ID: mdl-15378485

ABSTRACT

This is our second type of model for protein folding where the configurational parameters and the effective potential energy function are chosen in such a way that all conformations are described and the canonical partition function can be evaluated analytically. Structure is described in terms of distances between pairs of sequentially contiguous blocks of eight residues, and all possible conformations are grouped into 71 subsets in terms of bounds on these distances. The energy is taken to be a sum of pairwise interactions between such blocks. The 210 energy parameters were adjusted so that the native folds of 32 small proteins are favored in free energy over the denatured state. We then found 146 proteins having negligible sequence similarity to any of the training proteins, yet the free energies of their respective correct native states were favored over the denatured state.


Subject(s)
Models, Theoretical , Protein Folding , Proteins/chemistry , Statistics as Topic/methods , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Cluster Analysis , Peptides/chemistry , Protein Conformation , Protein Denaturation , Thermodynamics
14.
J Comput Chem ; 25(10): 1305-12, 2004 Jul 30.
Article in English | MEDLINE | ID: mdl-15139043

ABSTRACT

Distance geometry has been a broadly useful tool for dealing with conformational calculations. Customarily each atom is represented as a point, constraints on the distances between some atoms are obtained from experimental or theoretical sources, and then a random sampling of conformations can be calculated that are consistent with the constraints. Although these methods can be applied to small proteins having on the order of 1000 atoms, for some purposes it is advantageous to view the problem at lower resolution. Here distance geometry is generalized to deal with distances between sets of points. In the end, many of the same techniques produce a sampling of different configurations of these sets of points subject to distance constraints, but now the radii of gyration of the different sets play an important role. A simple example is given of how the packing constraints for polypeptide chains combine with loose distance constraints to give good calculated protein conformers at a very low resolution.
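One customary step in point-based distance geometry is triangle-inequality bound smoothing, which tightens sparse distance constraints before conformations are sampled. The sketch below shows that standard step only. It is not the cluster generalization introduced in this abstract, and the function name and matrix layout are assumptions.

```python
def smooth_bounds(lower, upper):
    """Tighten symmetric lower/upper distance-bound matrices.

    Upper bounds obey u[i][j] <= u[i][k] + u[k][j]; lower bounds obey
    l[i][j] >= l[i][k] - u[k][j]. Returns new (lower, upper) matrices and
    raises ValueError if the constraints are mutually inconsistent.
    """
    n = len(upper)
    u = [row[:] for row in upper]
    l = [row[:] for row in lower]

    # Upper bounds: all-pairs shortest paths (Floyd-Warshall).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]

    # Lower bounds: apply the inverse triangle inequality until stable.
    changed = True
    while changed:
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    candidate = l[i][k] - u[k][j]
                    if candidate > l[i][j]:
                        l[i][j] = l[j][i] = candidate
                        changed = True

    for i in range(n):
        for j in range(n):
            if l[i][j] > u[i][j]:
                raise ValueError("inconsistent distance bounds")
    return l, u


# Three points: 0-1 fixed at 5.0, 1-2 fixed at 1.0, 0-2 unconstrained.
lower = [[0.0, 5.0, 0.0], [5.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
upper = [[0.0, 5.0, 100.0], [5.0, 0.0, 1.0], [100.0, 1.0, 0.0]]
new_lower, new_upper = smooth_bounds(lower, upper)
print(new_lower[0][2], new_upper[0][2])
```

In the example the unconstrained 0-2 distance is narrowed from the range 0 to 100 down to the range 4 to 6, purely from the two known distances.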


Subject(s)
Algorithms , Peptides/chemistry , Protein Conformation , Cluster Analysis
15.
Biopolymers ; 74(3): 214-20, 2004 Jun 15.
Article in English | MEDLINE | ID: mdl-15150796

ABSTRACT

We have initiated an entirely new approach to statistical mechanical models of strongly interacting systems where the configurational parameters and the potential energy function are both constructed so that the canonical partition function can be evaluated analytically. For a simplified model of proteins consisting of a single, fairly short polypeptide chain without cross-links, we can adjust the energy parameters to favor the experimentally determined native state of seven proteins having diverse types of folds. Then 497 test proteins are predicted to have stable native folds, even though they are also structurally diverse, and 480 of them have no significant sequence similarity to any of the training proteins.


Subject(s)
Models, Theoretical , Protein Folding , Peptides/genetics , Peptides/metabolism , Proteins/chemistry , Proteins/metabolism , Thermodynamics
16.
Methods Mol Biol ; 275: 427-38, 2004.
Article in English | MEDLINE | ID: mdl-15141124

ABSTRACT

Given atomic coordinates for a particular conformation of a molecule and some property value assigned to each atom, one can easily calculate a chirality function that distinguishes enantiomers, is zero for an achiral molecule, and is a continuous function of the coordinates and properties. This is useful as a quantitative measure of chirality for molecular modeling and structure-activity relations.


Subject(s)
Molecular Conformation , Quantitative Structure-Activity Relationship , Stereoisomerism
17.
Mol Pharm ; 1(6): 434-46, 2004.
Article in English | MEDLINE | ID: mdl-16028355

ABSTRACT

Biphenyl hydrolase-like (BPHL) protein is a novel serine hydrolase which has been identified as human valacyclovirase (VACVase), catalyzing the hydrolytic activation of valine ester prodrugs of the antiviral drugs acyclovir and ganciclovir as well as other amino acid ester prodrugs of therapeutic nucleoside analogues. The broad specificity for nucleoside analogues as parent drugs suggests that BPHL may be particularly useful as a molecular target for prodrug activation. In order to develop an initial structural view of the specificity of BPHL, a homology model of BPHL based on the crystal structure of 2-hydroxy-6-oxo-7-methylocta-2,4-dienoate hydrolase was developed using the Molecular Operating Environment package (Chemical Computing Group, Montreal, Quebec), evaluated for its stereochemical quality and identification of free cysteines, and used in a molecular docking study. The BPHL model has residues S122, H255, and D227 comprising the putative catalytic triad in proximity and potential charge-charge interaction sites, M52 or D123 for the alpha-amino group. The model also suggested that the structural preference of BPHL for hydrophobic amino acyl promoieties and its limited activity for the secondary alcohol substrates may be attributed to the hydrophobic acyl-binding site formed by residues I158, G161, I162, and L229, and the spatial constraint around the catalytic site by a loop on one side, the active serine and histidine on the other side, and L53 and L179 on top. In addition, the broad specificity for nucleoside analogues may be due to the relatively less constrained nucleoside-binding site opening toward the entrance of the substrate-binding pocket. The homology model of BPHL provides a basis for further investigation of the catalytic and active site residues, can account for the observed structure activity profile of BPHL, and will be useful in the design of nucleoside prodrugs.


Subject(s)
Acyclovir/analogs & derivatives , Carboxylic Ester Hydrolases/chemistry , Protein Structure, Secondary , Valine/analogs & derivatives , Acyclovir/metabolism , Amino Acid Sequence , Carboxylic Ester Hydrolases/drug effects , Humans , Ligands , Models, Molecular , Molecular Sequence Data , Molecular Structure , Prodrugs/chemistry , Prodrugs/pharmacology , Sensitivity and Specificity , Sequence Alignment , Structure-Activity Relationship , Valacyclovir , Valine/metabolism
18.
J Chem Inf Comput Sci ; 43(2): 629-36, 2003.
Article in English | MEDLINE | ID: mdl-12653531

ABSTRACT

Adequate conformational searching of small molecules and inclusion of a chirality identifier are necessary features of any current technique for quantitative structure-activity relationships (QSAR). However, implementation of these features can be difficult and computationally expensive, and some techniques can still lead to insufficient treatment of molecular conformation. We select the standard systematic conformational search as the default search method for our recent 3D QSAR program, DAPPER, and develop a novel chirality metric for use in QSAR. These techniques are implemented in DAPPER and validated on standard data sets.


Subject(s)
Quantitative Structure-Activity Relationship , Software , Enzyme Inhibitors/chemistry , Enzyme Inhibitors/pharmacology , Inhibitory Concentration 50 , Molecular Conformation , Reproducibility of Results , Stereoisomerism
19.
J Mol Graph Model ; 21(3): 161-70, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12463634

ABSTRACT

A novel set of molecular descriptors suitable for use in quantitative structure-activity relationships and related methods is described. These descriptors are a smooth and interpretable representation of atomic physicochemical property values and intramolecular atom pair distances. Distance atomic physicochemical parameter energy relationships (DAPPER), a novel quantitative structure-activity relationship (QSAR) method using these descriptors, is validated on standard datasets.


Subject(s)
Chemistry, Physical , Quantitative Structure-Activity Relationship , Steroids/metabolism , Transcortin/metabolism , Chemical Phenomena , Computer Simulation , Drug Design , Ligands , Models, Chemical , Models, Molecular , Models, Theoretical , Molecular Structure , Protein Binding , Steroids/chemistry , Steroids/pharmacology , Transcortin/chemistry , Transcortin/pharmacology
20.
BMC Struct Biol ; 2: 4, 2002 Aug 06.
Article in English | MEDLINE | ID: mdl-12165098

ABSTRACT

BACKGROUND: We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the current potential and using a physically meaningful objective function (derivative of energy with respect to torsion angles at the native conformation) during the quadratic programming to place the native conformation near a local minimum. RESULTS: We also compare the performance of three different types of energy functions and find that while the pairwise energy function is trainable, a solvation energy function by itself is untrainable if decoys are generated by minimizing the current potential starting at the native conformation. The best results are obtained when a pairwise interaction energy function is used with solvation energy function. CONCLUSIONS: We are able to train a potential function using six proteins which places a total of 42 native conformations within approximately 4 Å rmsd and 71 native conformations within approximately 6 Å rmsd of a local minimum out of a total of 91 proteins. Furthermore, the threading test using the same 91 proteins ranks 89 native conformations to be first and the other two as second.


Subject(s)
Models, Molecular , Protein Conformation , Molecular Structure , Protein Folding , Proteins/chemistry , Solvents/chemistry