Results 1 - 4 of 4
1.
Mol Inform ; 38(8-9): e1800164, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31322827

ABSTRACT

In this paper we used two sets of calculated molecular descriptors to predict blood-brain barrier (BBB) entry for a collection of 415 chemicals. The first set, of 579 descriptors, was calculated with Schrodinger and TopoCluj software. Polly and Triplet software were used to calculate the second set of 198 descriptors. Modelling and a two-deep, repeated external validation method were then used for QSAR formulation. Results show that both sets of descriptors, individually and in combination, give models of reasonable prediction accuracy. We also demonstrate the effectiveness of a variable selection approach by showing that, for one of our descriptor sets, the top 5% of predictors ranked by random forest variable importance yield a better-performing model than the model with all predictors. The most influential descriptors indicate important molecular structural features that govern BBB entry of chemicals.
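The variable selection step described in this abstract can be illustrated with a minimal sketch. It assumes that importance scores have already been produced by some random forest implementation; the function name `top_fraction`, the descriptor names, and the scores are hypothetical and are not taken from the paper.

```python
import math


def top_fraction(importances, fraction=0.05):
    """Return the names of the top `fraction` of predictors by importance.

    `importances` maps a predictor name to its importance score
    (assumed to come from a previously fitted random forest).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one predictor; round up so small sets are not emptied.
    k = max(1, math.ceil(fraction * len(importances)))
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


# Hypothetical scores for 40 descriptors named d0..d39 (d0 is most important).
scores = {f"d{i}": 1.0 / (i + 1) for i in range(40)}
selected = top_fraction(scores)
```

A reduced model would then be refitted on the selected descriptors only and compared with the full model under the same validation scheme.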


Subject(s)
Blood-Brain Barrier/metabolism , Machine Learning , Organic Chemicals/chemistry , Organic Chemicals/pharmacokinetics , Algorithms , Models, Molecular , Quantitative Structure-Activity Relationship , Software
2.
Curr Comput Aided Drug Des ; 12(3): 216-228, 2016.
Article in English | MEDLINE | ID: mdl-27222032

ABSTRACT

A large number of alignment-free techniques for the graphical representation and numerical characterization (GRANCH) of bio-molecular sequences have been proposed in recent years, but the relative efficacy of these methods in determining the degree of similarity and dissimilarity of such sequences has not been ascertained. OBJECTIVE: Our objective is to assess the relative efficacy of these methods in determining the degree of similarity and dissimilarity of bio-molecular sequences. METHOD: We chose 7 published/communicated methods that represent various classes of GRANCH techniques and computed the descriptors that are expected to characterize similarities and dissimilarities in several sets of gene sequences. We critically appraise the different methods and determine which of them yield non-redundant structural information that could be used to compute different properties of the sequences, and which are correlated strongly enough with one another that using the simplest representative of the group would suffice. We also perform a principal component analysis (PCA) to determine how the variances in the calculated sequence descriptors are explained by the computed principal components (PCs). RESULTS: We found that some of the descriptors are strongly correlated, implying a commonality of structural information encoded by them, while others are distinctly separate. The PCA results show that the first three PCs explain >97% of the variances. CONCLUSION: We found that some mathematical DNA descriptors calculated by a few of these techniques correlate strongly with one another, implying a redundancy in the structural information quantified by those descriptors; others are not strongly correlated with one another, suggesting that they encode non-redundant sequence information. From this and our PCA results, we recommend using a minimally correlated set of descriptors, or orthogonal descriptors such as PCs derived from the descriptor set, for the characterization of nucleic acid structure and function.
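The recommendation to work with a minimally correlated set of descriptors can be sketched as a greedy filter on pairwise Pearson correlation. This is an illustrative assumption, not the procedure used in the paper: the 0.95 threshold, the function names, and the toy descriptor values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prune_correlated(descriptors, threshold=0.95):
    """Greedily keep descriptors whose |r| with every kept one is below threshold."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson(values, descriptors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptor values computed for five sequences.
descriptors = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 4.0, 6.2, 7.9, 10.1],  # nearly 2 * A, so redundant with A
    "C": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly correlated with A
}
kept = prune_correlated(descriptors)
```

The result depends on the order in which descriptors are visited, so placing the simplest descriptor of each family first makes it the retained representative.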


Subject(s)
DNA/genetics , RNA/genetics , Animals , Base Sequence , DNA/chemistry , Data Display , Exons , Humans , Principal Component Analysis , RNA/chemistry , Statistics as Topic , beta-Globins/genetics
3.
Curr Comput Aided Drug Des ; 9(4): 463-71, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24138420

ABSTRACT

Interrelated Two-way Clustering (ITC) is an unsupervised clustering method developed to divide samples into two groups in gene expression data obtained through microarrays, selecting important genes simultaneously in the process. It has been found to be a better approach than conventional clustering methods such as K-means or self-organizing maps when the number of samples is much smaller than the number of variables (n ≪ p). In this paper we used the ITC approach to classify a diverse set of 508 chemicals with respect to mutagenicity. A large number of topological indices (TIs), 3-dimensional and quantum chemical descriptors, as well as atom pairs (APs), have been used as explanatory variables. ITC has been used only for predictor selection, after which ridge regression is employed to build the final predictive model. The proper leave-one-out (LOO) cross-validation in this scenario holds out each of the 508 compounds in turn before predictor thinning, then compares the predicted values with the experimental data. The ITC-based results obtained here are comparable to those developed earlier.
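The point about holding out each compound before predictor thinning can be sketched as follows. This is not ITC: as a stand-in, the sketch selects the single predictor with the largest absolute correlation with the response, then fits a one-variable ridge regression. All function names, the penalty `lam`, and the toy data are hypothetical. What matters is that selection runs inside the loop, on the training rows only.

```python
import math


def abs_corr(x, y):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def select_best(X, y):
    """Stand-in selection step: index of the column most correlated with y."""
    n_cols = len(X[0])
    return max(range(n_cols), key=lambda j: abs_corr([r[j] for r in X], y))


def one_var_ridge(x, y, lam):
    """Closed-form ridge fit for one centered predictor; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def loo_predictions(X, y, lam=1.0):
    """Leave-one-out predictions with selection repeated inside every fold."""
    preds = []
    for i in range(len(y)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        j = select_best(X_tr, y_tr)  # the held-out row is never seen here
        slope, intercept = one_var_ridge([r[j] for r in X_tr], y_tr, lam)
        preds.append(intercept + slope * X[i][j])
    return preds


# Toy data: the response depends only on the first column (y = 2 * x0 + 1).
X = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 2, 1], [5, 7, 0], [6, 1, 1]]
y = [2 * row[0] + 1 for row in X]
preds = loo_predictions(X, y, lam=0.0)
```

Selecting predictors once on all compounds and cross-validating only the regression step would let the held-out compound influence the selection, which tends to make the LOO estimate optimistic.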


Subject(s)
Gene Expression Profiling/methods , Models, Chemical , Models, Molecular , Cluster Analysis , Gene Expression , Humans , Molecular Structure , Mutagens/chemistry , Mutagens/toxicity , Oligonucleotide Array Sequence Analysis/methods , Quantitative Structure-Activity Relationship
4.
Curr Comput Aided Drug Des ; 6(4): 240-51, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20883202

ABSTRACT

In this paper, calculated topological indices have been used to cluster a large virtual library of 125 psoralen derivatives into 25 clusters in an effort to select a subset of mutually dissimilar structures from a large collection of molecules. Inspection of the 25 structures, each the one closest to the centroid of its cluster, shows that these molecules are structurally more diverse than a randomly selected subset of 25. Such methods, based on easily calculated descriptors, are expected to find applications in new drug discovery through the analysis of libraries of interesting lead compounds.
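Picking the structure closest to each cluster centroid can be sketched as below. The abstract does not name the clustering algorithm, so the sketch assumes cluster labels are already available; the function name, molecule names, and two-dimensional descriptor vectors are hypothetical.

```python
import math
from collections import defaultdict


def representatives(points, labels):
    """Return, for each cluster label, the member nearest to the cluster centroid.

    `points` maps a molecule name to its descriptor vector;
    `labels` maps a molecule name to its (precomputed) cluster label.
    """
    groups = defaultdict(list)
    for name, label in labels.items():
        groups[label].append(name)

    reps = {}
    for label, members in groups.items():
        dim = len(points[members[0]])
        centroid = [
            sum(points[m][d] for m in members) / len(members) for d in range(dim)
        ]
        # Euclidean distance in descriptor space decides the representative.
        reps[label] = min(members, key=lambda m: math.dist(points[m], centroid))
    return reps


# Hypothetical descriptor vectors and cluster assignments for six molecules.
points = {
    "m1": (0.0, 0.0), "m2": (1.0, 0.0), "m3": (5.0, 0.0),
    "m4": (10.0, 10.0), "m5": (10.0, 12.0), "m6": (10.0, 13.0),
}
labels = {"m1": 0, "m2": 0, "m3": 0, "m4": 1, "m5": 1, "m6": 1}
reps = representatives(points, labels)
```

In practice the descriptors would usually be scaled, or replaced by principal components, before distances are computed so that no single index dominates.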


Subject(s)
Combinatorial Chemistry Techniques/methods , Drug Design , Furocoumarins/chemistry , Computational Biology , Computer-Aided Design , Drug Discovery/methods , Models, Chemical , Principal Component Analysis , Small Molecule Libraries