Search | VHL Regional Portal (test)

1.

Bayesian interpretation of a distance function for navigating high-dimensional descriptor spaces.

Vogt, Martin; Godden, Jeffrey W; Bajorath, Jürgen.

J Chem Inf Model ; 47(1): 39-46, 2007.

Article in English | MEDLINE | ID: mdl-17238247

ABSTRACT

A distance function to analyze molecular similarity relationships in high-dimensional descriptor spaces and focus search calculations on "active subspaces" is defined in Bayesian terms. As a measure of similarity, database compounds are ranked according to their distance from the center of a subspace formed by known active molecules. From a Bayesian point of view, distance calculations are transformed into a "log-odds" estimate. Following this approach, maximizing the likelihood of a compound to be active corresponds to minimizing the distance from the center of an active subspace. Since the methodology generates a ranking of database molecules according to decreasing similarity to template compounds, it can be conveniently compared to similarity search tools, and the Bayesian function is found to compare favorably to two standard fingerprints in multiple template-based database searching.

Subjects

Bayes Theorem; Databases, Factual; Molecular Structure; Structure-Activity Relationship
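The correspondence stated in the abstract above, that maximizing the likelihood of activity amounts to minimizing the distance from the center of the active subspace, can be illustrated with a minimal sketch. It is not the published function: it assumes independent Gaussian descriptor distributions for the actives and models only the likelihood term, not the full log-odds against the database. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def active_center(actives):
    """Per-descriptor mean and standard deviation of the known actives."""
    columns = list(zip(*actives))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]  # assumed to be non-zero
    return centers, spreads


def scaled_sq_distance(x, centers, spreads):
    """Squared distance from the active center, scaled per descriptor."""
    return sum(((xi - c) / s) ** 2 for xi, c, s in zip(x, centers, spreads))


def log_likelihood_active(x, centers, spreads):
    """Log-likelihood of x under independent Gaussians (an assumption)."""
    norm = sum(math.log(s) for s in spreads) + 0.5 * len(x) * math.log(2 * math.pi)
    return -0.5 * scaled_sq_distance(x, centers, spreads) - norm


# Toy data: two actives and three database compounds, two descriptors each.
actives = [[1.0, 10.0], [3.0, 14.0]]
database = {"x": [2.0, 12.0], "y": [4.0, 12.0], "z": [2.0, 20.0]}

centers, spreads = active_center(actives)
by_distance = sorted(
    database, key=lambda k: scaled_sq_distance(database[k], centers, spreads)
)
by_likelihood = sorted(
    database,
    key=lambda k: log_likelihood_active(database[k], centers, spreads),
    reverse=True,
)
```

Under these assumptions the log-likelihood is a constant minus half the scaled squared distance, so both rankings place the compounds in the same order.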

2.

Assessment of molecular similarity from the analysis of randomly generated structural fragment populations.

Batista, José; Godden, Jeffrey W; Bajorath, Jürgen.

J Chem Inf Model ; 46(5): 1937-44, 2006.

Article in English | MEDLINE | ID: mdl-16995724

ABSTRACT

A novel method termed MolBlaster is introduced for the evaluation of molecular similarity relationships on the basis of randomly generated fragment populations. Our motivation has been to develop a similarity method that does not depend on the use of predefined structural or property descriptors. Fragment profiles of molecules are generated by random deletion of bonds in connectivity tables and quantitatively compared using entropy-based metrics. In test calculations, MolBlaster accurately reproduced a structural key-based similarity ranking of druglike molecules.

Subjects

Molecular Structure; Entropy

3.

A distance function for retrieval of active molecules from complex chemical space representations.

Godden, Jeffrey W; Bajorath, Jürgen.

J Chem Inf Model ; 46(3): 1094-7, 2006.

Article in English | MEDLINE | ID: mdl-16711729

ABSTRACT

The concept of chemical space is of fundamental importance for chemoinformatics research. It is generally thought that high-dimensional space representations are too complex for the successful application of many compound classification or virtual screening methods. Here, we show that a simple "activity-centered" distance function is capable of accurately detecting molecular similarity relationships in "raw" chemical spaces of high dimensionality.

Subjects

Computational Biology

4.

Anatomy of fingerprint search calculations on structurally diverse sets of active compounds.

Godden, Jeffrey W; Stahura, Florence L; Bajorath, Jürgen.

J Chem Inf Model ; 45(6): 1812-9, 2005.

Article in English | MEDLINE | ID: mdl-16309288

ABSTRACT

Similarity searching using molecular fingerprints is a widely used approach for the identification of novel hits. A fingerprint search involves many pairwise comparisons of bit string representations of known active molecules with those precomputed for database compounds. Bit string overlap, as evaluated by various similarity metrics, is used as a measure of molecular similarity. Results of a number of studies focusing on fingerprints suggest that it is difficult, if not impossible, to develop generally applicable search parameters and strategies, irrespective of the compound classes under investigation. Rather, more or less, each individual search problem requires an adjustment of calculation conditions. Thus, there is a need for diagnostic tools to analyze fingerprint-based similarity searching. We report an analysis of fingerprint search calculations on different sets of structurally diverse active compounds. Calculations on five biological activity classes were carried out with two fingerprints in two compound source databases, and the results were analyzed in histograms. Tanimoto coefficient (Tc) value ranges where active compounds were detected were compared to the distribution of Tc values in the database. The analysis revealed that compound class-specific effects strongly influenced the outcome of these fingerprint calculations. Among the five diverse compound sets studied, very different search results were obtained. The analysis described here can be applied to determine Tc intervals where scaffold hopping occurs. It can also be used to benchmark fingerprint calculations or estimate their probability of success.

Subjects

DNA Fingerprinting/statistics & numerical data; Databases, Factual; Chemical Phenomena; Chemistry, Physical; Receptors, Cell Surface/genetics; Receptors, Cell Surface/physiology; Structure-Activity Relationship
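The bit string overlap measure named in the abstract above, the Tanimoto coefficient (Tc), can be sketched as follows. Fingerprints are represented here as sets of "on" bit positions, which is an illustrative choice; the function names, the toy fingerprints, and the convention of returning 0.0 for two empty fingerprints are assumptions, not taken from the paper.

```python
def tanimoto(fp_a, fp_b):
    """Tc = c / (a + b - c): shared on-bits over all on-bits in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


def rank_database(template, database):
    """Rank database compounds by decreasing Tc to a single template."""
    scored = [(name, tanimoto(template, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy fingerprints given as sets of on-bit positions.
template = {1, 2, 3}
database = {"cpd_a": {2, 3, 4}, "cpd_b": {1, 2, 3}, "cpd_c": {7, 8}}
ranking = rank_database(template, database)
```

Collecting the Tc values of known actives and of all database compounds into separate histograms would give the kind of value-range comparison the abstract describes.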

5.

Oxadiazols: a new class of rationally designed anti-human immunodeficiency virus compounds targeting the nuclear localization signal of the viral matrix protein.

Haffar, Omar; Dubrovsky, Larisa; Lowe, Richard; Berro, Reem; Kashanchi, Fatah; Godden, Jeffrey; Vanpouille, Christophe; Bajorath, Jürgen; Bukrinsky, Michael.

J Virol ; 79(20): 13028-36, 2005 Oct.

Article in English | MEDLINE | ID: mdl-16189005

ABSTRACT

Despite recent progress in anti-human immunodeficiency virus (HIV) therapy, drug toxicity and emergence of drug-resistant isolates during long-term treatment of HIV-infected patients necessitate the search for new targets that can be used to develop novel antiviral agents. One such target is the process of nuclear translocation of the HIV preintegration complex. Previously we described a class of arylene bis(methylketone) compounds that inhibit HIV-1 nuclear import by targeting the nuclear localization signal (NLS) in the matrix protein (MA). Here we report a different class of MA NLS-targeting compounds that was selected using computer-assisted drug design. The leading compound from this group, ITI-367, showed potent anti-HIV activity in cultures of T lymphocytes and macrophages and also inhibited HIV-1 replication in ex vivo cultured lymphoid tissue. The virus carrying inactivating mutations in MA NLS was resistant to ITI-367. Analysis by real-time PCR demonstrated that the compound specifically inhibited nuclear import of viral DNA, measured by two-long terminal repeat circle formation. Evidence of the existence of this mechanism was provided by immunofluorescent microscopy, using fluorescently labeled HIV-1, which demonstrated retention of the viral DNA in the cytoplasm of drug-treated macrophages. Compounds inhibiting HIV-1 nuclear import may be attractive candidates for further development.

Subjects

Anti-HIV Agents/pharmacology; HIV-1/drug effects; Oxadiazoles/pharmacology; Anti-HIV Agents/chemistry; Cells, Cultured; Computer-Aided Design; Dose-Response Relationship, Drug; Gene Products, gag/chemistry; HIV Antigens/chemistry; HIV-1/chemistry; Humans; Models, Molecular; Monocytes; Nuclear Localization Signals/drug effects; Oxadiazoles/chemistry; Viral Proteins/chemistry; Virus Replication/drug effects; gag Gene Products, Human Immunodeficiency Virus

6.

POT-DMC: A virtual screening method for the identification of potent hits.

Godden, Jeffrey W; Stahura, Florence L; Bajorath, Jürgen.

J Med Chem ; 47(23): 5608-11, 2004 Nov 04.

Article in English | MEDLINE | ID: mdl-15509158

ABSTRACT

A method for ligand-based virtual screening (LBVS), dynamic mapping of consensus positions (DMC), has been extended to take different potency levels of template compounds into account. This potency scaling technique is designed to tune search calculations toward the detection of increasingly potent hits. LBVS analysis of three different compound classes confirmed the ability of potency-scaled DMC (POT-DMC) to identify active database compounds with higher potency than conventional calculations.

Subjects

CCR5 Receptor Antagonists; Gonadotropin-Releasing Hormone/agonists; Quantitative Structure-Activity Relationship; Serotonin 5-HT3 Receptor Agonists; Databases, Factual; Gonadotropin-Releasing Hormone/chemistry; Receptors, CCR5/chemistry; Receptors, Serotonin, 5-HT3/chemistry

7.

Similarity search profiles as a diagnostic tool for the analysis of virtual screening calculations.

Xue, Ling; Godden, Jeffrey W; Stahura, Florence L; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 44(4): 1275-81, 2004.

Article in English | MEDLINE | ID: mdl-15272835

ABSTRACT

An analysis method termed similarity search profiling has been developed to evaluate fingerprint-based virtual screening calculations. The analysis is based on systematic similarity search calculations using multiple template compounds over the entire value range of a similarity coefficient. In graphical representations, numbers of correctly identified hits and other detected database compounds are separately monitored. The resulting profiles make it possible to determine whether a virtual screening trial can in principle succeed for a given compound class, search tool, similarity metric, and selection criterion. As a test case, we have analyzed virtual screening calculations using a recently designed fingerprint on 23 different biological activity classes in a compound source database containing approximately 1.3 million molecules. Based on our predefined selection criteria, we found that virtual screening analysis was successful for 19 of 23 compound classes. Profile analysis also makes it possible to determine compound class-specific similarity threshold values for similarity searching.

Subjects

Drug Evaluation, Preclinical/statistics & numerical data; User-Computer Interface; Databases, Factual; Molecular Structure; Structure-Activity Relationship

8.

Partitioning in binary-transformed chemical descriptor spaces.

Godden, Jeffrey W; Bajorath, Jürgen.

Methods Mol Biol ; 275: 291-300, 2004.

Article in English | MEDLINE | ID: mdl-15141117

ABSTRACT

Here we describe a statistically based partitioning method called median partitioning (MP), which involves the transformation of value distributions of molecular property descriptors into a binary classification scheme. The MP approach fundamentally differs from other partitioning approaches that involve dimension reduction of chemical spaces such as cell-based partitioning, since MP directly operates in original, albeit simplified, chemical space. Modified versions of the MP algorithm have been implemented and successfully applied in diversity selection, compound classification, and virtual screening. These findings have demonstrated that dimension reduction techniques, although elegant in their design, are not necessarily required for effective partitioning of molecular datasets. An attractive feature of statistical partitioning approaches such as decision tree methods or MP is their computational efficiency, which is becoming an important criterion for the analysis of compound databases containing millions of molecules.

Subjects

Chemistry; Algorithms; Chemical Phenomena; Information Services

9.

Molecular similarity analysis and virtual screening by mapping of consensus positions in binary-transformed chemical descriptor spaces with variable dimensionality.

Godden, Jeffrey W; Furr, John R; Xue, Ling; Stahura, Florence L; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 44(1): 21-9, 2004.

Article in English | MEDLINE | ID: mdl-14741007

ABSTRACT

A novel compound classification algorithm is described that operates in binary molecular descriptor spaces and groups active compounds together in a computationally highly efficient manner. The method involves the transformation of continuous descriptor value ranges into a binary format, subsequent definition of simplified descriptor spaces, identification of consensus positions of specific compound sets in these spaces, and iterative adjustments of the dimensionality of the descriptor spaces in order to discriminate compounds sharing similar activity from others. We term this approach Dynamic Mapping of Consensus positions (DMC) because the definition of reference spaces is tuned toward specific compound classes and their dimensionality is increased as the analysis proceeds. When applied to virtual screening, sets of bait compounds are added to a large screening database to identify hidden active molecules. In these calculations, molecules that map to consensus positions after elimination of most of the database compounds are considered hit candidates. In a benchmark study on five biological activity classes, hits for randomly assembled sets of bait molecules were correctly identified in 95% of virtual screening calculations in a source database containing more than 1.3 million molecules, thus providing a measure of the sensitivity of the DMC technique.

10.

Design and evaluation of a molecular fingerprint involving the transformation of property descriptor values into a binary classification scheme.

Xue, Ling; Godden, Jeffrey W; Stahura, Florence L; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 43(4): 1151-7, 2003.

Article in English | MEDLINE | ID: mdl-12870906

ABSTRACT

A new fingerprint design concept is introduced that transforms molecular property descriptors into two-state descriptors and thus permits binary encoding. This transformation is based on the calculation of statistical medians of descriptor distributions in large compound collections and alleviates the need for value range encoding of these descriptors. For binary encoded property descriptors, bit positions that are set off capture as much information as bit positions that are set on, different from conventional fingerprint representations. Accordingly, a variant of the Tanimoto coefficient has been defined for comparison of these fingerprints. Following our design idea, a prototypic fingerprint termed MP-MFP was implemented by combining 61 binary encoded property descriptors with 110 structural fragment-type descriptors. The performance of this fingerprint was evaluated in systematic similarity search calculations in a database containing 549 molecules belonging to 38 different activity classes and 5000 background molecules. In these calculations, MP-MFP correctly recognized approximately 34% of all similarity relationships, with only 0.04% false positives, and performed better than previous designs and MACCS keys. The results suggest that combinations of simplified two-state property descriptors have predictive value in the analysis of molecular similarity.

Subjects

Models, Chemical; Molecular Structure; Pharmaceutical Preparations/classification; Computing Methodologies; Databases, Factual; Drug Design; Pharmacology; Statistics as Topic/methods

11.

Profile scaling increases the similarity search performance of molecular fingerprints containing numerical descriptors and structural keys.

Xue, Ling; Godden, Jeffrey W; Stahura, Florence L; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 43(4): 1218-25, 2003.

Article in English | MEDLINE | ID: mdl-12870914

ABSTRACT

The concept of compound class-specific profiling and scaling of molecular fingerprints for similarity searching is discussed and applied to newly designed fingerprint representations. The approach is based on the analysis of characteristic patterns of bits in keyed fingerprints that are set on in compounds having equivalent biological activity. Once a fingerprint profile is generated for a particular activity class, scaling factors that are weighted according to observed bit frequencies are applied to signature bit positions when searching for similar compounds. In systematic similarity search calculations over 23 diverse activity classes, profile scaling consistently increased the performance of fingerprints containing property descriptors and/or structural keys. A significant improvement of approximately 15% was observed for a new fingerprint consisting of binary encoded molecular property descriptors and structural keys. Under scaling conditions, this fingerprint, termed MP-MFP, correctly recognized on average close to 60% of all active test compounds, with only a few false positives. MP-MFP outperformed MACCS keys and other reference fingerprints. In general, optimum performance in scaling calculations was achieved at higher threshold values of the Tanimoto coefficient than in nonscaled calculations, thereby increasing the search selectivity. In general, putting relatively high weight on signature bit positions that were always, or almost always, set on was found to be the most effective scaling procedure. Analysis of class-specific search performance revealed that profile scaling of MP-MFP improved the similarity search results for each of the 23 activity classes.

12.

Recursive median partitioning for virtual screening of large databases.

Godden, Jeffrey W; Furr, John R; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 43(1): 182-8, 2003.

Article in English | MEDLINE | ID: mdl-12546552

ABSTRACT

Recently, we have introduced the median partitioning (MP) method for diversity selection and compound classification. The MP approach utilizes property descriptors with continuous value ranges, transforms these descriptors into a binary classification scheme by determining their medians in source databases, and divides database molecules in subsequent steps into populations above or below these medians. Having previously demonstrated the usefulness of MP for the classification of molecules according to biological activity, we have now gone a step further and extended the methodology for application in virtual screening. In these calculations, a series of bait molecules having desired activity is added to large compound databases, and subsequent iterations or recursions are carried out to reduce the number of candidate molecules until a small number of compounds are found in partitions enriched with bait molecules. For each recursion step, descriptor combinations are identified that copartition as many active molecules as possible. Descriptor selection is facilitated by application of a genetic algorithm (GA). The recursive MP approach (RMP) has been applied to five diverse biological activity classes in virtual screening of a database consisting of approximately 1.34 million molecules to which different types of active compounds were added. RMP analysis produced hit rates of up to 21%, dependent on the biological activity class, and led to an average approximately 3600-fold improvement over random selection for the activity classes that were used as test cases.

Subjects

Drug Evaluation, Preclinical/statistics & numerical data; User-Computer Interface; Algorithms; Biometry; Databases, Factual

13.

Classification of biologically active compounds by median partitioning.

Godden, Jeffrey W; Xue, Ling; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 42(5): 1263-9, 2002.

Article in English | MEDLINE | ID: mdl-12377018

ABSTRACT

The median partitioning (MP) method was originally developed for the selection of diverse subsets from compound databases. Following this approach, property descriptors are used in subsequent steps to divide compounds into defined partitions from which representative molecules are selected. For descriptor analysis, MP was coupled to a genetic algorithm. MP subset selection does not depend on pairwise comparison of molecules and is therefore applicable to very large compound pools. Here the MP approach was evaluated for the classification of molecules according to biological activity. A total of 317 molecules belonging to 21 different activity classes were studied. MP compound classification calculations were carried out both in the presence and absence of 2000 randomly selected "background" molecules. The performance of MP was compared to cell-based partitioning and found to be at least comparable, with up to approximately 82% of active molecules occurring in "pure" partitions consisting only of molecules sharing the same activity. Different from cell-based methods, MP classification is based on "direct" and "sequential" contributions of molecular property descriptors. Our results suggest that MP is an effective method not only for the selection of diverse subsets but also for the classification of active compounds and the search for molecules with desired activity.

Subjects

Pharmaceutical Preparations/classification; Computer Simulation; Databases, Factual; Models, Chemical; Pharmacology/statistics & numerical data

14.

Median Partitioning: a novel method for the selection of representative subsets from large compound pools.

Godden, Jeffrey W; Xue, Ling; Kitchen, Douglas B; Stahura, Florence L; Schermerhorn, E James; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 42(4): 885-93, 2002.

Article in English | MEDLINE | ID: mdl-12132890

ABSTRACT

A method termed Median Partitioning (MP) has been developed to select diverse sets of molecules from large compound pools. Unlike many other methods for subset selection, the MP approach does not depend on pairwise comparison of molecules and can therefore be applied to very large compound collections. The only time limiting step is the calculation of molecular descriptors for database compounds. MP employs arrays of property descriptors with little correlation to divide large compound pools into partitions from which representative molecules can be selected. In each of n subsequent steps, a population of molecules is divided into subpopulations above and below the median value of a property descriptor until a desired number of 2^n partitions are obtained. For descriptor evaluation and selection, an entropy formulation was embedded in a genetic algorithm. MP has been applied here to generate a subset of the Available Chemicals Directory, and the results have been compared with cell-based partitioning.

Subjects

Combinatorial Chemistry Techniques; Computer Simulation; Drug Design; Algorithms; Databases, Factual
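The partitioning step described in the abstract above, where each of n descriptors splits the molecules at its median so that n descriptors give at most two to the power n partitions, can be sketched as below. This is a hedged illustration, not the published implementation: it assumes each median is computed once over the whole pool, it omits the entropy-based genetic algorithm used for descriptor selection, and the function name, descriptor names, and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_partition(molecules, descriptor_names):
    """Group molecules by a binary code: one bit per descriptor, 1 if above its median."""
    # Assumption: medians are taken over the whole pool, not per subpopulation.
    medians = {
        d: median(values[d] for values in molecules.values())
        for d in descriptor_names
    }
    partitions = defaultdict(list)
    for name, values in molecules.items():
        code = tuple(int(values[d] > medians[d]) for d in descriptor_names)
        partitions[code].append(name)
    return dict(partitions)


# Toy pool: four molecules, two hypothetical descriptors.
pool = {
    "mol_a": {"mw": 100.0, "logp": 1.0},
    "mol_b": {"mw": 200.0, "logp": 3.0},
    "mol_c": {"mw": 300.0, "logp": 2.0},
    "mol_d": {"mw": 400.0, "logp": 4.0},
}
partitions = median_partition(pool, ["mw", "logp"])
```

Each molecule is processed once and no pairwise comparison is needed, which is the property the abstract credits for the method's applicability to very large pools. A diverse subset could then be drawn by picking one representative per partition.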

15.

Methods for compound selection focused on hits and application in drug discovery.

Stahura, Florence L; Xue, Ling; Godden, Jeffrey W; Bajorath, Jürgen.

J Mol Graph Model ; 20(6): 439-46, 2002 Jun.

Article in English | MEDLINE | ID: mdl-12071278

ABSTRACT

In the context of virtual screening calculations, a multiple fingerprint-based metric is applied to generate focused compound libraries by database searching. Different fingerprints are used to facilitate a similarity step for database mining, followed by a diversity step to assemble the final library. The method is applied, for example, to build libraries of limited size for hit-to-lead development efforts. In studies designed to inhibit a therapeutically relevant protein-protein interaction, small molecular hits were initially obtained by combined fingerprint- and structure-based virtual screening and used for the design of focused libraries. We review the applied virtual screening approach and report the statistics and results of screening as well as focused library design. While the structures of lead compounds cannot be disclosed, the analysis is thought to provide an example of the interplay of different methods applied in practical lead identification.

Subjects

Drug Design; Peptide Library; Quantitative Structure-Activity Relationship; Binding Sites; Databases, Factual; Membrane Proteins/chemistry; Membrane Proteins/metabolism; Models, Molecular; Molecular Structure; Protein Binding; Protein Structure, Tertiary; Proto-Oncogene Proteins c-bcl-2/antagonists & inhibitors; Proto-Oncogene Proteins c-bcl-2/chemistry; Proto-Oncogene Proteins c-bcl-2/metabolism; bcl-2 Homologous Antagonist-Killer Protein; bcl-X Protein

16.

Differential Shannon entropy analysis identifies molecular property descriptors that predict aqueous solubility of synthetic compounds with high accuracy in binary QSAR calculations.

Stahura, Florence L; Godden, Jeffrey W; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 42(3): 550-8, 2002.

Article in English | MEDLINE | ID: mdl-12086513

ABSTRACT

Prediction of aqueous solubility of organic molecules by binary QSAR was used as a test case for a recently introduced entropy-based descriptor selection method. Property descriptors suitable for solubility predictions were exclusively selected on the basis of Shannon entropy calculations in molecular learning sets, not taking any other information into account. Sets of only five or 10 2D descriptors with largest entropy differences between molecules above or below a defined solubility threshold yielded consistently high prediction accuracy between 80% and 90% in binary QSAR calculations, regardless of the threshold values applied. The top five descriptors with largest differential Shannon entropy (DSE) values achieved an average prediction accuracy of 88%. These findings suggest that differences in entropy and relative information content of descriptors in compared compound data sets correlate with significant differences in physical properties and support the practical relevance of entropy-based descriptor selection routines. The study also demonstrates that binary QSAR methodology can be effectively used to classify small molecules according to aqueous solubility.
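The entropy calculation underlying the descriptor selection in the abstract above can be sketched as follows. Shannon entropy is computed here from a histogram of descriptor values, and two compound sets are compared by a plain difference of entropies. The binning scheme, the shared value range, and the use of a simple difference are assumptions for illustration; the published DSE definition may differ, and the function names and toy values are hypothetical.

```python
import math


def shannon_entropy(values, lo, hi, bins=10):
    """Shannon entropy (in bits) of a histogram of values over [lo, hi], with hi > lo."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def entropy_difference(set_a, set_b, lo, hi, bins=10):
    """Entropy of set_a minus entropy of set_b, using the same bins for both."""
    return shannon_entropy(set_a, lo, hi, bins) - shannon_entropy(set_b, lo, hi, bins)


# Toy descriptor values for compounds above and below a solubility threshold.
above_threshold = [0.5, 1.5, 2.5, 3.5]  # spread evenly over four bins
below_threshold = [0.5, 0.5, 0.5, 0.5]  # all in one bin
diff = entropy_difference(above_threshold, below_threshold, lo=0.0, hi=4.0, bins=4)
```

Using the same bins for both sets keeps the two entropies comparable. Ranking descriptors by the magnitude of such a difference would mirror the selection of descriptors with the largest entropy differences that the abstract reports.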

17.

Chemical descriptors with distinct levels of information content and varying sensitivity to differences between selected compound databases identified by SE-DSE analysis.

Godden, Jeffrey W; Bajorath, Jürgen.

J Chem Inf Comput Sci ; 42(1): 87-93, 2002.

Article in English | MEDLINE | ID: mdl-11855971

ABSTRACT

Analysis of the variability of molecular descriptors in large compound databases has recently been carried out using both the Shannon entropy (SE) and differential Shannon entropy (DSE) concepts that reduce descriptor distributions to their information content (SE analysis) and detect intrinsic differences between descriptor settings in compound databases (DSE analysis). Here it is shown that a combination of SE and DSE calculations, termed SE-DSE analysis, makes it possible to identify molecular descriptors most sensitive to systematic differences in databases consisting of synthetic, drug-like, and natural molecules. Descriptors with consistently high information content are detected, and database-specific differences are quantified. Different sets of only very few descriptors were found to be most responsive to principal differences between synthetic, natural, and drug-like molecules. Descriptors with DSE values furthest away from zero are likely to best distinguish between compounds with different characteristics. SE-DSE analysis also reveals that a number of descriptors are not sensitive to compound class-specific features, despite their complexity and consistently high information content.
