Results 1 - 20 of 63
1.
Article in English | MEDLINE | ID: mdl-38324429

ABSTRACT

The adversarial vulnerability of convolutional neural networks (CNNs) refers to the performance degradation of CNNs under adversarial attacks, leading to incorrect decisions. However, the causes of adversarial vulnerability in CNNs remain unknown. To address this issue, we propose a unique cross-scale analytical approach from a statistical physics perspective. It reveals that the huge amount of nonlinear effects inherent in CNNs is the fundamental cause for the formation and evolution of system vulnerability. Vulnerability is spontaneously formed on the macroscopic level after the symmetry of the system is broken through the nonlinear interaction between microscopic state order parameters. We develop a cascade failure algorithm, visualizing how micro perturbations on neurons' activation can cascade and influence macro decision paths. Our empirical results demonstrate the interplay between microlevel activation maps and macrolevel decision-making and provide a statistical physics perspective to understand the causality behind CNN vulnerability. Our work will help subsequent research to improve the adversarial robustness of CNNs.

2.
iScience ; 26(11): 108285, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-38026198

ABSTRACT

It is a critical step in lead optimization to evaluate the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug-like compounds. Classical single-task learning (STL) has effectively predicted individual ADMET endpoints with abundant labels. Conversely, multi-task learning (MTL) can predict multiple ADMET endpoints with fewer labels, but ensuring task synergy and highlighting key molecular substructures remain challenges. To tackle these issues, this work elaborates a multi-task graph learning framework for predicting multiple ADMET properties of drug-like small molecules (MTGL-ADMET) by holding a new paradigm of MTL, "one primary, multiple auxiliaries." It first adeptly combines status theory with maximum flow for auxiliary task selection. The subsequent phase introduces a primary-task-centric MTL model with integrated modules. MTGL-ADMET not only outstrips existing STL and MTL methods but also offers a transparent lens into crucial molecular substructures. It is anticipated that this work can promote lead compound finding and optimization in drug discovery.

3.
Article in English | MEDLINE | ID: mdl-37792659

ABSTRACT

In the Internet of Medical Things (IoMT), de novo peptide sequencing prediction is one of the most important techniques for the fields of disease prediction, diagnosis, and treatment. Recently, deep-learning-based peptide sequencing prediction has become a new trend. However, most popular deep learning models for peptide sequencing prediction suffer from poor interpretability and poor ability to capture long-range dependencies. To solve these issues, we propose a model named SeqNovo, which has the encoding-decoding structure of sequence to sequence (Seq2Seq), the highly nonlinear properties of multilayer perceptron (MLP), and the ability of the attention mechanism to capture long-range dependencies. SeqNovo uses an MLP to improve feature extraction and utilizes the attention mechanism to discover key information. A series of experiments have been conducted to show that SeqNovo is superior to the Seq2Seq benchmark model, DeepNovo. SeqNovo improves both the accuracy and interpretability of the predictions, which is expected to support more related research.

4.
Front Microbiol ; 13: 944952, 2022.
Article in English | MEDLINE | ID: mdl-35707165

ABSTRACT

[This corrects the article DOI: 10.3389/fmicb.2022.846915.].

5.
Bioinformatics ; 38(Suppl 1): i325-i332, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758801

ABSTRACT

MOTIVATION: During lead compound optimization, it is crucial to identify pathways where a drug-like compound is metabolized. Recently, machine learning-based methods have achieved inspiring progress to predict potential metabolic pathways for drug-like compounds. However, they neglect the knowledge that metabolic pathways are dependent on each other. Moreover, they are inadequate to elucidate why compounds participate in specific pathways. RESULTS: To address these issues, we propose a novel Multi-Label Graph Learning framework of Metabolic Pathway prediction boosted by pathway interdependence, called MLGL-MP, which contains a compound encoder, a pathway encoder and a multi-label predictor. The compound encoder learns compound embedding representations by graph neural networks. After constructing a pathway dependence graph by re-trained word embeddings and pathway co-occurrences, the pathway encoder learns pathway embeddings by graph convolutional networks. Moreover, after adapting the compound embedding space into the pathway embedding space, the multi-label predictor measures the proximity of two spaces to discriminate which pathways a compound participates in. The comparison with state-of-the-art methods on KEGG pathways demonstrates the superiority of our MLGL-MP. Also, the ablation studies reveal how its three components contribute to the model, including the pathway dependence, the adapter between compound embeddings and pathway embeddings, as well as the pre-training strategy. Furthermore, a case study illustrates the interpretability of MLGL-MP by indicating crucial substructures in a compound, which are significantly associated with the attending metabolic pathways. It is anticipated that this work can boost metabolic pathway predictions in drug discovery. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/dubingxue/MLGL-MP.


Subjects
Machine Learning , Neural Networks, Computer , Drug Discovery , Metabolic Networks and Pathways , Software
6.
Front Microbiol ; 13: 846915, 2022.
Article in English | MEDLINE | ID: mdl-35479616

ABSTRACT

Many drugs can be metabolized by human microbes; the drug metabolites can significantly alter pharmacological effects and result in low therapeutic efficacy for patients. Hence, it is crucial to identify potential drug-microbe associations (DMAs) before drug administration. Nevertheless, traditional DMA determination cannot be applied widely because of the tremendous number of microbe species, its high cost, and the time it requires. Thus, predicting possible DMAs computationally is an essential topic. Inspired by other problems addressed by deep learning, we designed a deep learning-based model named Nearest Neighbor Attention Network (NNAN). The proposed model consists of four components, namely, a similarity network constructor, a nearest-neighbor aggregator, a feature attention block, and a predictor. In brief, the similarity network constructor comprises a microbe similarity network and a drug similarity network. The nearest-neighbor aggregator generates the embedding representations of drug-microbe pairs by integrating drug neighbors and microbe neighbors of each drug-microbe pair in the network. The feature attention block evaluates the importance of each dimension of the drug-microbe pair embedding by a set of ordinary multi-layer neural networks. The predictor is an ordinary fully-connected deep neural network that functions as a binary classifier to distinguish potential DMAs among unlabeled drug-microbe pairs. Several experiments on two benchmark databases are performed to evaluate the performance of NNAN. First, the comparison with state-of-the-art baseline approaches demonstrates the superiority of NNAN under cross-validation in terms of predictive performance. Moreover, the interpretability inspection reveals that a drug tends to associate with a microbe if its top-l most similar neighbors associate with that microbe.

7.
Drug Discov Today ; 27(5): 1350-1366, 2022 05.
Article in English | MEDLINE | ID: mdl-35248748

ABSTRACT

The screening of compound-protein interactions (CPIs) is one of the most crucial steps in finding hit and lead compounds. Deep learning (DL) methods for CPI prediction can address intrinsic limitations of traditional HTS and virtual screening with the advantage of low cost and high efficiency. This review provides a comprehensive survey of DL-based CPI prediction. It first summarizes popular databases of small-molecule compounds, proteins and binding complexes. Then, it outlines classical representations of compounds and proteins in turn. After that, this review briefly introduces state-of-the-art DL-based models in terms of design paradigms and investigates their prediction performance. Finally, it indicates current challenges and trends toward better CPI prediction and sketches out crucial approaches toward practical applications.


Subjects
Deep Learning , Databases, Factual , Drug Discovery/methods , Proteins/metabolism
8.
J Cheminform ; 11(1): 28, 2019 Apr 08.
Article in English | MEDLINE | ID: mdl-30963300

ABSTRACT

BACKGROUND: Because drug-drug interactions (DDIs) may cause adverse drug reactions or contribute to complex-disease treatments, it is important to identify DDIs before multiple-drug medications are prescribed. As an alternative to high-cost experimental identification, computational approaches provide much cheaper screening for potential DDIs on a large scale. Nevertheless, most of them only predict whether or not one drug interacts with another, but neglect the enhancive (positive) and degressive (negative) changes of pharmacological effects. Moreover, these comprehensive DDIs do not occur at random, but exhibit a weakly balanced relationship (a structural property of the DDI network), which would help understand how high-order DDIs work. RESULTS: This work exploits this intrinsic structural relationship to solve two tasks: drug community detection and comprehensive DDI prediction in the cold-start scenario. Accordingly, we first design a balance regularized semi-nonnegative matrix factorization (BRSNMF) to partition the drugs into communities. Then, to predict enhancive and degressive DDIs in the cold-start scenario, we develop a BRSNMF-based predictive approach, which leverages drug-binding proteins (DBP) as features to associate new drugs (having no known DDI) with other drugs (having known DDIs). Our experiments demonstrate that BRSNMF can generate drug communities that exhibit more reasonable sizes, the property of weak balance, and pharmacological significance. Moreover, they demonstrate the superiority of DBP features and the ability of the BRSNMF-based predictive approach in comprehensive DDI prediction, with 94% accuracy among the top-50 predicted enhancive DDIs and 86% accuracy among the bottom-50 predicted degressive DDIs.
CONCLUSIONS: Owing to the regularization of the weak balance property of the comprehensive DDI network into semi-nonnegative matrix factorization, our proposed BRSNMF is able not only to generate better drug communities but also to provide promising comprehensive DDI prediction in the cold-start scenario.
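
The weak balance property mentioned in this abstract can be illustrated on a toy signed network. The sketch below is not BRSNMF; it only counts triangles, using the textbook definition in which a triangle violates weak balance when exactly one of its edges is negative (an all-negative triangle is allowed). The function names and the four-drug network are hypothetical.

```python
from itertools import combinations


def triangle_signs(edges):
    """Yield the sign triple of every triangle in a signed, undirected graph.

    `edges` maps frozenset({u, v}) to +1 (enhancive) or -1 (degressive).
    """
    nodes = sorted({n for pair in edges for n in pair})
    for a, b, c in combinations(nodes, 3):
        pairs = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
        if all(p in edges for p in pairs):
            yield tuple(edges[p] for p in pairs)


def weak_balance_ratio(edges):
    """Fraction of triangles that do not violate weak balance.

    Only triangles with exactly one negative edge count as violations.
    A graph without triangles is reported as fully balanced (1.0).
    """
    triangles = list(triangle_signs(edges))
    if not triangles:
        return 1.0
    ok = sum(1 for signs in triangles if signs.count(-1) != 1)
    return ok / len(triangles)


# Hypothetical toy network of four drugs (not data from the paper).
toy_network = {
    frozenset(("A", "B")): +1,
    frozenset(("B", "C")): +1,
    frozenset(("A", "C")): +1,
    frozenset(("A", "D")): -1,
    frozenset(("B", "D")): -1,
    frozenset(("C", "D")): +1,
}

print(weak_balance_ratio(toy_network))
```

In the toy network, triangles A-B-C (no negative edge) and A-B-D (two negative edges) are consistent with weak balance, while A-C-D and B-C-D each contain exactly one negative edge and are not.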

9.
BMC Genomics ; 20(Suppl 10): 914, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888459

ABSTRACT

BACKGROUND: Identification of antibiotic resistance genes (ARGs) from environmental samples has been a critical sub-domain of gene discovery which is directly connected to human health. Antibiotic resistance has drawn extraordinary attention in recent years and is regarded as a severe threat to human health by many institutions around the world. To satisfy the need for efficient ARG discovery, a series of online antibiotic resistance gene databases have been published. This article conducts an in-depth analysis of CARD, one of the most widely used ARG databases. RESULTS: The decision model of CARD is based on the alignment score with a single ARG type. We identify the cases where the model is likely to make false predictions, and then propose an optimization method on top of the current CARD model. The optimization is expected to raise the coherence with BLAST homology relationships and improve the confidence of ARG identification using the database. CONCLUSIONS: The absence of a publicly recognized benchmark makes it challenging to evaluate the performance of ARG identification. However, possible wrong predictions and methods for resolving the problem can be inferred by computational analysis of the identification method and the underlying reference sequences. We hope our work can bring insight to the task of precise ARG type classification.


Subjects
Drug Resistance, Microbial/genetics , High-Throughput Nucleotide Sequencing , Models, Genetic , Gene Ontology , Sequence Homology, Nucleic Acid , Support Vector Machine
10.
Comput Methods Programs Biomed ; 168: 1-10, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30527128

ABSTRACT

BACKGROUND AND OBJECTIVE: Due to the synergistic effects of drugs, drug combination is one of the effective approaches for treating complex diseases. However, the identification of drug combinations by dose-response methods is still costly. It is promising to develop supervised learning-based approaches to predict potential drug combinations on a large scale. Nevertheless, these approaches make inadequate use of heterogeneous features, which causes the loss of information useful for classification. Moreover, they have an intrinsic bias, because they treat unknown drug pairs as non-combinations, some of which could be real drug combinations in practice. METHODS: To address the above issues, this work first designs a two-layer multiple classifier system (TLMCS) to effectively integrate heterogeneous features involving anatomical therapeutic chemical codes of drugs, drug-drug interactions, drug-target interactions, gene ontology of drug targets, and side effects. To avoid the bias caused by labelling unknown samples as negative, it then uses one-class support vector machines (which require no negative instances and label only approved drug combinations as positive instances) as the member classifiers in TLMCS. Last, both a 10-fold cross validation (10-CV) and a novel prediction are performed to validate the performance of TLMCS. RESULTS: The comparison with three state-of-the-art approaches under 10-CV exhibits the superiority of TLMCS, which achieves an area under the receiver operating characteristic curve of 0.824 and an area under the precision-recall curve of 0.372. Moreover, the experiment under the novel prediction demonstrates its ability, where 9 out of the top-20 predicted combinative drug pairs are validated by checking the published literature. Furthermore, for each of the newly-validated drug combinations, this work analyses the combining mode of the member drugs and investigates their relationship in terms of drug targeting pathways.
CONCLUSIONS: The proposed TLMCS provides an effective framework to integrate those heterogeneous features and is trained by only positive samples such that the bias of taking unknown drug pairs as negative samples can be avoided. Furthermore, its results in the novel prediction reveal five types of drug combinations and three types of drug relationships in terms of pathways.


Subjects
Drug Combinations , Drug Evaluation, Preclinical/methods , Drug Interactions , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/classification , Pharmacy/instrumentation , Algorithms , Computational Biology , Computer Simulation , Databases, Factual , Humans , Pharmacy/methods , ROC Curve , Software
11.
BMC Bioinformatics ; 19(Suppl 14): 411, 2018 Nov 20.
Article in English | MEDLINE | ID: mdl-30453924

ABSTRACT

BACKGROUND: A significant number of adverse drug reactions are caused by unexpected drug-drug interactions (DDIs). The identification of DDIs is crucial before multiple drugs are co-prescribed. Such a task in clinics or in drug discovery usually incurs high costs and faces numerous limitations, while computational approaches are able to predict potential DDIs effectively by utilizing diverse drug attributes (e.g. side effects). Nevertheless, they are unable to predict enhancive and degressive DDIs, which respectively increase and decrease the pharmacological behavior of the interacting drugs. The pharmacological change caused by DDIs is one of the most important factors when making a multi-drug prescription. RESULTS: In this work, we design a Triple Matrix Factorization-based Unified Framework (TMFUF) to address the above issue. By leveraging a group of side effect entries of drugs, TMFUF achieves an encouraging result (AUC = 0.842 and AUPR = 0.526) in the case of conventional DDI prediction under the traditional screening task. In the comparison with two state-of-the-art approaches, TMFUF demonstrates its superiority by ~7% and ~20% improvement in terms of AUC and AUPR respectively. More importantly, TMFUF shows its ability in comprehensive DDI prediction under different screening tasks. Finally, an application of TMFUF reveals the significant pairs of side effects that contribute to forming enhancive and degressive DDIs, for further clinical validation. CONCLUSIONS: The proposed TMFUF is the first approach capable of predicting both conventional binary DDIs and comprehensive DDIs, such that it captures the pharmacological changes caused by DDIs. Furthermore, it provides a unified solution of DDI prediction for two screening scenarios, which involve newly given drugs having no prior interaction. Another advantage is its ability to indicate how significantly the pairs of drug features contribute to forming DDIs.


Subjects
Algorithms , Drug Interactions , Humans , Least-Squares Analysis , ROC Curve , Reproducibility of Results
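
Several abstracts in this listing report AUC and AUPR. As a reading aid, here is a minimal pure-Python sketch of both metrics. It assumes binary labels (1 for a known interaction, 0 otherwise), at least one example of each class, and no tied scores for the precision-recall estimate. AUPR is approximated by average precision, which is one common estimator and not necessarily the one used in these papers. The labels and scores are invented for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve as the probability that a random positive
    is scored above a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the
    rank of each positive example, with examples sorted by score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Invented toy predictions for five drug pairs.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]

print(auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

On heavily imbalanced data such as drug-pair screening, AUPR is usually much lower than AUC for the same predictor, which is consistent with the gap between the two numbers reported in these abstracts.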
12.
BMC Bioinformatics ; 19(Suppl 9): 281, 2018 Aug 13.
Article in English | MEDLINE | ID: mdl-30367598

ABSTRACT

BACKGROUND: The Human Microbiome Project reveals the significant mutualistic influence between the human body and the microbes living in it. Such an influence leads to an interesting phenomenon: many noninfectious diseases are closely associated with diverse microbes. However, the identification of microbe-noninfectious disease associations (MDAs) is still a challenging task, because of both the high cost and the limitations of microbe cultivation. Thus, there is a need to develop fast approaches to screen potential MDAs. The growing number of validated MDAs enables us to meet this demand from a new perspective. Computational approaches, especially machine learning, are promising for rapidly predicting MDA candidates among a large number of microbe-disease pairs, with the advantage of not being limited by microbe cultivation. Nevertheless, few computational efforts at predicting MDAs have been made so far. RESULTS: In this paper, grouping a set of MDAs into a binary MDA matrix, we propose a novel predictive approach (BMCMDA) based on Binary Matrix Completion to predict potential MDAs. The proposed BMCMDA assumes that the incomplete observed MDA matrix is the summation of a latent parameterizing matrix and a noising matrix. It also assumes that the independently occurring subscripts of observed entries in the MDA matrix follow a binomial model. Adopting a standard mean-zero Gaussian distribution for the noising matrix, we model the relationship between the parameterizing matrix and the MDA matrix under the observed microbe-disease pairs as a probit regression. With the recovered parameterizing matrix, BMCMDA deduces how likely a microbe would be associated with a particular disease. In the experiment under leave-one-out cross-validation, it exhibits encouraging performance (AUC = 0.906, AUPR = 0.526) and demonstrates its superiority by ~7% and ~5% improvements in terms of AUC and AUPR respectively in the comparison with the pioneering approach KATZHMDA.
CONCLUSIONS: Our BMCMDA provides an effective approach for predicting MDAs and can also be extended to other similar prediction tasks involving binary relationships (e.g. protein-protein interaction, drug-target interaction).


Subjects
Algorithms , Bacteria , Computational Biology/methods , Disease , Microbiota , Models, Biological , Bacterial Physiological Phenomena , Host-Pathogen Interactions , Humans , Risk Factors
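
The probit link described in the BMCMDA abstract can be sketched in a few lines. This is only an illustration under a stated assumption: an observed entry equals 1 when the latent value plus standard Gaussian noise is positive, so that the probability of an association is the standard normal CDF of the latent value. The abstract does not give the exact formulation or the fitting procedure, and the function names here are hypothetical.

```python
import math


def probit(x):
    """Standard normal CDF, Phi(x), computed with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def neg_log_likelihood(latent, labels, observed):
    """Probit negative log-likelihood over the observed entries only.

    latent   -- candidate parameterizing matrix (list of lists of floats)
    labels   -- binary association matrix (list of lists of 0/1)
    observed -- iterable of (i, j) index pairs whose labels are known
    """
    eps = 1e-12  # keeps log() finite for extreme latent values
    total = 0.0
    for i, j in observed:
        p = min(max(probit(latent[i][j]), eps), 1.0 - eps)
        total -= math.log(p) if labels[i][j] == 1 else math.log(1.0 - p)
    return total


# Hypothetical 2 x 2 example: rows are microbes, columns are diseases.
toy_latent = [[1.0, -0.5], [0.0, 2.0]]
toy_labels = [[1, 0], [1, 1]]
toy_observed = [(0, 0), (0, 1), (1, 1)]  # entry (1, 0) is unobserved

print(probit(toy_latent[1][0]))  # predicted probability for the unobserved pair
print(neg_log_likelihood(toy_latent, toy_labels, toy_observed))
```

A matrix-completion method would minimize a loss of this kind over low-complexity latent matrices and then read predictions for unobserved pairs from `probit` of the recovered entries.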
13.
Sci Rep ; 8(1): 11829, 2018 08 07.
Article in English | MEDLINE | ID: mdl-30087377

ABSTRACT

Drug-drug interactions (DDIs) may trigger adverse drug reactions, which endanger patients. DDI identification before clinical medication is critical but bears a high cost in clinics. Computational approaches, including global model-based and local model-based ones, are able to screen DDI candidates among a large number of drug pairs by utilizing preliminary characteristics of drugs (e.g. drug chemical structure). However, global model-based approaches are usually slow and do not consider the topological structure of the DDI network, while local model-based approaches have a degree-induced bias: a new drug tends to be linked to drugs having many DDIs. All of them lack an effective ensemble method to combine results from multiple predictors. To address the first two issues, we propose a local classification-based model (LCM), which considers the topology of the DDI network and relaxes the degree-induced bias. Furthermore, we design a novel supervised fusion rule based on the Dempster-Shafer theory of evidence (LCM-DS), which aggregates the results from multiple LCMs. To make the final prediction, LCM-DS integrates three aspects from multiple classifiers, including the posterior probabilities output by individual classifiers, the proximity between their instance decision profiles and their reference profiles, as well as the quality of their reference profiles. Last, the substantial comparison with three state-of-the-art approaches demonstrates the effectiveness of our LCM, and the comparison with both individual LCM implementations and classical fusion algorithms exhibits the superiority of our LCM-DS.


Subjects
Algorithms , Drug Interactions , Drug-Related Side Effects and Adverse Reactions/diagnosis , Models, Theoretical , Databases, Factual , Decision Making , Drug Combinations , Drug Discovery , Drug Therapy, Combination , Drug-Related Side Effects and Adverse Reactions/etiology , Humans
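
For readers unfamiliar with the Dempster-Shafer theory named in this abstract, the sketch below implements the classical Dempster rule of combination for two sources of evidence. It is not the supervised LCM-DS fusion rule, which additionally weighs decision profiles and reference-profile quality. The hypothesis labels and mass values are invented for illustration.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass, and the
    masses of one function sum to 1. Mass assigned to contradictory pairs
    (empty intersection) is discarded and the remainder is renormalized.
    """
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Frame of discernment for one drug pair (labels are hypothetical).
DDI = frozenset({"interacts"})
NONE = frozenset({"no_interaction"})
EITHER = DDI | NONE  # ignorance: mass not committed to either hypothesis

classifier_a = {DDI: 0.6, EITHER: 0.4}
classifier_b = {DDI: 0.5, NONE: 0.3, EITHER: 0.2}

fused = combine(classifier_a, classifier_b)
print(fused[DDI], fused[NONE], fused[EITHER])
```

Here 0.18 of the joint mass is conflicting (one source says "interacts", the other "no_interaction"), so the remaining masses are divided by 0.82, and the fused belief in an interaction is higher than either classifier reported alone.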
14.
BMC Syst Biol ; 12(Suppl 1): 14, 2018 04 11.
Article in English | MEDLINE | ID: mdl-29671393

ABSTRACT

BACKGROUND: Drug-drug interactions (DDIs) often cause unexpected and even adverse drug reactions. It is important to identify DDIs before drugs are used in the market. However, preclinical identification of DDIs requires much money and time. Computational approaches have exhibited their abilities to predict potential DDIs on a large scale by utilizing pre-market drug properties (e.g. chemical structure). Nevertheless, none of them can predict two comprehensive types of DDIs, enhancive and degressive DDIs, which respectively increase and decrease the behaviors of the interacting drugs. There is also a lack of systematic analysis of the structural relationship among known DDIs. Revealing such a relationship is very important, because it helps explain how DDIs occur. Both the prediction of comprehensive DDIs and the discovery of the structural relationship among them provide important guidance when making a co-prescription. RESULTS: In this work, treating a set of comprehensive DDIs as a signed network, we design a novel model (DDINMF) for the prediction of enhancive and degressive DDIs based on semi-nonnegative matrix factorization. DDINMF achieves encouraging results in both conventional DDI prediction (AUROC = 0.872 and AUPR = 0.605) and comprehensive DDI prediction (AUROC = 0.796 and AUPR = 0.579). Compared with two state-of-the-art approaches, DDINMF shows its superiority. Finally, representing DDIs as a binary network and a signed network respectively, an analysis based on NMF reveals crucial knowledge hidden among DDIs. CONCLUSIONS: Our approach is able to predict not only conventional binary DDIs but also comprehensive DDIs.
More importantly, it reveals several key points about the DDI network: (1) both binary and signed networks show fairly clear clusters, in which both drug degree and the difference between positive degree and negative degree show significant distribution; (2) the drugs having large degrees tend to have a larger difference between positive degree and negative degree; (3) though the binary DDI network contains no information about enhancive and degressive DDIs at all, it implies some of their relationship in the comprehensive DDI matrix; (4) the occurrence of signs indicating enhancive and degressive DDIs is not random because the comprehensive DDI network is equipped with a structural balance.


Subjects
Computational Biology/methods , Drug Interactions , Algorithms
15.
BMC Syst Biol ; 12(Suppl 9): 136, 2018 12 31.
Article in English | MEDLINE | ID: mdl-30598094

ABSTRACT

BACKGROUND: During the identification of potential candidates, computational prediction of drug-target interactions (DTIs) is important for guiding the subsequent, expensive wet-lab validation. DTI screening considers four scenarios, depending on whether the drug is an existing or a new drug and whether the target is an existing or a new target. However, existing approaches have the following limitations. First, only a few of them can address the most difficult scenario (i.e., predicting interactions between new drugs and new targets). More importantly, none of the existing approaches can provide explicit information for understanding the mechanism of forming interactions, such as the drug-target feature pairs contributing to the interactions. RESULTS: In this paper, we propose a Triple Matrix Factorization-based model (TMF) to tackle these problems. Compared with former state-of-the-art predictive methods, TMF demonstrates its significant superiority in predictions on four benchmark datasets over four kinds of screening scenarios. It also shows better performance in the validation of predicted novel interactions. More importantly, by using PubChem fingerprints of chemical structures as drug features and occurrence frequencies of amino acid trimers as protein features, TMF shows its ability to find the features determining interactions, including dominant feature pairs, frequently occurring substructures, and conserved triplets of amino acids. CONCLUSIONS: Our TMF provides a unified framework of DTI prediction for all the screening scenarios. It also offers new insight into the underlying mechanism of DTIs by indicating dominant features, which play important roles in the formation of DTIs.


Subjects
Computational Biology/methods , Drug Discovery , Benchmarking
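
One way to read "drug-target feature pairs contributing to the interactions" is as a bilinear score, in which a learned matrix holds one weight per (drug feature, target feature) pair. The sketch below assumes that form, score = d^T Theta t. The abstract does not state the actual TMF formulation, so the matrix `theta`, the feature vectors, and both function names are hypothetical.

```python
def bilinear_score(drug_feat, theta, target_feat):
    """Interaction score d^T * Theta * t for one drug-target pair."""
    return sum(
        drug_feat[i] * theta[i][j] * target_feat[j]
        for i in range(len(drug_feat))
        for j in range(len(target_feat))
    )


def top_feature_pairs(drug_feat, theta, target_feat, k=3):
    """Return the k feature pairs with the largest absolute contribution."""
    contributions = [
        ((i, j), drug_feat[i] * theta[i][j] * target_feat[j])
        for i in range(len(drug_feat))
        for j in range(len(target_feat))
    ]
    contributions.sort(key=lambda item: -abs(item[1]))
    return contributions[:k]


# Hypothetical inputs: a 3-bit substructure fingerprint for the drug and
# two trimer frequencies for the target protein.
toy_drug = [1, 0, 1]
toy_target = [0.5, 0.2]
toy_theta = [
    [0.8, -0.1],
    [0.3, 0.4],
    [0.0, 1.5],
]

print(bilinear_score(toy_drug, toy_theta, toy_target))
print(top_feature_pairs(toy_drug, toy_theta, toy_target, k=2))
```

Because the score depends only on feature vectors, a model of this shape can score a new drug against a new target, and the per-pair contributions give a simple handle on which feature pairs dominate a prediction.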
16.
J Comput Biol ; 25(3): 253-269, 2018 03.
Article in English | MEDLINE | ID: mdl-29028179

ABSTRACT

Given a distance matrix M that represents evolutionary distances between any two species, an edge-weighted phylogenetic network N is said to satisfy M if between any pair of species, there exists a path in N with a length equal to the corresponding entry in M. In this article, we consider a special class of networks called one-articulated networks, which is a proper superset of galled trees. We show that if the distance matrix M is derived from an ultrametric one-articulated network N (i.e., for any species X and Y, the entry [Formula: see text] is equal to the shortest distance between X and Y in N), we can reconstruct a network that satisfies M in [Formula: see text] time, where n denotes the number of species; further, the reconstructed network is guaranteed to be the simplest, in the sense that the number of hybrid nodes is minimized. In addition, one may easily index a one-articulated network N with a minimum number of hybrid nodes in [Formula: see text] space, such that on any given phylogenetic tree T, we can determine whether T is contained in N (i.e., if a spanning subtree [Formula: see text] of N is a subdivision of T) in [Formula: see text] time.


Subjects
Neural Networks, Computer , Phylogeny , Algorithms
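
To make the phrase "distance matrix derived from a network" concrete, the sketch below computes shortest-path distances between species in a small edge-weighted graph using the Floyd-Warshall algorithm. This is the forward direction only (network to matrix), it treats edges as undirected, and it is a generic cubic-time routine, not the reconstruction algorithm of the article. The toy network and node numbering are invented.

```python
def shortest_distances(n_nodes, edges):
    """All-pairs shortest path lengths (Floyd-Warshall) on an undirected,
    edge-weighted graph given as (u, v, weight) triples."""
    inf = float("inf")
    dist = [
        [0.0 if i == j else inf for j in range(n_nodes)]
        for i in range(n_nodes)
    ]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n_nodes):
        for i in range(n_nodes):
            for j in range(n_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def derived_matrix(n_nodes, edges, species):
    """Distance matrix restricted to the nodes that represent species."""
    dist = shortest_distances(n_nodes, edges)
    return [[dist[a][b] for b in species] for a in species]


# Invented toy network: node 0 is the root, node 1 is an internal node,
# and nodes 2, 3, 4 are the species X, Y, Z.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (1, 3, 1.0), (0, 4, 2.0)]
toy_species = [2, 3, 4]

for row in derived_matrix(5, toy_edges, toy_species):
    print(row)
```

In this toy case every species lies at distance 2 from the root, and the derived matrix has X-Y at distance 2 and both X-Z and Y-Z at distance 4.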
17.
Genome Biol ; 18(1): 230, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29195502

ABSTRACT

We present a new method, OMSV, for accurately and comprehensively identifying structural variations (SVs) from optical maps. OMSV detects both homozygous and heterozygous SVs, SVs of various types and sizes, and SVs with or without creating or destroying restriction sites. We show that OMSV has high sensitivity and specificity, with clear performance gains over the latest method. Applying OMSV to a human cell line, we identified hundreds of SVs >2 kbp, with 68% of them missed by sequencing-based callers. Independent experimental validation confirmed the high accuracy of these SVs. The OMSV software is available at http://yiplab.cse.cuhk.edu.hk/omsv/.


Subjects
Genomic Structural Variation , Genomics/methods , Software , Computational Biology/methods , Computer Simulation , Genome, Human , Humans
18.
BMC Bioinformatics ; 18(Suppl 12): 409, 2017 Oct 16.
Article in English | MEDLINE | ID: mdl-29072137

ABSTRACT

BACKGROUND: Drug combination is one of the effective approaches for treating complex diseases. However, determining combinative drug pairs in clinical trials is still costly. Thus, computational approaches are used to identify potential drug pairs in advance. Existing computational approaches have the following shortcomings: (i) the lack of an effective integration of heterogeneous features leads to time-consuming training and can even result in an over-fitted classifier; and (ii) the narrow consideration of predicting potential drug combinations only among known drugs having known combinations cannot meet the demand of realistic screenings, which pay more attention to potential combinative pairs among new drugs that have no approved combination with other drugs at all. RESULTS: In this paper, to tackle the above two problems, we propose a novel drug-driven approach for predicting potential combinative pairs on a large scale. We define four new features based on heterogeneous data and design an efficient fusion scheme to integrate these features. More importantly, we design appropriate cross-validations for realistic screening scenarios of drug combinations involving both known drugs and new drugs. In addition, we perform an extra investigation to show how each kind of heterogeneous feature is related to combinative drug pairs. The investigation inspires the design of our approach. Experiments on real data demonstrate the effectiveness of our fusion scheme for integrating heterogeneous features and its predictive power in three scenarios of realistic screening. In terms of AUC and AUPR, the prediction among known drugs achieves 0.954 and 0.821, that between known drugs and new drugs achieves 0.909 and 0.635, and that among new drugs achieves 0.809 and 0.592 respectively.
CONCLUSIONS: Our approach provides not only an effective tool to integrate heterogeneous features but also the first tool to predict potential combinative pairs among new drugs.


Subjects
Computational Biology/methods , Drug Combinations , Drug Evaluation, Preclinical , Databases as Topic , Humans
19.
Sci Rep ; 7(1): 4536, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28674428

ABSTRACT

Although multilocus sequence typing (MLST) is highly discriminatory and useful for outbreak investigations and epidemiological surveillance, it has always been controversial whether clustering and phylogeny inferred from the MLST gene loci can represent the real phylogeny of bacterial strains. In this study, we compare the phylogenetic trees constructed using three approaches, (1) concatenated blocks of homologous sequence shared between the bacterial genomes, (2) genome single-nucleotide polymorphism (SNP) profiles and (3) concatenated nucleotide sequences of gene loci in the corresponding MLST schemes, for 10 bacterial species with >30 complete genome sequences available. Major differences in strain clustering at more than one position were observed between the phylogeny inferred using genome/SNP data and MLST for all 10 bacterial species. The Shimodaira-Hasegawa test revealed significant differences between the topologies of the genome and MLST trees for nine of the 10 bacterial species, and significant differences between the topologies of the SNP and MLST trees were present for all 10 bacterial species. Matching Clusters and R-F Clusters metrics showed that the distances between the genome/SNP and MLST trees were larger than those between the SNP and genome trees. Phylogeny inferred from MLST failed to represent genome phylogeny within the same bacterial species.


Subjects
Bacteria/classification , Bacteria/genetics , Multilocus Sequence Typing , Phylogeny , Genome, Bacterial , Genomics/methods , Multilocus Sequence Typing/methods , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, DNA
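
The cluster-based Robinson-Foulds (R-F) comparison of rooted trees mentioned in this abstract can be sketched with the textbook definition: collect the leaf set under every internal node of each tree and count the clusters found in only one of them. The study presumably used dedicated software, and some tools report half of this count. The nested-tuple tree encoding, the function names, and the strain labels below are assumptions made for illustration.

```python
def clusters(tree):
    """Non-trivial clusters of a rooted tree given as nested tuples.

    A leaf is a string, an internal node is a tuple of children. The
    cluster of an internal node is the set of leaves below it. The
    cluster containing every leaf is dropped because all trees on the
    same leaves share it.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found


def rf_clusters_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


# Hypothetical trees over four strains.
genome_tree = ((("s1", "s2"), "s3"), "s4")
mlst_tree = ((("s1", "s3"), "s2"), "s4")

print(rf_clusters_distance(genome_tree, mlst_tree))
```

The two toy trees agree on the cluster {s1, s2, s3} but disagree on which pair of strains is grouped first, so each tree contributes one unshared cluster.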