1.
J Biomed Inform ; 149: 104570, 2024 01.
Article in English | MEDLINE | ID: mdl-38096944

ABSTRACT

Medication recommendation using Electronic Health Records (EHR) is challenging because medical data are complex. Current approaches extract longitudinal information from patient EHR to personalize recommendations. However, existing models often lack sufficient patient representation and overlook the similarity between a patient's medication records and specific medicines. Therefore, this paper proposes an Attention-guided Collaborative Decision Network (ACDNet) for medication recommendation. Specifically, ACDNet uses an attention mechanism and a Transformer to capture patient health conditions and medication records by modeling historical visits at both global and local levels. ACDNet also employs a collaborative decision framework that uses the similarity between medication records and medicine representations to facilitate the recommendation process. Experimental results on two extensive medical datasets, MIMIC-III and MIMIC-IV, show that ACDNet outperforms state-of-the-art models in terms of Jaccard, PR-AUC, and F1 score. Ablation experiments provide evidence that each module in ACDNet contributes to the overall performance. Furthermore, a detailed case study reinforces the effectiveness of ACDNet in medication recommendation based on EHR data and its practical value in real-world healthcare scenarios.
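
The Jaccard and F1 scores named in this abstract can be illustrated on a single visit. The sketch below is not taken from the paper: it assumes each recommendation is a set of medication codes, the codes themselves are made up, and returning 0.0 for an empty comparison is a convention chosen here. PR-AUC is left out because it needs per-medicine scores rather than a final set.

```python
def jaccard(true_meds, pred_meds):
    """Jaccard similarity between prescribed and recommended medication sets."""
    true_set, pred_set = set(true_meds), set(pred_meds)
    union = true_set | pred_set
    if not union:
        return 0.0  # assumption: two empty sets score 0.0
    return len(true_set & pred_set) / len(union)


def f1_score(true_meds, pred_meds):
    """Harmonic mean of precision and recall over medication sets."""
    true_set, pred_set = set(true_meds), set(pred_meds)
    overlap = len(true_set & pred_set)
    if overlap == 0:
        return 0.0  # also covers empty inputs
    precision = overlap / len(pred_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical medication codes for one visit.
prescribed = {"A01", "B02", "C03"}
recommended = {"B02", "C03", "D04"}
visit_jaccard = jaccard(prescribed, recommended)
visit_f1 = f1_score(prescribed, recommended)
```

In a full evaluation these per-visit values would typically be averaged over all visits and patients; the abstract does not state the averaging scheme.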

2.
BMC Bioinformatics ; 21(1): 566, 2020 Dec 09.
Article in English | MEDLINE | ID: mdl-33297947

ABSTRACT

BACKGROUND: Drug repositioning is an important and efficient method for discovering new uses of known drugs. Researchers have typically been limited to a single type of collaborative filtering (CF) model for drug repositioning: either neighborhood-based approaches, which are good at mining the local information contained in a few strong drug-disease associations, or latent-factor-based models, which effectively capture the global information shared by a majority of drug-disease associations. Few researchers have combined these two types of CF models into a hybrid model that offers the advantages of both. In addition, the cold start problem has always been a major challenge in computational drug repositioning and restricts the inference ability of the relevant models. RESULTS: Inspired by the memory network, we propose the hybrid attentional memory network (HAMN) model, a deep architecture that combines the two classes of CF models in a nonlinear manner. First, the memory unit and the attention mechanism are combined to generate a neighborhood contribution representation that captures the local structure of a few strong drug-disease associations. Then a variant of the autoencoder is used to extract the latent factors of drugs and diseases, capturing the overall information shared by a majority of drug-disease associations. During this process, ancillary information about drugs and diseases can help alleviate the cold start problem. Finally, in the prediction stage, the neighborhood contribution representation is coupled with the drug latent factor and the disease latent factor to produce predicted values. Comprehensive experimental results on two data sets demonstrate that the proposed HAMN model outperforms the comparison models on the AUC, AUPR, and HR indicators.
CONCLUSIONS: Based on its performance on two drug repositioning data sets, we believe that the HAMN model offers a new solution for improving the prediction accuracy of drug-disease associations and gives pharmaceutical personnel a new perspective for developing new drugs.


Subject(s)
Algorithms , Computational Biology/methods , Drug Repositioning , Databases as Topic , Humans , Statistics as Topic
3.
BMC Bioinformatics ; 20(1): 423, 2019 Aug 14.
Article in English | MEDLINE | ID: mdl-31412762

ABSTRACT

BACKGROUND: Computational drug repositioning, which aims to find new applications for existing drugs, is gaining attention from pharmaceutical companies because of its low attrition rate, reduced cost, and shorter timelines for novel drug discovery. A growing number of researchers are applying the concept of recommendation systems to drug repositioning. Nevertheless, some challenges remain: 1) Learning ability deficiencies: the adopted model cannot learn higher-level drug-disease associations from the data. 2) Data sparseness limits the generalization ability of the model. 3) The model overfits easily if the effect of negative samples is not taken into consideration. RESULTS: In this study, we propose a novel method for computational drug repositioning, Additional Neural Matrix Factorization (ANMF). The ANMF model uses drug-drug similarities and disease-disease similarities to enhance the representation information of drugs and diseases in order to overcome data sparsity. By means of a variant of the autoencoder, we uncover the hidden features of both drugs and diseases. The extracted hidden features then participate in a collaborative filtering process that incorporates the Generalized Matrix Factorization (GMF) method, yielding a model with a stronger learning ability. Finally, negative sampling techniques are employed to strengthen the training set and reduce the likelihood of overfitting. Experimental results on the Gottlieb and Cdataset datasets show that the ANMF model outperforms state-of-the-art methods.
CONCLUSIONS: Based on its performance on two real-world datasets, we believe that the proposed model will play a role in addressing the major challenge in drug repositioning, which lies in predicting and choosing new therapeutic indications to prospectively test for a drug of interest.
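
Two ingredients mentioned in this abstract, a GMF-style score and negative sampling, can be sketched roughly as follows. The abstract gives no formulas, so this assumes the usual GMF form from neural collaborative filtering (a sigmoid over a weighted element-wise product of two latent vectors) and uniform sampling of unobserved drug-disease pairs. All function names, vector values, and index ranges are hypothetical, and nothing here reproduces ANMF itself.

```python
import math
import random


def gmf_score(drug_vec, disease_vec, weights):
    """Sigmoid of the weighted element-wise product of two latent vectors."""
    z = sum(w * p * q for w, p, q in zip(weights, drug_vec, disease_vec))
    return 1.0 / (1.0 + math.exp(-z))


def sample_negatives(known_pairs, n_drugs, n_diseases, n_samples, seed=0):
    """Draw distinct (drug, disease) index pairs that are not known associations.

    Assumes every known pair lies inside the n_drugs x n_diseases grid.
    """
    known = set(known_pairs)
    if n_samples > n_drugs * n_diseases - len(known):
        raise ValueError("not enough unobserved pairs to sample from")
    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < n_samples:
        pair = (rng.randrange(n_drugs), rng.randrange(n_diseases))
        if pair not in known:
            negatives.add(pair)
    return sorted(negatives)


# Hypothetical toy data: 3 drugs, 4 diseases, 2 known associations.
known_associations = [(0, 0), (1, 2)]
negative_pairs = sample_negatives(known_associations, 3, 4, n_samples=5)
score = gmf_score([0.2, -0.1], [0.4, 0.3], [1.0, 1.0])
```

Treating unobserved pairs as negatives is itself an assumption, since some of them may be true but undiscovered associations.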


Subject(s)
Algorithms , Computational Biology/methods , Drug Repositioning , Databases as Topic , Drug Discovery , Humans , Models, Theoretical , Reproducibility of Results
4.
Biomed Res Int ; 2015: 253854, 2015.
Article in English | MEDLINE | ID: mdl-26000284

ABSTRACT

With the continuous development of biological experiment technology, more and more data related to uncertain biological networks need to be analyzed. However, most current alignment methods are designed for deterministic biological networks, and only a few can solve the probabilistic network alignment problem. Moreover, these approaches use only part of the probabilistic data in the original networks, allowing only one of the two networks to be probabilistic. To overcome this weakness, an improved method called completely probabilistic biological network comparison alignment (C_PBNA) is proposed in this paper. The new method builds on probabilistic biological network alignment (PBNA) and is designed for completely probabilistic biological network alignment, in order to take full advantage of the uncertain information in biological networks. The degree of consistency (agreement) indicates that C_PBNA can find results neglected by the PBNA algorithm. Furthermore, GO consistency (GOC) and global network alignment score (GNAS) were selected as evaluation criteria, and both showed that C_PBNA obtains more biologically significant results than the PBNA algorithm.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Uncertainty , Algorithms , Gene Ontology , Time Factors
5.
BMC Syst Biol ; 8 Suppl 3: S6, 2014.
Article in English | MEDLINE | ID: mdl-25350277

ABSTRACT

BACKGROUND: Motif mining has always been a hot research topic in bioinformatics. Most current research on biological networks focuses on exact motif mining. However, because of inevitable experimental error and noisy data, biological network data represented as a probability model better reflect authenticity and biological significance; it is therefore more biologically meaningful to discover probability motifs in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery, which is usually based on the possible world model and has relatively high computational complexity. METHODS: In this paper, we present a novel method, based on circuit simulation, for detecting frequent probability patterns in uncertain biological networks. First, a partition-based efficient search is applied to mining non-tree-like subgraphs whose probability of occurrence in random networks is small. Then, a probability isomorphism algorithm based on circuit simulation is proposed. The algorithm combines analysis of circuit topology with the related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs, and it avoids the traditional possible world model. Finally, based on the probability subgraph isomorphism algorithm, a two-step hierarchical clustering method is used to cluster subgraphs and discover frequent probability patterns from the clusters. RESULTS: Experimental results on data sets of Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover frequent probability subgraphs. The discovered subgraphs contain all probability motifs reported in the experiments published in other related papers.
CONCLUSIONS: The circuit-simulation-based algorithm for evaluating probability graph isomorphism excludes most subgraphs that are not probability-isomorphic and reduces the search space of probability-isomorphic subgraphs by using the mismatch values in the node voltage set. It is an innovative way to find frequent probability patterns and can be applied efficiently to probability motif discovery problems in further studies.


Subject(s)
Algorithms , Computational Biology/methods , Pattern Recognition, Automated/methods , Uncertainty , Cluster Analysis , Computer Graphics , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Regulatory Networks , Protein Interaction Mapping , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
6.
Int J Data Min Bioinform ; 9(2): 172-98, 2014.
Article in English | MEDLINE | ID: mdl-24864377

ABSTRACT

Local protein structure prediction is one of the important tasks in bioinformatics research. To further enhance the performance of local protein structure prediction, we propose the Multi-level Clustering Support Vector Machine Trees (MLSVMTs). Building on the multi-cluster tree structure, the MLSVMTs model uses multiple SVMs, each of which is customized to learn the unique sequence-to-structure relationship for one cluster. Both the combined 5 x 2 CV F test and the independent test show that the local structure prediction accuracy of MLSVMTs is significantly better than that of one-level K-means clustering, Multi-level clustering and Clustering Support Vector Machines.


Subject(s)
Databases, Protein , Protein Conformation , Proteins/chemistry , Proteins/genetics , Support Vector Machine
7.
BMC Bioinformatics ; 13 Suppl 10: S19, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22759424

ABSTRACT

BACKGROUND: Most computational algorithms focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain core/attachment structures. METHODS: In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves prediction accuracy compared with other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure they produce cannot provide adequate information to identify whether a network belongs to a module structure or not. To speed up the computation, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge-weight-based GSM-FC method uses a greedy procedure that traverses all edges just once to separate the network into a suitable set of modules. RESULTS: The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate than other competing algorithms.
CONCLUSIONS: Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into a suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce computational time significantly while keeping high prediction accuracy.
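
The idea of visiting every edge exactly once and merging nodes by edge weight can be illustrated with a union-find pass. This is a stand-in chosen for illustration only, not the GSM-FC algorithm: the abstract does not give its edge weight definition or merge criteria, so the fixed `threshold` rule, the function name, and the toy protein labels below are all assumptions.

```python
def greedy_modules(weighted_edges, threshold):
    """Group nodes into modules with a single pass over the weighted edges.

    Two endpoints are merged when their edge weight reaches `threshold`
    (a hypothetical criterion, not the one used by GSM-FC).
    """
    parent = {}

    def find(node):
        # Register unseen nodes, then follow parents with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for u, v, weight in weighted_edges:  # each edge is examined once
        root_u, root_v = find(u), find(v)
        if weight >= threshold and root_u != root_v:
            parent[root_u] = root_v

    modules = {}
    for node in list(parent):
        modules.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in modules.values())


# Hypothetical weighted interactions between proteins P1..P5.
edges = [("P1", "P2", 0.9), ("P2", "P3", 0.8), ("P3", "P4", 0.2), ("P4", "P5", 0.7)]
modules = greedy_modules(edges, threshold=0.5)
```

With a fixed cutoff this reduces to finding connected components of the thresholded network, which is far simpler than a method that also checks module quality; it only shows why a single edge traversal can be fast.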


Subject(s)
Algorithms , Cluster Analysis , Computational Biology/methods , Protein Interaction Mapping/methods , Data Mining , Protein Interaction Maps , Saccharomyces cerevisiae/metabolism
8.
Int J Data Min Bioinform ; 4(3): 316-30, 2010.
Article in English | MEDLINE | ID: mdl-20681482

ABSTRACT

Many algorithms and techniques for discovering motifs require a predefined, fixed window size. Because of the fixed size, these approaches often deliver a number of similar motifs that are simply shifted by some bases or that include mismatches. To address the mismatched motifs problem, we use the super-rule concept to construct a Super-Rule-Tree (SRT) with a modified Hybrid Hierarchical K-means (HHK) clustering algorithm, which requires no parameter set-up to identify the similarities and dissimilarities between the motifs. Analysis of the motifs generated by our approach shows that they are significant not only in sequence but also in secondary structure similarity.


Subject(s)
Algorithms , Amino Acid Motifs , Proteins/chemistry , Sequence Analysis, Protein/methods , Amino Acid Sequence , Cluster Analysis , Pattern Recognition, Automated/methods , Protein Structure, Secondary
9.
Int J Comput Biol Drug Des ; 2(2): 187-203, 2009.
Article in English | MEDLINE | ID: mdl-20090170

ABSTRACT

In microarray data analysis, filter methods with low time complexity neglect correlation among genes. In some methods, the metrics used to calculate correlation cannot effectively reflect functional similarity among genes, and the time complexity depends on the whole gene set. Therefore, a novel selection model called Mutual-Information-based Minimum Spanning Trees (MIMST) is proposed in this paper. It first uses filter methods to remove non-relevant genes, then computes the interdependence of the top-ranked genes and eliminates the redundant ones. The empirical results show that MIMST can find the smallest significant gene subset with higher classification accuracy than other methods.
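
The interdependence between two genes that a mutual-information-based method relies on can be illustrated with the standard discrete definition. The sketch assumes expression values have already been discretized (here into 0/1 levels). The abstract does not describe the discretization or how the spanning tree is built from these values, so only the pairwise measure is shown, and the gene vectors are invented.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), joint in count_xy.items():
        # p(a, b) / (p(a) * p(b)) simplifies to joint * n / (count_a * count_b)
        mi += (joint / n) * math.log2(joint * n / (count_x[a] * count_y[b]))
    return mi


# Hypothetical discretized expression levels across four samples.
gene_a = [0, 0, 1, 1]
gene_b = [0, 0, 1, 1]  # copies gene_a, so it is fully redundant
gene_c = [0, 1, 0, 1]  # independent of gene_a in this toy sample
redundant_mi = mutual_information(gene_a, gene_b)
independent_mi = mutual_information(gene_a, gene_c)
```

A high value between two top-ranked genes suggests that one of them adds little information, which is the intuition behind eliminating redundant genes.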


Subject(s)
Computational Biology/methods , Data Interpretation, Statistical , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Humans
10.
IEEE Trans Nanobioscience ; 6(2): 168-79, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17695753

ABSTRACT

SecA is an important component of protein translocation in bacteria and exists in soluble and membrane-integrated forms. Most membrane prediction programs predict SecA to be a soluble protein, with the exception of TMpred and TopPred. However, the membrane-associated segments predicted by TMpred and TopPred are inconsistent across bacterial species in spite of high sequence homology. In this paper we describe a new method for membrane protein prediction, PSSM_SVM, which provides consistent results for integral membrane domains of SecAs across bacterial species. This PSSM encoding scheme demonstrates the highest accuracy in terms of Q2 among the common prediction methods and produces consistent results on blind test data. None of the previously described methods showed this kind of consistency when tested against the same blind test set. The scheme accurately predicts traditional transmembrane segments and most soluble proteins. Applied to the membrane-associated protein SecA, the PSSM scheme shows characteristic features. In the set of 223 known SecA sequences, the PSSM_SVM prediction scheme predicts membrane-embedded segments of eight to nine residues. This predicted region is part of a 12-residue helix from known X-ray crystal structures of SecAs. This information could be important for determining the structure of SecA proteins in the membrane, which have different conformational properties from other transmembrane proteins, as well as of other soluble proteins that may similarly integrate into lipid bilayers.


Subject(s)
Adenosine Triphosphatases/chemistry , Bacterial Proteins/chemistry , Cell Membrane/chemistry , Membrane Transport Proteins/chemistry , Models, Chemical , Models, Molecular , Sequence Analysis, Protein/methods , Adenosine Triphosphatases/metabolism , Amino Acid Sequence , Artificial Intelligence , Bacterial Proteins/metabolism , Computer Simulation , Membrane Transport Proteins/metabolism , Molecular Sequence Data , Pattern Recognition, Automated , SEC Translocation Channels , SecA Proteins , Solubility
11.
IEEE Trans Nanobioscience ; 5(1): 46-53, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16570873

ABSTRACT

Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction. However, the poor comprehensibility of the SVM hinders its success in protein structure prediction. An explanation of how a decision is made is important for the acceptance of machine learning technology, especially in applications such as bioinformatics. A reasonable interpretation is not only useful for guiding "wet experiments"; the extracted rules also help integrate computational intelligence with symbolic AI systems for advanced deduction. A decision tree, on the other hand, has good comprehensibility. In this paper, a novel approach to rule generation for protein secondary structure prediction that integrates the merits of both the SVM and the decision tree is presented. This approach combines the SVM with a decision tree into a new algorithm called SVM_DT, which proceeds in three steps. The algorithm first trains an SVM. Then, a new training set is generated through careful selection from the output of the SVM. Finally, the obtained training set is used to train a decision tree learning system and to extract the corresponding rule sets. The results of protein secondary structure prediction experiments on the RS126 data set show that the comprehensibility of SVM_DT is much better than that of the SVM. Moreover, the generalization ability of SVM_DT is better than that of C4.5 decision trees and similar to that of the SVM. Hence, SVM_DT can be used not only for prediction but also for guiding biological experiments.


Subject(s)
Artificial Intelligence , Decision Support Techniques , Models, Chemical , Models, Molecular , Protein Structure, Secondary , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Computer Simulation , Pattern Recognition, Automated , Proteins/ultrastructure