Results 1 - 18 of 18
1.
J Comput Biol ; 30(8): 926-936, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37466461

ABSTRACT

Clinical trials indicate that the dysregulation of microRNAs (miRNAs) is closely associated with the development of diseases. Therefore, predicting miRNA-disease associations is significant for studying the pathogenesis of diseases. Since traditional wet-lab methods are resource-intensive, cost-saving computational models can be an effective complementary tool in biological experiments. In this work, a method based on locality-constrained linear coding (ILLCEL) is proposed to predict these associations. ILLCEL adopts miRNA sequence similarity, miRNA functional similarity, disease semantic similarity, and interaction profile similarity obtained by locality-constrained linear coding (LLC) as prior information. Next, features and similarities extracted from multiple perspectives are input to the ensemble learning framework to improve the comprehensiveness of the prediction. Significantly, the introduction of hypergraph-regular terms improves the accuracy of prediction by describing complex associations between samples. The results under fivefold cross validation indicate that ILLCEL achieves superior prediction performance. In case studies, known associations are accurately predicted and novel associations are verified in HMDD v3.2, miRCancer, and existing literature. It is concluded that ILLCEL can serve as a powerful tool for inferring potential associations.

2.
Article in English | MEDLINE | ID: mdl-37459265

ABSTRACT

An increasing number of microRNAs (miRNAs) have been confirmed to be inextricably linked to various diseases, and the discovery of their associations has become a routine way of treating diseases. To overcome the time-consuming and laborious shortcomings of traditional experiments in verifying the associations of miRNAs and diseases (MDAs), a variety of computational methods have emerged. However, these methods still have many shortcomings in terms of predictive performance and accuracy. In this study, a model based on multiple graph convolutional networks and random forest (MGCNRF) was proposed for the prediction of MDAs. Specifically, MGCNRF first mapped miRNA functional similarity and sequence similarity, disease semantic similarity and target similarity, and the known MDAs into four different two-layer heterogeneous networks. Second, MGCNRF fed the four heterogeneous networks into four different layered attention graph convolutional networks (GCNs), respectively, to extract MDA embeddings. Finally, MGCNRF integrated the embeddings of every MDA into the features of the miRNA-disease pair and predicted potential MDAs through the random forest (RF). Fivefold cross-validation was applied to verify the prediction performance of MGCNRF, which outperforms the other seven state-of-the-art methods by area under curve. Furthermore, the accuracy and the case studies of different diseases further demonstrate the scientific rationale of MGCNRF. In conclusion, MGCNRF can serve as a scientific tool for predicting potential MDAs.
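
The abstract above ranks methods by area under curve (AUC). As a minimal sketch of what that metric measures, assuming binary labels (1 = known association, 0 = unknown) and one prediction score per miRNA-disease pair, the AUC can be computed as the fraction of positive-negative pairs that are ranked correctly. The function name `auc_score` and the toy data in the usage note are hypothetical and are not taken from MGCNRF.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This is the O(P*N) pairwise formulation,
    which is fine for a toy example but slow for large test folds.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, `auc_score([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])` ranks three of the four positive-negative pairs correctly. In a fivefold cross-validation setting, one would typically compute this once per held-out fold and average the five values.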

3.
BMC Genomics ; 24(1): 426, 2023 Jul 29.
Article in English | MEDLINE | ID: mdl-37516822

ABSTRACT

Comprehensive analysis of multiple data sets can identify potential driver genes for various cancers. In recent years, driver gene discovery based on massive mutation data and gene interaction networks has attracted increasing attention, but there is still a need to explore combining functional and structural information of genes in protein interaction networks to identify driver genes. Therefore, we propose a network embedding framework combining functional and structural information to identify driver genes. Firstly, we combine the mutation data and gene interaction networks to construct a mutation integration network using a network propagation algorithm. Secondly, the struc2vec model is used for extracting gene features from the mutation integration network, which contains both functional and structural information of genes. Finally, machine learning algorithms are utilized to identify the driver genes. Compared with four excellent previous methods, our method can find gene pairs that are distant from each other through structural similarities and has better performance in identifying driver genes for 12 cancers in The Cancer Genome Atlas. At the same time, we also conduct a comparative analysis of three gene interaction networks, three gene standard sets, and five machine learning algorithms. Our framework provides a new perspective for feature selection to identify novel driver genes.


Subject(s)
Algorithms , Gene Regulatory Networks , Genetic Association Studies , Machine Learning , Protein Interaction Mapping
4.
IEEE J Biomed Health Inform ; 27(6): 2968-2979, 2023 06.
Article in English | MEDLINE | ID: mdl-37030856

ABSTRACT

In this study, we proposed a novel method called the graph capsule convolutional network (GCCN) to predict the progression from mild cognitive impairment to dementia and identify its pathogenesis. First, we proposed a novel risk gene discovery component to indirectly target genes with higher interactions with others. These risk genes and brain regions were collected as nodes to construct heterogeneous pathogenic information association graphs. Second, the graph capsules were established by projecting heterogeneous pathogenic information into a set of disentangled latent components. The orientation and length of capsules are representations of the format and intensity of pathogenic information. Third, a graph capsule convolution network was used to model the information flows among pathogenic factors and to elaborate the convergence of primary capsules to advanced capsules. The advanced capsule is a concept that organizes pathogenic information based on its consistency, and the synergistic effects of advanced capsules direct the development of the disease. Finally, discriminative pathogenic information flows were captured by a straightforward built-in interpretation mechanism, i.e., the dynamic routing mechanism, and applied to the identification of pathogenesis. GCCN has been experimentally shown to be significantly advanced on public datasets. Further experiments have shown that the pathogenic factors identified by GCCN are evidential and closely related to progressive mild cognitive impairment.


Subject(s)
Cognitive Dysfunction , Humans , Capsules , Cognitive Dysfunction/diagnostic imaging , Cognitive Dysfunction/genetics , Diagnostic Imaging
5.
Comput Biol Chem ; 104: 107862, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37031647

ABSTRACT

Single-cell RNA sequencing technology provides a tremendous opportunity for studying disease mechanisms at the single-cell level. Cell type identification is a key step in the research of disease mechanisms. Many clustering algorithms have been proposed to identify cell types. Most clustering algorithms perform similarity calculation before cell clustering. Because clustering and similarity calculation are independent, a low-rank matrix obtained only by similarity calculation may be unable to fully reveal the patterns in single-cell data. In this study, to capture accurate single-cell clustering information, we propose a novel method based on a low-rank representation model, called KGLRR, that combines the low-rank representation approach with K-means clustering. The cluster centroid is updated as the cell dimension decreases to better form new clusters and improve the quality of clustering information. In addition, the low-rank representation model ignores local geometric information, so the graph regularization constraint is introduced. KGLRR is tested on both simulated and real single-cell datasets to validate the effectiveness of the new method. The experimental results show that KGLRR is more robust and accurate in cell type identification than other advanced algorithms.


Subject(s)
Algorithms , Cluster Analysis
6.
Genes (Basel) ; 13(12)2022 12 04.
Article in English | MEDLINE | ID: mdl-36553553

ABSTRACT

Simulation experiments are essential to evaluate epistasis detection methods, which is the main way to prove their effectiveness and move toward practical applications. However, due to the lack of effective simulators, especially for simulating models without marginal effects (eNME models), epistasis detection methods can hardly verify their effectiveness through simulation experiments. In this study, we propose a resampling simulation method (EpiReSIM) for generating the eNME model. First, EpiReSIM provides two strategies for solving eNME models. One is to calculate eNME models using prevalence constraints, and the other is to use joint constraints of prevalence and heritability. We transform the computation of the model into the problem of solving an underdetermined system of equations. Introducing the complete orthogonal decomposition method and Newton's method, EpiReSIM calculates the solution of the underdetermined system of equations to obtain the eNME model, especially the solution of the high-order model, which is the highlight of EpiReSIM. Second, based on the computed eNME model, EpiReSIM generates simulation data by a resampling method. Experimental results show that EpiReSIM has advantages in preserving the biological properties of minor allele frequencies and calculating high-order models, and it is a convenient and effective alternative method for current simulation software.


Subject(s)
Epistasis, Genetic , Software , Computer Simulation
7.
Genes (Basel) ; 13(12)2022 12 18.
Article in English | MEDLINE | ID: mdl-36553670

ABSTRACT

Epistatic interactions refer to interactions among SNPs (single nucleotide polymorphisms) that affect disease development and trait expression nonlinearly, and hence identifying epistatic interactions plays a great role in explaining the pathogenesis and genetic heterogeneity of complex diseases. Many methods have been proposed for epistasis detection; nevertheless, they mainly focus on low-order epistatic interactions, two-order or three-order for instance, and often ignore high-order interactions due to computational burden. In this paper, a module detection method called MDSN is proposed for identifying high-order epistatic interactions. First, an SNP network is constructed by an interaction-complementary construction strategy; the network consists of low-order SNP interactions that can be obtained from fast computations. Then, a node evaluation measure that integrates multi-topological features is proposed to improve the node expansion algorithm, where the importance of a node is comprehensively evaluated by the topological characteristics of the neighborhood. Finally, modules are detected in the constructed SNP network, which have high-order epistatic interactions associated with the disease. The MDSN was compared with four state-of-the-art methods on simulation datasets and a real Age-related Macular Degeneration dataset. The results demonstrate that MDSN has higher performance in detecting high-order interactions.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Genome-Wide Association Study/methods , Algorithms , Computer Simulation , Phenotype
8.
BMC Genomics ; 23(1): 686, 2022 Oct 05.
Article in English | MEDLINE | ID: mdl-36199016

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) have been confirmed to be inextricably linked to the emergence of human complex diseases. The identification of the disease-related miRNAs has gradually become a routine way to unveil the genetic mechanisms of examined disorders. METHODS: In this study, a method called BLNIMDA, based on a weighted bi-level network, was proposed for predicting hidden associations between miRNAs and diseases. For this purpose, the known associations between miRNAs and diseases as well as integrated similarities between miRNAs and diseases are mapped into a bi-level network. Based on the developed bi-level network, the miRNA-disease associations (MDAs) are defined as strong associations, potential associations and no associations. Then, each miRNA-disease pair (MDP) is assigned two information properties according to the bidirectional information distribution strategy, i.e., associations of miRNA towards disease and vice-versa. Finally, two affinity weights for each MDP obtained from the information properties and the association type are averaged as the final association score of the MDP. Highlights of the BLNIMDA lie in the definition of MDA types, and the introduction of affinity weights evaluation from the bidirectional information distribution strategy and defined association types, which ensure the comprehensiveness and accuracy of the final prediction score of MDAs. RESULTS: Five-fold cross-validation and leave-one-out cross-validation are used to evaluate the performance of the BLNIMDA. The results of the Area Under Curve show that the BLNIMDA has many advantages over the other seven selected computational methods. Furthermore, the case studies based on four common diseases and miRNAs prove that the BLNIMDA has good predictive performance. CONCLUSIONS: Therefore, the BLNIMDA is an effective method for predicting hidden MDAs.


Subject(s)
MicroRNAs , Algorithms , Computational Biology/methods , Genetic Predisposition to Disease , Humans , MicroRNAs/genetics
9.
Genes (Basel) ; 13(5)2022 05 12.
Article in English | MEDLINE | ID: mdl-35627256

ABSTRACT

In genome-wide association studies, epistasis detection is of great significance for the occurrence and diagnosis of complex human diseases, but it also faces challenges such as high dimensionality and a small data sample size. In order to cope with these challenges, several swarm intelligence methods have been introduced to identify epistasis in recent years. However, the existing methods still have some limitations, such as high-consumption and premature convergence. In this study, we proposed a multi-objective artificial bee colony (ABC) algorithm based on the scale-free network (SFMOABC). The SFMOABC incorporates the scale-free network into the ABC algorithm to guide the update and selection of solutions. In addition, the SFMOABC uses mutual information and the K2-Score of the Bayesian network as objective functions, and the opposition-based learning strategy is used to improve the search ability. Experiments were performed on both simulation datasets and a real dataset of age-related macular degeneration (AMD). The results of the simulation experiments showed that the SFMOABC has better detection power and efficiency than seven other epistasis detection methods. In the real AMD data experiment, most of the single nucleotide polymorphism combinations detected by the SFMOABC have been shown to be associated with AMD disease. Therefore, SFMOABC is a promising method for epistasis detection.


Subject(s)
Epistasis, Genetic , Macular Degeneration , Algorithms , Bayes Theorem , Genome-Wide Association Study , Humans , Macular Degeneration/diagnosis , Macular Degeneration/genetics , Polymorphism, Single Nucleotide/genetics
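
Entry 9 above names mutual information as one of the two objective functions of SFMOABC. As a rough sketch of that quantity only (not of the SFMOABC search itself), the mutual information between a combination of SNP genotypes and a binary phenotype can be estimated from sample counts via I(X; Y) = H(X) + H(Y) - H(X, Y). The function names and the toy data below are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(genotypes, phenotypes):
    """I(X; Y) = H(X) + H(Y) - H(X, Y).

    `genotypes` holds one tuple of genotype codes per sample (the SNP
    combination X); `phenotypes` holds the case/control label Y per sample.
    """
    combos = [tuple(g) for g in genotypes]
    joint = list(zip(combos, phenotypes))
    return entropy(combos) + entropy(phenotypes) - entropy(joint)
```

On an XOR-like toy dataset, `mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 1, 1, 0])` is 1 bit, while either SNP taken alone gives 0 bits. This is the kind of interaction without marginal effects that makes epistasis detection hard for single-locus tests.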
10.
Article in English | MEDLINE | ID: mdl-32857698

ABSTRACT

Hundreds of thousands of single nucleotide polymorphisms (SNPs) are currently available for genome-wide association study (GWAS). Detecting disease-associated SNP-SNP interactions is considered an important way to capture the underlying genetic causes of complex diseases. In the combinatorially explosive search space, evolutionary algorithms are promising in solving this difficult problem because of their controllable time complexity. However, in existing evolutionary algorithms, some possible SNP-SNP interactions are evaluated multiple times by the fitness function. Such reevaluations not only waste computing resources but also make these algorithms prone to falling into local optima. To tackle this drawback, a progressive screening memetic algorithm (PSMA) is proposed in the paper. PSMA first represents all possible SNP-SNP interactions in a constructed graph. Then, the proposed algorithm uses the progressive screening strategy to guarantee that every possible SNP-SNP interaction can only be evaluated once by reducing the constructed graph. Furthermore, two types of local search algorithms are introduced to enhance the detecting power of PSMA. For detecting disease-associated SNP-SNP interactions, experimental results show that our proposed method outperforms other existing state-of-the-art methods in terms of accuracy and time.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Algorithms , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics
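
Entry 10 above describes representing candidate SNP-SNP interactions as a graph and shrinking that graph so that no interaction is scored twice. The sketch below illustrates only that general idea on a toy scale: each candidate pair is an edge, and an edge is removed as soon as it has been evaluated. It is not the PSMA screening strategy itself, and `screen_pairs_once`, `fitness`, and `budget` are hypothetical names.

```python
import itertools
import random


def screen_pairs_once(num_snps, fitness, budget, seed=0):
    """Evaluate up to `budget` random SNP pairs, never the same pair twice."""
    rng = random.Random(seed)
    # Each edge of the complete graph on SNP indices is one candidate pair.
    # Enumerating all edges is only feasible for small toy inputs.
    remaining = list(itertools.combinations(range(num_snps), 2))
    rng.shuffle(remaining)
    evaluated = {}
    while remaining and len(evaluated) < budget:
        pair = remaining.pop()  # removing the edge shrinks the graph
        evaluated[pair] = fitness(pair)
    return evaluated
```

Because evaluated edges leave the candidate pool, the number of fitness calls is bounded by the number of distinct pairs, whatever the budget.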
11.
Biomed Res Int ; 2020: 5610658, 2020.
Article in English | MEDLINE | ID: mdl-32908899

ABSTRACT

Detecting SNP-SNP interactions associated with disease is significant in genome-wide association study (GWAS). Owing to intensive computational burden and diversity of disease models, existing methods suffer from low detection power and long running time. To tackle these drawbacks, a fast self-adaptive memetic algorithm (SAMA) is proposed in this paper. In this method, the crossover, mutation, and selection of the standard memetic algorithm are improved to make SAMA adapt to the detection of SNP-SNP interactions associated with disease. Furthermore, a self-adaptive local search algorithm is introduced to enhance the detecting power of the proposed method. SAMA is evaluated on a variety of simulated datasets and a real-world biological dataset, and a comparative study between it and the other four methods (FHSA-SED, AntEpiSeeker, IEACO, and DESeeker) that have been developed recently based on evolutionary algorithms is performed. The results of extensive experiments show that SAMA outperforms the other four compared methods in terms of detection power and running time.


Subject(s)
Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Algorithms , Computer Simulation , Epistasis, Genetic/genetics , Humans
12.
Genes (Basel) ; 10(2)2019 02 01.
Article in English | MEDLINE | ID: mdl-30717303

ABSTRACT

The epistatic interactions of single nucleotide polymorphisms (SNPs) are considered to be an important factor in determining the susceptibility of individuals to complex diseases. Although many methods have been proposed to detect such interactions, the development of detection algorithms is still ongoing due to the computational burden in large-scale association studies. In this paper, to deal with the intensive computing problem of detecting epistatic interactions in large-scale datasets, a self-adjusting ant colony optimization based on information entropy (IEACO) is proposed. The algorithm can automatically self-adjust the path selection strategy according to the real-time information entropy. The performance of IEACO is compared with that of ant colony optimization (ACO), AntEpiSeeker, AntMiner, and epiACO on a set of simulated datasets and a real genome-wide dataset. The results of extensive experiments show that the proposed method is superior to the other methods.


Subject(s)
Algorithms , Epistasis, Genetic , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Entropy , Humans , Macular Degeneration/genetics
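
Entry 12 above adjusts the ants' path selection according to real-time information entropy. The abstract does not say how the entropy is computed or how it changes the strategy, so the following is only a sketch of one plausible entropy signal: the Shannon entropy of the normalized pheromone vector, scaled to [0, 1]. The function name `pheromone_entropy` is an assumption for illustration.

```python
import math


def pheromone_entropy(pheromone):
    """Normalized Shannon entropy of a pheromone vector, in [0, 1].

    1.0 means the pheromone is spread uniformly (the colony is still
    exploring); values near 0 mean it has concentrated on a few candidates.
    """
    total = sum(pheromone)
    if total <= 0 or len(pheromone) < 2:
        raise ValueError("need at least two candidates with positive total pheromone")
    probs = [p / total for p in pheromone if p > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(pheromone))
```

An algorithm could, for instance, favor exploitation while this value is high and switch to more exploration once it drops, but that policy is a design choice not stated in the abstract.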
13.
Entropy (Basel) ; 21(8)2019 Aug 06.
Article in English | MEDLINE | ID: mdl-33267479

ABSTRACT

Solving the constraint satisfaction problem (CSP) is to find an assignment of values to variables that satisfies a set of constraints. Ant colony optimization (ACO) is an efficient algorithm for solving CSPs. However, the existing ACO-based algorithms suffer from the constructed assignment with high cost. To improve the solution quality of ACO for solving CSPs, an ant colony optimization based on information entropy (ACOE) is proposed in this paper. The proposed algorithm can automatically call a crossover-based local search according to real-time information entropy. We first describe ACOE for solving CSPs and show how it constructs assignments. Then, we use a ranking-based strategy to update the pheromone, which weights the pheromone according to the rank of these ants. Furthermore, we introduce the crossover-based local search that uses a crossover operation to optimize the current best assignment. Finally, we compare ACOE with seven algorithms on binary CSPs. The experimental results revealed that our method outperformed the other compared algorithms in terms of the cost comparison, data distribution, convergence performance, and hypothesis test.

14.
Comput Biol Chem ; 77: 354-362, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30466044

ABSTRACT

Single nucleotide polymorphisms (SNPs) are usually used as biomarkers for research and analysis of genome-wide association study (GWAS). Moreover, the epistatic interaction of SNPs is an important factor in determining the susceptibility of individuals to complex diseases. Nowadays, the detection of epistatic interactions not only attracts the attention of many researchers but also brings new challenges. It is of great significance to mine epistatic interactions from large-scale data despite the combinatorial explosion problem of loci. Hence, it is necessary to develop an efficient algorithm for solving the problem. In this article, a novel ant colony optimization based on automatic adjustment mechanism (AA-ACO) is proposed. The mechanism automatically adjusts the behaviour of artificial ants according to the real-time feedback information so that the algorithm can run at its best. This study also compares AA-ACO with ACO, AntEpiSeeker, AntMiner, MACOED and epiACO on a set of simulated data sets and a real genome-wide data set. As shown by the experimental results, the proposed algorithm is superior to the other algorithms.


Subject(s)
Algorithms , Epistasis, Genetic , Models, Genetic , Polymorphism, Single Nucleotide , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Probability
15.
Int J Mol Sci ; 19(4)2018 Apr 13.
Article in English | MEDLINE | ID: mdl-29652791

ABSTRACT

Protein–ligand docking is a process of searching for the optimal binding conformation between the receptor and the ligand. Automated docking plays an important role in drug design, and an efficient search algorithm is needed to tackle the docking problem. To tackle the protein–ligand docking problem more efficiently, an ABC_DE_based hybrid algorithm (ADHDOCK), integrating artificial bee colony (ABC) algorithm and differential evolution (DE) algorithm, is proposed in the article. ADHDOCK applies an adaptive population partition (APP) mechanism to reasonably allocate the computational resources of the population in each iteration process, which helps the novel method make better use of the advantages of ABC and DE. The experiment tested fifty protein–ligand docking problems to compare the performance of ADHDOCK, ABC, DE, Lamarckian genetic algorithm (LGA), running history information guided genetic algorithm (HIGA), and swarm optimization for highly flexible protein–ligand docking (SODOCK). The results clearly exhibit the capability of ADHDOCK toward finding the lowest energy and the smallest root-mean-square deviation (RMSD) on most of the protein–ligand docking problems with respect to the other five algorithms.


Subject(s)
Molecular Docking Simulation/methods , Proteins/chemistry , Algorithms , Crystallography, X-Ray , Drug Design , Ligands , Models, Molecular , Proteins/metabolism
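
Entry 15 above compares docking algorithms by the root-mean-square deviation (RMSD) of the predicted ligand pose. A minimal sketch of that measure follows, assuming both conformations list the same atoms in the same order and are expressed in the same coordinate frame (no superposition step is performed). The function name `rmsd` and the coordinates in the usage note are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("conformations must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For example, `rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 2, 0)])` moves one of two atoms by 2 units and yields the square root of 2. In docking benchmarks the reference is usually the crystallographic ligand pose, and a lower RMSD means the predicted pose is closer to it.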
16.
Molecules ; 22(12)2017 Dec 15.
Article in English | MEDLINE | ID: mdl-29244750

ABSTRACT

Protein-ligand docking is an essential part of computer-aided drug design, and it identifies the binding patterns of proteins and ligands by computer simulation. Though Lamarckian genetic algorithm (LGA) has demonstrated excellent performance in terms of protein-ligand docking problems, it cannot memorize the history information that it has accessed, rendering it effort-consuming to discover some promising solutions. This article illustrates a novel optimization algorithm (HIGA), which is based on LGA for solving the protein-ligand docking problems with an aim to overcome the drawback mentioned above. A running history information guided model, which includes CE crossover, ED mutation, and BSP tree, is applied in the method. The novel algorithm is more efficient at finding the lowest energy of protein-ligand docking. We evaluate the performance of HIGA in comparison with GA, LGA, EDGA, CEPGA, SODOCK, and ABC; the results indicate that HIGA outperforms the other search algorithms.


Subject(s)
Algorithms , Molecular Docking Simulation/methods , Proteins/chemistry , Binding Sites , Drug Design , Humans , Ligands , Molecular Structure , Protein Binding , Thermodynamics
17.
AMB Express ; 7(1): 174, 2017 Sep 13.
Article in English | MEDLINE | ID: mdl-28905320

ABSTRACT

Protein-ligand docking plays an important role in computer-aided pharmaceutical development. Protein-ligand docking can be defined as a search algorithm with a scoring function, whose aim is to determine the conformation of the ligand and the receptor with the lowest energy. Hence, developing an efficient algorithm has become a very significant challenge. In this paper, a novel search algorithm based on crossover elitist preservation mechanism (CEP) for solving protein-ligand docking problems is proposed. The proposed algorithm, namely genetic algorithm with crossover elitist preservation (CEPGA), employs the CEP to keep the elite individuals of the last generation and make the crossover more efficient and robust. The performance of CEPGA is tested on sixteen molecular docking complexes from RCSB protein data bank. In comparison with GA, LGA and SODOCK in terms of lowest energy and highest accuracy, the results indicate that the CEPGA is a reliable and successful method for protein-ligand docking problems.

18.
J Comput Biol ; 23(7): 585-96, 2016 07.
Article in English | MEDLINE | ID: mdl-26895461

ABSTRACT

Protein-ligand docking can be formulated as a search algorithm associated with an accurate scoring function. However, most current search algorithms cannot show good performance in docking problems, especially for highly flexible docking. To overcome this drawback, this article presents a novel and robust optimization algorithm (EDGA) based on the Lamarckian genetic algorithm (LGA) for solving flexible protein-ligand docking problems. This method applies a population evolution direction-guided model of genetics, in which search direction evolves to the optimum solution. The method is more efficient at finding the lowest energy of protein-ligand docking. We consider four search methods (a traditional genetic algorithm, LGA, SODOCK, and EDGA) and compare their performance on six protein-ligand docking problems. The results show that EDGA is the most stable, reliable, and successful.


Subject(s)
Molecular Docking Simulation/methods , Proteins/metabolism , Algorithms , Evolution, Molecular , Ligands , Proteins/chemistry