1.
J Taibah Univ Med Sci ; 18(4): 787-801, 2023 Aug.
Article in English | MEDLINE | ID: mdl-36618881

ABSTRACT

Objective: The coronavirus disease 2019 (COVID-19) health crisis that began at the end of 2019 made researchers around the world quickly race to find effective solutions. Related literature exploded and it was inevitable that an automated approach was needed to find useful information, namely text mining, to overcome COVID-19, especially in terms of drug candidate discovery. While text mining methods for finding drug candidates mostly try to extract bioentity associations from PubMed, very few of them mine with a clustering approach. The purpose of this study was to demonstrate the effectiveness of our approach to identify drugs for the prevention of COVID-19 through literature review, cluster analysis, drug docking calculations, and clinical trial data. Methods: This research was conducted in four main stages. First, the text mining stage was carried out by involving Bidirectional Encoder Representations from Transformers for Biomedical to obtain vector representation of each word in the sentence from texts. The next stage generated the disease-drug associations, which were obtained from the correlation between disease and drug. Next, the clustering stage grouped the rules through the similarity of diseases by utilizing Term Frequency-Inverse Document Frequency as its feature. Finally, the drug candidate extraction stage was processed through leveraging PubChem and DrugBank databases. We further used the drug docking package AUTODOCK VINA in PyRx software to verify the results. Results: Comparative analyses showed that the percentage of findings using mining with clustering outperformed mining without clustering in all experimental settings. In addition, we suggest that the top three drugs/phytochemicals by drug docking analysis may be effective in preventing COVID-19. 
Conclusions: The proposed method for text mining utilizing the clustering method is quite promising in the discovery of drug candidates for the prevention of COVID-19 through the biomedical literature.
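
The clustering stage in the abstract above groups items by disease similarity using Term Frequency-Inverse Document Frequency features. A minimal sketch of that idea follows, assuming one common TF-IDF variant (relative term frequency times the natural log of N/df) and cosine similarity. The weighting scheme, the function names, and the toy term lists are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical disease descriptions, already tokenized (toy data only).
docs = [
    ["viral", "pneumonia", "fever"],
    ["viral", "pneumonia", "cough"],
    ["cardiac", "arrhythmia", "fatigue"],
]
vecs = tfidf_vectors(docs)
similar = cosine(vecs[0], vecs[1])    # the two documents share two terms
unrelated = cosine(vecs[0], vecs[2])  # the two documents share no terms
```

A clustering algorithm would then group documents whose pairwise similarity is high; which algorithm the authors used is not stated in the abstract.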

2.
PeerJ ; 10: e13137, 2022.
Article in English | MEDLINE | ID: mdl-35529499

ABSTRACT

Molecular networks are built up from genetic elements that exhibit feedback interactions. Here, we studied the problem of measuring the similarity of directed networks by proposing a novel alignment-free approach: the network subgraph-based approach. Our approach does not make use of randomized networks to determine modular patterns embedded in a network, and this method differs from the network motif and graphlet methods. Network similarity was quantified by gauging the difference between the subgraph frequency distributions of two networks using Jensen-Shannon entropy. We applied the subgraph approach to study three types of molecular networks, i.e., cancer networks, signal transduction networks, and cellular process networks, which exhibit diverse molecular functions. We compared the performance of our subgraph detection algorithm with other algorithms, and the results were consistent, but other algorithms could not address the issue of subgraphs/motifs embedded within a subgraph/motif. To evaluate the effectiveness of the subgraph-based method, we applied the method along with the Jensen-Shannon entropy to classify six network models, and it achieves a 100% accuracy of classification. The proposed information-theoretic approach allows us to determine the structural similarity of two networks regardless of node identity and network size. We demonstrated the effectiveness of the subgraph approach to cluster molecular networks that exhibit similar regulatory interaction topologies. As an illustration, our method can identify (i) common subgraph-mediated signal transduction and/or cellular processes in AML and pancreatic cancer, and (ii) scaffold proteins in gastric cancer and hepatocellular carcinoma; thus, the results suggested that there are common regulation modules for cancer formation. 
We also found that the underlying substructures of the molecular networks are dominated by irreducible subgraphs; this feature is valid for the three classes of molecular networks we studied. The subgraph-based approach provides a systematic scenario for analyzing, comparing and classifying molecular networks with diverse functionalities.


Subject(s)
Algorithms , Neoplasms , Humans , Proteins/chemistry , Signal Transduction/physiology
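
The record above compares two networks through their subgraph frequency distributions. The sketch below shows how such a comparison might look, assuming the standard base-2 Jensen-Shannon divergence; the abstract speaks of "Jensen-Shannon entropy" and does not give the exact formula. The subgraph labels and counts are hypothetical toy values.

```python
import math
from collections import Counter

def jensen_shannon(p_counts, q_counts):
    """Base-2 Jensen-Shannon divergence between two count dictionaries."""
    keys = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    p = {k: p_counts.get(k, 0) / p_total for k in keys}
    q = {k: q_counts.get(k, 0) / q_total for k in keys}
    # Mixture distribution used as the common reference.
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a[k] == 0 contribute nothing.
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical subgraph-type counts for three networks (toy data only).
net_a = Counter({"feedback_loop": 6, "feed_forward": 3, "bifan": 1})
net_b = Counter({"feedback_loop": 5, "feed_forward": 4, "bifan": 1})
net_c = Counter({"cascade": 10})

d_ab = jensen_shannon(net_a, net_b)  # similar profiles, small divergence
d_ac = jensen_shannon(net_a, net_c)  # disjoint profiles, maximal divergence
```

Because the measure depends only on frequency distributions, it does not need matching node identities or equal network sizes, which is the property the abstract emphasizes.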
3.
PeerJ ; 8: e9556, 2020.
Article in English | MEDLINE | ID: mdl-33005483

ABSTRACT

Biological processes are based on molecular networks, which exhibit biological functions through interactions of genetic elements or proteins. This study presents a graph-based method to characterize molecular networks by decomposing the networks into directed multigraphs: network subgraphs. Spectral graph theory, reciprocity and complexity measures were used to quantify the network subgraphs. Graph energy, reciprocity and cyclomatic complexity can optimally specify network subgraphs with some degree of degeneracy. Seventy-one molecular networks were analyzed from three network types: cancer networks, signal transduction networks, and cellular processes. Molecular networks are built from a finite number of subgraph patterns and subgraphs with large graph energies are not present, which implies a graph energy cutoff. In addition, certain subgraph patterns are absent from the three network types. Thus, the Shannon entropy of the subgraph frequency distribution is not maximal. Furthermore, frequently-observed subgraphs are irreducible graphs. These novel findings warrant further investigation and may lead to important applications. Finally, we observed that cancer-related cellular processes are enriched with subgraph-associated driver genes. Our study provides a systematic approach for dissecting biological networks and supports the conclusion that there are organizational principles underlying molecular networks.
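
Two of the quantities named in the abstract above, reciprocity and the Shannon entropy of a subgraph frequency distribution, can be illustrated with a short sketch. It assumes the simplest definition of reciprocity (the fraction of directed edges whose reverse edge is also present); the paper may use a different variant. Graph energy and cyclomatic complexity are left out, and the edge list and counts are hypothetical.

```python
import math

def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse (v, u) is also present."""
    edge_set = {(u, v) for u, v in edges if u != v}  # ignore self-loops
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)

def shannon_entropy(counts):
    """Base-2 Shannon entropy of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Hypothetical 3-node directed subgraph: A<->B, B->C, C->A.
edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")]
r = reciprocity(edges)

# Hypothetical counts of four possible subgraph patterns; one never occurs.
observed = [12, 5, 3, 0]
entropy = shannon_entropy(observed)
max_entropy = math.log2(len(observed))  # reached only by a uniform distribution
```

An absent pattern, like the zero count here, is enough to keep the entropy below its maximum, which matches the observation reported in the abstract.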

5.
Comput Methods Programs Biomed ; 151: 159-170, 2017 11.
Article in English | MEDLINE | ID: mdl-28946998

ABSTRACT

BACKGROUND AND OBJECTIVE: Physiological signals such as electrocardiograms (ECG) and electromyograms (EMG) are widely used to diagnose diseases. Presently, the Internet offers numerous cloud storage services which enable digital physiological signals to be uploaded for convenient access and use. Numerous online databases of medical signals have been built. The data in them must be processed in a manner that preserves patients' confidentiality. METHODS: A reversible error-correcting-coding strategy is adopted to transform digital physiological signals into a new bit-stream, using a matrix in which the Hamming code is embedded to pass secret messages or private information. The shared keys are the matrix and the version of the Hamming code. RESULTS: An online open database, the MIT-BIH arrhythmia database, was used to test the proposed algorithms. The time complexity, capacity and robustness are evaluated. Comparisons with related work on several evaluation measures are also presented. CONCLUSIONS: This work proposes a reversible, low-payload steganographic scheme for preserving the privacy of physiological signals. An (n, m)-Hamming code is used to insert (n - m) secret bits into n bits of a cover signal. The number of embedded bits per modification is higher than in comparable methods, and the scheme is computationally efficient and secure. Unlike other Hamming-code based schemes, the proposed scheme is both reversible and blind.


Subject(s)
Computer Security , Confidentiality , Algorithms , Electrocardiography , Electronic Data Processing , Humans , Software
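
The record above inserts (n - m) secret bits into n cover bits with an (n, m)-Hamming code. The sketch below shows the classic, non-reversible form of Hamming matrix embedding for the (7, 4) code: three message bits are hidden in seven cover bits by flipping at most one bit. It is a textbook illustration under that assumption and not the authors' reversible scheme; the function names are hypothetical.

```python
from functools import reduce

def syndrome(bits):
    """Syndrome of a 7-bit block under the (7, 4) Hamming parity-check matrix
    whose i-th column (1-based) is the binary representation of i."""
    positions = (i for i, bit in enumerate(bits, start=1) if bit)
    return reduce(lambda acc, i: acc ^ i, positions, 0)

def embed(cover_bits, message):
    """Hide a 3-bit message (integer 0..7) by flipping at most one cover bit."""
    if len(cover_bits) != 7 or not 0 <= message < 8:
        raise ValueError("expected 7 cover bits and a message in 0..7")
    stego = list(cover_bits)
    diff = syndrome(stego) ^ message
    if diff:
        # Flipping position `diff` changes the syndrome by exactly `diff`.
        stego[diff - 1] ^= 1
    return stego

def extract(stego_bits):
    """Blind extraction: the receiver needs only the code, not the cover."""
    return syndrome(stego_bits)

cover = [1, 0, 1, 1, 0, 0, 1]  # e.g. least significant bits of 7 signal samples
stego = embed(cover, 5)
recovered = extract(stego)
changed = sum(a != b for a, b in zip(cover, stego))
```

In this basic form the original cover block cannot be restored after extraction; making the embedding reversible is the contribution the abstract claims, and that step is not reproduced here.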
6.
J Bioinform Comput Biol ; 15(1): 1650043, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28150521

ABSTRACT

Drug repurposing is a new method for disease treatments, which accelerates the identification of new uses for existing drugs with minimal side effects for patients. MicroRNA-based therapeutics are a class of drugs that have been used in gene therapy following the FDA's approval of the first anti-sense therapy. This study examines the effects of oxLDL on vascular smooth muscle cells (VSMCs) and identifies potential drugs and antimiRs for treating VSMC-associated diseases. The Connectivity Map (cMap) database is utilized to identify potential new uses of existing drugs. The success of the identifications was supported by MTT assay, clonogenic assay and clinical trial data. Specifically, 37 drugs, some of which are undergoing clinical trials, were identified. Three of the identified drugs exhibit IC50 activities. Among the 37 drugs' targets, three differentially expressed genes (DEGs) are identified as drug targets by using both the DrugBank and the NCBI PubChem Compound databases. In addition, one DEG, DNMT1, was found to be regulated by 17 miRNAs, which are potential targets for developing antimiR-based miRNA therapy.


Subject(s)
Drug Repositioning/methods , Gene Expression Regulation/drug effects , Lipoproteins, LDL/genetics , MicroRNAs , Cardiovascular Diseases/drug therapy , Cardiovascular Diseases/pathology , Cluster Analysis , Drug Discovery , Gene Ontology , Humans , Molecular Targeted Therapy , Muscle, Smooth, Vascular/cytology
7.
PeerJ ; 4: e2478, 2016.
Article in English | MEDLINE | ID: mdl-27703845

ABSTRACT

BACKGROUND: Abnormal proliferation of vascular smooth muscle cells (VSMC) is a major cause of cardiovascular diseases (CVDs). Many studies suggest that vascular injury triggers VSMC dedifferentiation, which results in VSMC changes from a contractile to a synthetic phenotype; however, the underlying molecular mechanisms are still unclear. METHODS: In this study, we examined how VSMC responds under mechanical stress by using time-course microarray data. A three-phase study was proposed to investigate the stress-induced differentially expressed genes (DEGs) in VSMC. First, DEGs were identified by using the moderated t-statistics test. Second, more DEGs were inferred by using the Gaussian Graphical Model (GGM). Finally, the topological parameters-based method and cluster analysis approach were employed to predict the last batch of DEGs. To identify the potential drugs for vascular diseases involving VSMC proliferation, the drug-gene interaction database, Connectivity Map (cMap), was employed. Success of the predictions was determined using in-vitro data, i.e. MTT and clonogenic assays. RESULTS: Based on the differential expression calculation, at least 23 DEGs were found, and the findings were qualified by previous studies on VSMC. The results of gene set enrichment analysis indicated that the most often found enriched biological processes are cell-cycle-related processes. Furthermore, more stress-induced genes, well supported by literature, were found by applying graph theory to the gene association network (GAN). Finally, we showed that by processing the cMap input queries with a cluster algorithm, we achieved a substantial increase in the number of potential drugs with experimental IC50 measurements. With this novel approach, we have not only successfully identified the DEGs, but also improved the DEG prediction by performing the topological and cluster analysis. Moreover, the findings are remarkably validated and in line with the literature. 
Furthermore, the cMap and DrugBank resources were used to identify potential drugs and targeted genes for vascular diseases involving VSMC proliferation. Our findings are supported by in-vitro experimental IC50, binding activity data and clinical trials. CONCLUSION: This study provides a systematic strategy to discover potential drugs and target genes, by which we hope to shed light on the treatments of VSMC proliferation associated diseases.

8.
Comput Biol Chem ; 65: 154-164, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27746113

ABSTRACT

Epigenetic regulation has been linked to the initiation and progression of cancer. Aberrant expression of microRNAs (miRNAs) is one such mechanism that can activate or silence oncogenes (OCGs) and tumor suppressor genes (TSGs) in cells. A growing number of studies suggest that miRNA expression can be regulated by methylation modification, thus triggering cancer development. However, there is no comprehensive in silico study concerning miRNA regulation by direct DNA methylation in cancer. Ovarian serous cystadenocarcinoma (OSC) was therefore chosen as a tumor model for the present work. Twelve batches of OSC data, with at least 35 patient samples in each batch, were obtained from The Cancer Genome Atlas (TCGA) database. The Spearman rank correlation coefficient (SRCC) was used to quantify the correlation between the CpG DNA methylation level and miRNA expression level. Meta-analysis was performed to reduce the effects of biological heterogeneity among different batches. MiRNA-target interactions were also inferred by computing SRCC and meta-analysis to assess the correlation between miRNA expression and cancer-associated gene expression, and the interactions were further validated by a query against the miRTarBase database. A total of 26 potential epigenetic-regulated miRNA genes that can target OCGs or TSGs in OSC were found to show biological relevance between DNA methylation and miRNA gene expression. Furthermore, some of the identified DNA-methylated miRNA genes, for instance the miR-200 family, were previously identified as epigenetic-regulated miRNAs and correlated with poor survival of ovarian cancer. We also found that several miRNA target genes, BTG3, NDN, HTRA3, CDC25A, and HMGA2, were also related to the poor outcomes in ovarian cancer. The present study proposed a systematic strategy to construct highly confident epigenetic-regulated miRNA pathways for OSC. The findings are validated and are in line with the literature. 
The inclusion of direct DNA methylated miRNA events may offer another layer of explanation that along with genetics can give a better understanding of the carcinogenesis process.


Subject(s)
Cystadenocarcinoma, Serous/metabolism , DNA Methylation , MicroRNAs/metabolism , Ovarian Neoplasms/metabolism , Cystadenocarcinoma, Serous/pathology , Female , Humans , Ovarian Neoplasms/pathology
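
The record above uses the Spearman rank correlation coefficient (SRCC) to relate CpG methylation levels to miRNA expression. A minimal sketch of the coefficient follows, computed as the Pearson correlation of average ranks so that ties are handled. The sample values are hypothetical, and neither the meta-analysis step nor any significance testing is shown.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical per-patient values for one CpG site and one miRNA (toy data only).
methylation = [0.10, 0.25, 0.40, 0.55, 0.80]
expression = [9.1, 8.7, 6.2, 5.9, 3.0]
rho = spearman(methylation, expression)  # strictly decreasing relationship
```

A strongly negative coefficient, as in this toy case, is the pattern expected when methylation silences a miRNA gene; in practice each batch would be scored separately before being combined.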
9.
IET Syst Biol ; 10(2): 64-75, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26997661

ABSTRACT

Protein complexes play an essential role in many biological processes. Complexes can interact with other complexes to form a protein complex interaction network (PCIN) that is involved in important cellular processes. There are relatively few studies examining the interaction topology among protein complexes, and little is known about the stability of PCINs under perturbations. We employed a graph theoretical approach to reveal hidden properties and features of the PCINs of four species. Two main issues are addressed: (i) the global and local network topological properties, and (ii) the stability of the networks under 12 types of perturbations. According to the topological parameter classification, we identified some critical protein complexes and validated that the topological analysis approach could provide meaningful biological interpretations of the protein complex systems. Through the Kolmogorov-Smirnov test, we showed that local topological parameters are good indicators to characterise the structure of PCINs. We further demonstrated the effectiveness of the current approach by performing the scalability and data normalization tests. To measure the robustness of PCINs, we proposed to consider eight topological-based perturbations, which are specifically applicable in scenarios of targeted, sustained attacks. We found that the degree-based, betweenness-based and brokering-coefficient-based perturbations have the largest effect on network stability.


Subject(s)
Adaptation, Physiological/physiology , Models, Biological , Models, Statistical , Protein Interaction Mapping/methods , Proteome/metabolism , Signal Transduction/physiology , Algorithms , Animals , Computer Simulation , Humans
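
The degree-based perturbation mentioned in the record above can be illustrated with a generic targeted-attack sketch: remove the highest-degree node and see how the largest connected component shrinks. The size of the largest component is used here as an assumed stability measure; the abstract does not say which measure the authors used. The toy network is hypothetical.

```python
from collections import deque

def largest_component_size(adj):
    """Size of the largest connected component of an undirected graph."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best

def remove_highest_degree(adj):
    """Return a copy of the graph without its highest-degree node."""
    if not adj:
        return {}
    target = max(adj, key=lambda node: len(adj[node]))
    return {node: nbrs - {target} for node, nbrs in adj.items() if node != target}

# Hypothetical complex-complex network as an adjacency dict of sets.
network = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"hub", "c"},
}
before = largest_component_size(network)
after = largest_component_size(remove_highest_degree(network))
```

Repeating the removal step and tracking the component size gives a robustness curve; betweenness-based attacks would follow the same loop with a different node-ranking function.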
10.
BMC Bioinformatics ; 17 Suppl 1: 2, 2016 Jan 11.
Article in English | MEDLINE | ID: mdl-26817825

ABSTRACT

BACKGROUND: Non-small cell lung cancer (NSCLC) is one of the leading causes of death globally, and research into NSCLC has been accumulating steadily over several years. Drug repositioning is the current trend in the pharmaceutical industry for identifying potential new uses for existing drugs and accelerating the development process of drugs, as well as reducing side effects. RESULTS: This work integrates two approaches--machine learning algorithms and topological parameter-based classification--to develop a novel pipeline of drug repositioning to analyze four lung cancer microarray datasets, enriched biological processes, potential therapeutic drugs and targeted genes for NSCLC treatments. A total of 7 (8) and 11 (12) promising drugs (targeted genes) were discovered for treating early- and late-stage NSCLC, respectively. The effectiveness of these drugs is supported by the literature, experimentally determined in-vitro IC50 and clinical trials. This work provides better drug prediction accuracy than competitive research according to IC50 measurements. CONCLUSIONS: With the novel pipeline of drug repositioning, the discovery of enriched pathways and potential drugs related to NSCLC can provide insight into the key regulators of tumorigenesis and the treatment of NSCLC. Based on the verified effectiveness of the targeted drugs predicted by this pipeline, we suggest that our drug-finding pipeline is effective for repositioning drugs.


Subject(s)
Algorithms , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Drug Repositioning , Machine Learning , Models, Theoretical , Neoplasm Proteins/genetics , Antineoplastic Agents/therapeutic use , Carcinoma, Non-Small-Cell Lung/pathology , Drug Discovery , Gene Expression Regulation, Neoplastic/drug effects , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Microarray Analysis , Signal Transduction
11.
Article in English | MEDLINE | ID: mdl-26384373

ABSTRACT

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain-domain interactions, protein-protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes are included. FARE-CAFE will substantially facilitate the cancer biologist's mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop 'novel' therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE.


Subject(s)
DNA, Neoplasm , Databases, Genetic , Neoplasm Proteins , Neoplasms , Response Elements , Animals , DNA, Neoplasm/genetics , DNA, Neoplasm/metabolism , Humans , Neoplasm Proteins/genetics , Neoplasm Proteins/metabolism , Neoplasms/genetics , Neoplasms/metabolism
12.
Biomed Res Int ; 2015: 312047, 2015.
Article in English | MEDLINE | ID: mdl-25866773

ABSTRACT

Many proteins are known to be associated with cancer diseases. Often, their precise functional role in disease pathogenesis remains unclear. A strategy to gain a better understanding of the function of these proteins is to make use of a combination of different aspects of proteomics data types. In this study, we extended Aragues's method by employing the protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, weighted domain frequency score (DFS), and cancer linker degree (CLD) data to predict cancer proteins. Performances were benchmarked based on three kinds of experiments as follows: (I) using individual algorithms, (II) combining algorithms, and (III) combining the same classification types of algorithms. When compared with Aragues's method, our proposed methods, that is, machine learning algorithm and voting with the majority, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm can achieve a hit ratio of 89.4% and 72.8% for lung cancer dataset and lung cancer microarray study, respectively. It is anticipated that the current research could help understand disease mechanisms and diagnosis.


Subject(s)
Algorithms , Databases, Protein , Machine Learning , Neoplasm Proteins/genetics , Neoplasms/genetics , Animals , Humans , Protein Structure, Tertiary , Sequence Analysis, Protein
13.
BMC Syst Biol ; 9 Suppl 1: S5, 2015.
Article in English | MEDLINE | ID: mdl-25707690

ABSTRACT

BACKGROUND: Molecular networks are the basis of biological processes. Such networks can be decomposed into smaller modules, also known as network motifs. These motifs show interesting dynamical behaviors, in which co-operativity effects between the motif components play a critical role in human diseases. We have developed a motif-searching algorithm, which is able to identify common motif types from the cancer networks and signal transduction networks (STNs). Some of the network motifs are interconnected and can be merged together to form more complex structures, the so-called coupled motif structures (CMS). These structures exhibit mixed dynamical behavior, which may lead biological organisms to perform specific functions. RESULTS: In this study, we integrate transcription factors (TFs), microRNAs (miRNAs), miRNA targets and network motifs information to build the cancer-related TF-miRNA-motif networks (TMMN). This allows us to examine the role of network motifs in cancer formation at different levels of regulation, i.e. transcription initiation (TF → miRNA), gene-gene interaction (CMS), and post-transcriptional regulation (miRNA → target genes). Among the cancer networks and STNs we considered, it is found that there is a substantial amount of crosstalk through motif interconnections, in particular, the crosstalk between prostate cancer network and PI3K-Akt STN. To validate the role of network motifs in cancer formation, several examples are presented which demonstrate the effectiveness of the present approach. A web-based platform has been set up which can be accessed at: http://ppi.bioinfo.asia.edu.tw/pathway/. It is very likely that our results can supply very specific CMS information that is missing for certain cancer types, and the platform is an indispensable tool for cancer biology research.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Computational Biology , Gene Regulatory Networks , MicroRNAs/genetics , Signal Transduction , Transcription Factors/metabolism , Breast Neoplasms/metabolism , Humans
14.
Biomed Res Int ; 2014: 193817, 2014.
Article in English | MEDLINE | ID: mdl-25210704

ABSTRACT

Drug repositioning is a popular approach in the pharmaceutical industry for identifying potential new uses for existing drugs and accelerating the development time. Non-small-cell lung cancer (NSCLC) is one of the leading causes of death worldwide. To reduce the biological heterogeneity effects among different individuals, both normal and cancer tissues were taken from the same patient, hence allowing pairwise testing. By comparing early- and late-stage cancer patients, we can identify stage-specific NSCLC genes. Differentially expressed genes are clustered separately to form up- and downregulated communities that are used as queries to perform enrichment analysis. The results suggest that pathways for early- and late-stage cancers are different. Sets of up- and downregulated genes were submitted to the cMap web resource to identify potential drugs. To achieve high confidence drug prediction, multiple microarray experimental results were merged by performing meta-analysis. The results of a few drug findings are supported by MTT assay or clonogenic assay data. In conclusion, we have been able to assess the potential existing drugs to identify novel anticancer drugs, which may be helpful in drug repositioning discovery for NSCLC.


Subject(s)
Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Drug Discovery , Drug Repositioning , Antineoplastic Agents/therapeutic use , Carcinoma, Non-Small-Cell Lung/pathology , Cell Survival/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Humans , Microarray Analysis , Neoplasm Proteins/biosynthesis , Neoplasm Staging , Signal Transduction/drug effects
15.
IET Syst Biol ; 8(2): 56-66, 2014 Apr.
Article in English | MEDLINE | ID: mdl-25014226

ABSTRACT

Lung cancer is one of the leading causes of death in both the USA and Taiwan, and it is thought that the cause of cancer could be because of the gain of function of an oncoprotein or the loss of function of a tumour suppressor protein. Consequently, these proteins are potential targets for drugs. In this study, differentially expressed genes are identified, via an expression dataset generated from lung adenocarcinoma tumour and adjacent non-tumour tissues. This study has integrated many complementary resources, that is, microarray, protein-protein interaction and protein complex. After constructing the lung cancer protein-protein interaction network (PPIN), the authors performed graph theory analysis of PPIN. Highly dense modules are identified, which are potential cancer-associated protein complexes. Up- and down-regulated communities were used as queries to perform functional enrichment analysis. Enriched biological processes and pathways are determined. These sets of up- and down-regulated genes were submitted to the Connectivity Map web resource to identify potential drugs. The authors' findings suggested that eight drugs from DrugBank and three drugs from NCBI can potentially reverse certain up- and down-regulated genes' expression. In conclusion, this study provides a systematic strategy to discover potential drugs and target genes for lung cancer.


Subject(s)
Carcinoma, Non-Small-Cell Lung/drug therapy , Computational Biology/methods , Lung Neoplasms/drug therapy , Oligonucleotide Array Sequence Analysis/methods , Adult , Aged , Aged, 80 and over , Antineoplastic Agents/chemistry , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/metabolism , Cell Line, Tumor , Cell Survival , Cluster Analysis , Computer Simulation , Drug Discovery , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Lung Neoplasms/genetics , Middle Aged , Signal Transduction , Technology, Pharmaceutical
16.
Amino Acids ; 46(4): 953-61, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24385242

ABSTRACT

Plants are continuously subjected to infection by pathogens, including bacteria and viruses. Bacteria can inject a variety of effector proteins into the host to reprogram host defense mechanisms. It is known that microRNAs participate in plant disease resistance to bacterial pathogens, and previous studies have suggested that some bacterial effectors have evolved to disturb the host's microRNA-regulated pathways, thereby enabling infection. In this study, the inter-species interaction between a Xanthomonas campestris pv campestris (Xcc) pathogen effector and Arabidopsis thaliana microRNA transcription promoter was investigated using three methods: (1) interolog, (2) alignment based on a transcription factor binding site profile matrix, and (3) the web-based binding site prediction tool, PATSER. Furthermore, we integrated another two data sets from our previous study into the present web-based system. These are (1) microRNA target genes and their downstream effects mediated by protein-protein interaction (PPI), and (2) the Xcc-Arabidopsis PPI information. This present work is probably the first comprehensive study of constructing pathways that comprise effector, microRNA, target genes and PPI for the study of pathogen-host interactions. It is expected that this study may help to elucidate the role of pathogen-host interplay in a plant's immune system. The database is freely accessible at: http://ppi.bioinfo.asia.edu.tw/EDMRP .


Subject(s)
Arabidopsis/genetics , Arabidopsis/microbiology , Bacterial Proteins/metabolism , MicroRNAs/genetics , Plant Diseases/microbiology , RNA, Plant/genetics , Xanthomonas campestris/metabolism , Arabidopsis/metabolism , Binding Sites , MicroRNAs/metabolism , Plant Diseases/genetics , Promoter Regions, Genetic , Protein Binding , RNA, Plant/metabolism
17.
Comput Biol Med ; 43(11): 1645-52, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24209909

ABSTRACT

MicroRNAs are small, endogenous RNAs found in many different species and are known to have an influence on diverse biological phenomena. They also play crucial roles in plant biological processes, such as metabolism, leaf sidedness and flower development. However, the functional roles of most microRNAs are still unknown. The identification of closely related microRNAs and target genes can be an essential first step towards the discovery of their combinatorial effects on different cellular states. Many studies have tried to discover microRNA and target gene interactions by implementing machine learning classifiers with target prediction algorithms. However, high rates of false positives have been reported as a result of undetermined factors that affect recognition. Therefore, integrating diverse techniques could improve the prediction. In this paper we propose identifying microRNA targets in Arabidopsis thaliana by integrating prediction scores from the PITA, miRanda and RNAHybrid algorithms as a feature vector of microRNA-target interactions, and then implementing SVM, random forest tree and neural network machine learning algorithms to make final predictions by majority voting. Furthermore, microRNA target genes are linked with their protein-protein interaction (PPI) partners. We focus on plant resistance genes and transcription factor information to provide new insights into plant pathogen interaction networks. Downstream pathways are characterized by the Jaccard coefficient, which is implemented based on Gene Ontology. The database is freely accessible at http://ppi.bioinfo.asia.edu.tw/At_miRNA/.


Subject(s)
Arabidopsis Proteins/genetics , Computational Biology/methods , MicroRNAs/genetics , Models, Statistical , Protein Interaction Maps/genetics , Support Vector Machine , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis/physiology , Arabidopsis Proteins/metabolism , Decision Trees , MicroRNAs/metabolism
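
Two steps named in the record above are easy to show in isolation: majority voting over the three classifiers' outputs, and the Jaccard coefficient between two sets of Gene Ontology terms. The sketch assumes each classifier returns a boolean call. The classifier outputs and the term labels are hypothetical placeholders, not real predictions or GO identifiers.

```python
def majority_vote(votes):
    """True when more than half of the classifiers predict an interaction."""
    return sum(votes) > len(votes) / 2

def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two collections of terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical calls from the three classifiers for one microRNA-target pair.
predictions = {"svm": True, "random_forest": False, "neural_network": True}
is_target = majority_vote(list(predictions.values()))

# Hypothetical annotation sets for two downstream pathways (placeholder labels).
pathway_1 = {"term_1", "term_2", "term_3"}
pathway_2 = {"term_2", "term_3", "term_4"}
similarity = jaccard(pathway_1, pathway_2)
```

With an odd number of classifiers a strict majority always exists, so no tie-breaking rule is needed in the three-classifier setting described in the abstract.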
18.
Comput Biol Med ; 43(9): 1196-204, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23930814

ABSTRACT

Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain.


Subject(s)
Saccharomyces cerevisiae Proteins/classification , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Sequence Analysis, Protein/methods , Humans , Predictive Value of Tests
19.
Comput Biol Med ; 40(3): 300-5, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20089249

ABSTRACT

A protein function pair approach, based on protein-protein interaction (PPI) data, is proposed to predict protein functions. Randomization tests are performed on the PPI dataset, yielding a protein function correlation score that is used to rank the relative importance of a function pair. It has been found that certain classes of protein functions tend to be correlated. The scores of these correlated pairs allow us to predict the functionality of a protein given that it interacts with proteins having well-defined function annotations. The jackknife test is used to validate the function pair method. The protein function pair approach achieves a prediction sensitivity comparable to that of approaches using more sophisticated methods. The main advantages of this approach are as follows: (i) a set of function-function correlation relations is derived and intuitive biological interpretation can be achieved, and (ii) it is simple, requiring only two parameters.


Subject(s)
Proteins/metabolism , Protein Binding
20.
Comput Biol Chem ; 32(2): 81-7, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18082454

ABSTRACT

The domain combination pair approach is employed to derive putative protein domain-domain interactions (DDI) from the protein-protein interactions (PPI) database DIP. The results of putative DDI are computed for seven species. To determine the prediction performance, putative DDI results are compared with those of the database InterDom, where an average matching ratio of about 76% can be achieved. Several real PPI pathways are reconstructed based on the predicted DDI results. It is found that the pathways could be reconstructed with reasonable accuracy. Furthermore, a novel quantity, the so-called AP-order index, is introduced to predict the regulatory order for six PPI pathways. It is found that the AP-order index is a very reliable parameter to determine the regulatory order of PPI.


Subject(s)
Computational Biology/methods , Databases, Protein , Models, Biological , Protein Interaction Domains and Motifs/physiology , Protein Interaction Mapping/methods , Signal Transduction/physiology , Animals , Computational Biology/statistics & numerical data , Predictive Value of Tests , Protein Interaction Mapping/statistics & numerical data , Species Specificity