
Sriwastava, Brijesh K; Halder, Anup Kumar; Basu, Subhadip; Chakraborti, Tapabrata.

BMC Bioinformatics ; 24(1): 435, 2023 Nov 16.

Article in English | MEDLINE | ID: mdl-37974081

ABSTRACT

Biclustering of biologically meaningful binary information is essential in many applications related to drug discovery, such as protein-protein interactions and gene expression. However, for robust performance on recently emerging large health datasets, new biclustering algorithms must be scalable and fast. We present a rapid unsupervised biclustering (RUBic) algorithm that achieves this objective with a novel encoding and search strategy. RUBic significantly reduces computational overhead on both synthetic and experimental datasets relative to several state-of-the-art biclustering algorithms. On 100 synthetic binary datasets, our method took [Formula: see text] s to extract 494,872 biclusters. On the human PPI database of size [Formula: see text], our method generates 1840 biclusters in [Formula: see text] s. On a central nervous system embryonic tumor gene expression dataset of size 712,940, our algorithm takes 101 min to produce 747,069 biclusters, while recent competing algorithms take significantly more time to produce the same result. RUBic is also evaluated on five different gene expression datasets and shows a significant speed-up in execution time over existing approaches when extracting significant KEGG-enriched biclusters. RUBic can operate in two modes, base and flex: base mode generates maximal biclusters, whereas flex mode generates fewer biclusters, more quickly, based on their biological significance with respect to KEGG pathways. The code is available at https://github.com/CMATERJU-BIOINFO/RUBic for academic use only.
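To make the terms in this abstract concrete, here is a minimal Python sketch of what a bicluster and a maximal bicluster are in a binary matrix. It assumes the definition commonly used in binary biclustering (a set of rows and columns whose intersection contains only 1s, with "maximal" meaning that no further row or column can be added). It does not reproduce RUBic's encoding or search strategy, which the abstract does not describe; the function names `is_bicluster` and `is_maximal` are hypothetical.

```python
def is_bicluster(matrix, rows, cols):
    """True if every selected (row, column) cell of the binary matrix is 1."""
    return all(matrix[r][c] == 1 for r in rows for c in cols)


def is_maximal(matrix, rows, cols):
    """True if (rows, cols) is an all-ones bicluster that cannot be extended.

    Assumed definition: the bicluster is maximal when no other row is all 1s
    over the chosen columns and no other column is all 1s over the chosen rows.
    """
    if not is_bicluster(matrix, rows, cols):
        return False
    row_set, col_set = set(rows), set(cols)
    n_cols = len(matrix[0]) if matrix else 0
    # Try to extend by one row.
    for r in range(len(matrix)):
        if r not in row_set and all(matrix[r][c] == 1 for c in cols):
            return False
    # Try to extend by one column.
    for c in range(n_cols):
        if c not in col_set and all(matrix[r][c] == 1 for r in rows):
            return False
    return True


# Toy binary matrix (e.g. rows = genes, columns = conditions); illustrative only.
toy = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]
print(is_maximal(toy, [0, 1], [0, 1]))
print(is_maximal(toy, [0], [0, 1]))
```

A brute-force check like this costs time proportional to the matrix size per candidate, which is why enumerating all maximal biclusters on large datasets calls for a dedicated search strategy.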

Subject(s)

Algorithms; Data Management; Humans; Databases, Factual; Cluster Analysis; Gene Expression Profiling/methods

Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal.

IEEE/ACM Trans Comput Biol Bioinform ; 12(6): 1394-404, 2015.

Article in English | MEDLINE | ID: mdl-26684462

ABSTRACT

Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can be further improved by using a custom-designed fuzzy membership function for the partner-specific PPI interface prediction problem. We evaluated the performance of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three model proteomes, Homo sapiens, Escherichia coli and Saccharomyces cerevisiae, and calculated the statistical significance of the improvement of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids, together with their physico-chemical characteristics, is used, and the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in a 10-fold cross-validation experiment is 77.07, 78.39, and 74.91 percent for the aforementioned organisms respectively. Performance on independent test sets is 72.09, 73.24 and 82.74 percent respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
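The performance figures above are areas under the ROC curve. As a rough illustration of that metric only, the following Python sketch computes the area using its rank-based interpretation: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are not taken from the paper or its software.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = interface residue, assumed).
    scores: iterable of classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(f"AUC = {auc:.2f} ({100 * auc:.2f} percent)")
```

Multiplying the result by 100 gives the percent form used in this abstract.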

Subject(s)

Algorithms; Fuzzy Logic; Models, Chemical; Pattern Recognition, Automated/methods; Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Binding Sites; Computer Simulation; Humans; Models, Statistical; Molecular Sequence Data; Protein Binding; Sequence Alignment/methods

PPIcons: identification of protein-protein interaction sites in selected organisms.

Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal; Plewczynski, Dariusz.

J Mol Model ; 19(9): 4059-70, 2013 Sep.

Article in English | MEDLINE | ID: mdl-23729008

ABSTRACT

The physico-chemical properties of interaction interfaces play a crucial role in the characterization of protein-protein interactions (PPI). In silico prediction of participating amino acids helps to identify interface residues for further experimental verification using mutational analysis, or inhibition studies that screen a library of ligands against a given protein. Given the unbound structure of a protein and the fact that it forms a complex with another known protein, the objective of this work is to identify the residues that are involved in the interaction. We attempt to predict interaction sites in protein complexes using the local composition of amino acids together with their physico-chemical characteristics. The local sequence segments (LSS) are dissected from the protein sequences using a sliding window of 21 amino acids. The list of LSSs is passed to the support vector machine (SVM) predictor, which identifies interacting residue pairs considering their inter-atom distances. We have analyzed three model organisms, Escherichia coli, Saccharomyces cerevisiae and Homo sapiens, for which the numbers of hetero-complexes considered are 40, 123 and 33 respectively. Moreover, a unified multi-organism PPI meta-predictor is also developed in the current work by combining the training databases of the above organisms. The performance of the PPIcons interface residue prediction method, measured by the area under the ROC curve (AUC), is 0.82, 0.75, 0.72 and 0.76 for the aforementioned organisms and the meta-predictor respectively.
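The sliding-window step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each window of 21 residues is associated with its central residue, and positions near the termini that cannot fill a complete window are skipped. The abstract does not say how PPIcons handles the termini, and the function name `local_sequence_segments` and the example sequence are hypothetical.

```python
def local_sequence_segments(sequence, window=21):
    """Return (center_index, segment) pairs for every full window.

    Assumption: windows are centred on a residue and incomplete windows at
    the sequence ends are dropped (no padding).
    """
    if window <= 0 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    segments = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        segments.append((start + half, segment))  # 0-based index of central residue
    return segments


# Made-up 25-residue sequence; yields 5 segments of 21 residues each.
example = "MKTAYIAKQRQISFVKSHFSRQLEE"
for center, segment in local_sequence_segments(example):
    print(center, segment)
```

Each segment would then be turned into features (for example amino acid composition and physico-chemical descriptors) before being passed to a classifier; that feature encoding is not specified in the abstract and is left out here.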

Subject(s)

Computational Biology/methods; Models, Molecular; Protein Interaction Mapping; Protein Interaction Maps; Software; Amino Acids/chemistry; Binding Sites; Databases, Protein; Humans; Internet; Protein Interaction Mapping/methods; Reproducibility of Results; Support Vector Machine
