Search | VHL Regional Portal

1.

Machine Learning Heuristics on Gingivobuccal Cancer Gene Datasets Reveals Key Candidate Attributes for Prognosis.

Singh, Tanvi; Malik, Girik; Someshwar, Saloni; Le, Hien Thi Thu; Polavarapu, Rathnagiri; Chavali, Laxmi N; Melethadathil, Nidheesh; Sundararajan, Vijayaraghava Seshadri; Valadi, Jayaraman; Kavi Kishor, P B; Suravajhala, Prashanth.

Genes (Basel) ; 13(12), 2022 Dec 16.

Article in English | MEDLINE | ID: mdl-36553647

ABSTRACT

Delayed cancer detection is one of the common causes of poor prognosis in many cancers, including cancers of the oral cavity. Despite the improvement and development of new and efficient gene therapy treatments, very little has been carried out to algorithmically assess the impedance of these carcinomas. In this work, from attributes of NCBI's oral cancer datasets, viz. (i) name, (ii) gene(s), (iii) protein change, (iv) condition(s), and (v) clinical significance (last reviewed), we sought to train on the number of instances emerging from them. Further, we attempt to annotate viable attributes in oral cancer gene datasets for the identification of gingivobuccal cancer (GBC). We further apply supervised and unsupervised machine learning methods to the gene datasets, revealing key candidate attributes for GBC prognosis. Our work highlights the importance of automated identification of key genes responsible for GBC, an approach that could perhaps be easily replicated in the detection of other forms of oral cancer.

Subject(s)

Heuristics , Mouth Neoplasms , Humans , Machine Learning , Prognosis , Oncogenes , Mouth Neoplasms/diagnosis , Mouth Neoplasms/genetics

2.

Editorial: Integrated systems genomic approaches for characterizing uncharacterized proteins.

Valadi, Jayaraman; Sundararajan, Vijayaraghava Seshadri; Bandapalli, Obul Reddy; Benso, Alfredo; Suravajhala, Prashanth.

Front Genet ; 13: 1000825, 2022.

Article in English | MEDLINE | ID: mdl-36176288

3.

Hypothetical Proteins as Predecessors of Long Non-coding RNAs.

Malik, Girik; Agarwal, Tanu; Raj, Utkarsh; Sundararajan, Vijayaraghava Seshadri; Bandapalli, Obul Reddy; Suravajhala, Prashanth.

Curr Genomics ; 21(7): 531-535, 2020 Nov.

Article in English | MEDLINE | ID: mdl-33214769

ABSTRACT

Hypothetical Proteins [HP] are transcripts predicted to be expressed in an organism, but for which no evidence exists in gene banks. On the other hand, long non-coding RNAs [lncRNAs] are transcripts longer than 200 bases that might be present in the 5' UTR or intergenic regions of genes. With the known unknown [KU] regions in the genomes rapidly accumulating in gene banks, there is a need to understand the role of open reading frames in the context of annotation. In this commentary, we emphasize that HPs could indeed be the predecessors of lncRNAs.

4.

A model to predict the function of hypothetical proteins through a nine-point classification scoring schema.

Ijaq, Johny; Malik, Girik; Kumar, Anuj; Das, Partha Sarathi; Meena, Narendra; Bethi, Neeraja; Sundararajan, Vijayaraghava Seshadri; Suravajhala, Prashanth.

BMC Bioinformatics ; 20(1): 14, 2019 Jan 08.

Article in English | MEDLINE | ID: mdl-30621574

ABSTRACT

BACKGROUND: Hypothetical proteins [HP] are those that are predicted to be expressed in an organism, but for which no evidence of their existence is known. In the recent past, annotation and curation efforts have helped overcome the challenge of understanding their diverse functions. Researchers have developed techniques to decipher the sequence-structure-function relationship, especially in terms of functional modelling of the HPs, but using the features as classifiers for HPs has not been attempted. With the rise in the number of annotation strategies, next-generation sequencing methods have furthered our understanding of the functions of HPs. RESULTS: In our previous work, we developed a six-point classification scoring schema with annotation pertaining to protein family scores, orthology, protein interaction/association studies, bidirectional best BLAST hits, sorting signals, and known databases and visualizers, which were used to validate protein interactions. In this study, we introduced three more classifiers to our annotation system, viz. pseudogenes linked to HPs, homology modelling, and non-coding RNAs associated with HPs. We discuss the challenges and performance of these classifiers using machine learning heuristics, with accuracy improving for Perceptron (81.08% to 97.67%), Naive Bayes (54.05% to 96.67%), Decision tree J48 (67.57% to 97.00%), and SMO_npolyk (59.46% to 96.67%). CONCLUSION: With the introduction of three new classification features, the nine-point classification scoring schema achieves improved accuracy in functionally annotating HPs.

Subject(s)

Proteins/classification , Bayes Theorem , Humans
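
One way to picture the nine-point schema described in the abstract above is as a checklist of evidence features per hypothetical protein. The abstract does not say how individual points are scored or combined, so the sketch below assumes each feature is a yes/no flag and that the total is a simple count. The class name, field names, and `total_score` function are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HPEvidence:
    """Nine yes/no evidence flags for one hypothetical protein (HP).

    Assumption: binary flags; the paper's actual scoring may differ.
    """
    # Six features named for the earlier six-point schema
    protein_family: bool = False
    orthology: bool = False
    interaction: bool = False
    bidirectional_best_blast_hit: bool = False
    sorting_signal: bool = False
    known_database: bool = False
    # Three features added in the nine-point schema
    pseudogene_link: bool = False
    homology_model: bool = False
    noncoding_rna_link: bool = False


def total_score(evidence: HPEvidence) -> int:
    """Count how many of the nine flags are set (assumed unweighted sum)."""
    return sum(1 for f in fields(evidence) if getattr(evidence, f.name))


# Usage: an HP supported by orthology and a homology model scores 2 of 9.
example = HPEvidence(orthology=True, homology_model=True)
score = total_score(example)
```

Rows of such flags, one per protein, are the kind of feature table that classifiers such as a perceptron or Naive Bayes could be trained on; the unweighted count here is only an illustration.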

5.

A classification scoring schema to validate protein interactors.

Suravajhala, Prashanth; Sundararajan, Vijayaraghava Seshadri.

Bioinformation ; 8(1): 34-9, 2012.

Article in English | MEDLINE | ID: mdl-22359432

ABSTRACT

Hypothetical protein [HP] annotation poses a great challenge, especially when the protein is putatively linked or mapped to another protein. With protein interaction networks (PIN) prevailing, many visualizers still do not support HP annotation. Through this work, we propose a six-point classification system to validate protein interactions based on diverse features. The HP data-set was used as a training data-set to find putative functional interaction partners for the remaining proteins that have no known interactions. A Total Reliability Score (TRS) was calculated based on the six-point classification and was evaluated using machine learning algorithms on a single node. We found that a multilayer perceptron neural network yielded 81.08% accuracy in modelling TRS, whereas feature selection algorithms confirmed that all classification features are implementable. Furthermore, statistical results using variance and co-variance analyses confirmed the usefulness of these classification metrics. Of all the classification features, subcellular location (sorting signals) had the greatest impact on predicting the function of HPs.
