Search | Index Medicus Global

A Genetic Algorithm Approach for Semi-Supervised Clustering Algorithm

Pavithra, M.; Parvathi, R. M. S.

Indian J Med Sci ; 2018 Sep; 70(3): 5-12

Article | IMSEAR | ID: sea-196499

ABSTRACT

Introduction: A semi-supervised clustering algorithm is proposed that combines the benefits of supervised and unsupervised learning methods. The approach allows unlabeled data with no known class to be used to improve classification accuracy [2]. The objective function of an unsupervised technique, e.g. K-means clustering, is modified to minimize both the cluster dispersion of the input attributes and a measure of cluster impurity based on the class labels. Minimizing the cluster dispersion of the examples is a form of capacity control to prevent overfitting [4]. For the output labels, impurity measures from decision tree algorithms, such as the Gini index, can be used. A genetic algorithm optimizes the objective function to produce clusters. Experimental results show that using class information improves the generalization ability compared to unsupervised methods based only on the input attributes [6]. Training using information from unlabeled data can improve classification accuracy on that data as well. Genetic Algorithms (GAs) have been widely used in optimization problems for their ability to find good, acceptable solutions within limited time. Clustering ensembles have emerged as another approach for generating more stable and robust partitions from existing clusters [1]. GAs have made a major contribution to finding consensus cluster partitions in clustering ensembles. Web video categorization has become a challenging research area with the growing popularity of the social web. In this paper, we propose a framework for web video categorization using textual features, video relations and web support [3]. There are three contributions in this research work. First, we extend the traditional Vector Space Model (VSM) in a more generic manner as the Semantic VSM (S-VSM) by including the semantic similarity between feature terms [5].
This new model improves the clustering quality in terms of compactness (high intra-cluster similarity) and clearness (low inter-cluster similarity). Second, we optimize the clustering ensemble process with the help of a GA using a novel fitness function. We define a new measure, Pre-Paired Percentage (PPP), to be used as the fitness function during the genetic cycle for optimization of the clustering ensemble process [7]. Third, the most important and crucial step of the GA is to define the genetic operators, crossover and mutation. We express these operators through an intelligent clustering-ensemble mechanism. This approach produces more logical offspring solutions [9]. All three contributions have shown remarkable results in their corresponding areas. Experiments on real-world social-web data have been performed to validate our new incremental novelties [8].
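
The combined objective described in the abstract (cluster dispersion of the inputs plus a Gini-based impurity of the class labels) might be evaluated along the following lines. This is only a minimal sketch: the size-weighted sum, the `beta` trade-off parameter, and all function names are assumptions rather than the authors' formulation, and the genetic algorithm that would search over `assignment` is omitted.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def dispersion(points):
    """Mean squared Euclidean distance of the points to their centroid."""
    n = len(points)
    if n == 0:
        return 0.0
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    return sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
    ) / n

def objective(points, labels, assignment, k, beta=1.0):
    """Size-weighted dispersion plus beta times Gini impurity, summed over clusters.

    labels[i] is None for an unlabeled point; such points contribute to the
    dispersion term only. The weighting scheme is an assumption.
    """
    n = len(points)
    total = 0.0
    for c in range(k):
        members = [i for i in range(n) if assignment[i] == c]
        cluster_points = [points[i] for i in members]
        known_labels = [labels[i] for i in members if labels[i] is not None]
        total += (len(members) / n) * (
            dispersion(cluster_points) + beta * gini(known_labels)
        )
    return total

# Hypothetical toy data: four 2-D points, one of them unlabeled.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = ["a", None, "b", "b"]
print(objective(points, labels, [0, 0, 1, 1], k=2))  # compact, pure clusters
print(objective(points, labels, [0, 1, 0, 1], k=2))  # spread-out, mixed clusters
```

A candidate partition that mixes classes or spreads points widely scores higher, so a search procedure such as a GA would prefer the first assignment.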

Knowledge-based analysis of functional impacts of mutations in microRNA seed regions.

Bhattacharya, Anindya; Cui, Yan.

J Biosci ; 2015 Oct; 40(4): 791-798

Article in English | IMSEAR | ID: sea-181464

ABSTRACT

MicroRNAs are a class of important post-transcriptional regulators. Genetic and somatic mutations in miRNAs, especially those in the seed regions, have profound and broad impacts on gene expression and on physiological and pathological processes. Over 500 SNPs were mapped to the miRNA seeds, which are located at positions 2–8 of the mature miRNA sequences. We found that the central positions of the miRNA seeds contain fewer genetic variants and are therefore more evolutionarily conserved than the peripheral positions in the seeds. We developed a knowledge-based method to analyse the functional impacts of mutations in miRNA seed regions. We computed the gene ontology-based similarity score (GOSS) and the GOSS percentile score for all 517 SNPs in miRNA seeds. In addition to annotating SNPs for their functional effects, in the present article we also present a detailed analysis pipeline for finding the key functional changes for seed SNPs. We performed a detailed gene ontology graph-based analysis of enriched functional categories for miRNA target gene sets. In the analysis of a SNP in the seed region of hsa-miR-96, we found that two key biological processes for progressive hearing loss, ‘Neurotrophin TRK receptor signaling pathway’ and ‘Epidermal growth factor receptor signaling pathway’, were significantly and differentially enriched by the two sets of allele-specific target genes of hsa-miR-96.
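
As a small illustration of the seed definition used above (positions 2–8 of the mature sequence, counted from 1), the sketch below extracts a seed and applies a single-nucleotide substitution. The function names and the example sequence are hypothetical placeholders, not data or code from the article.

```python
SEED_START, SEED_END = 2, 8  # 1-based, inclusive, as stated in the abstract

def seed_region(mature):
    """Return positions 2-8 (1-based) of a mature miRNA sequence."""
    return mature[SEED_START - 1:SEED_END]

def in_seed(position):
    """True if a 1-based position in the mature sequence lies in the seed."""
    return SEED_START <= position <= SEED_END

def apply_snp(mature, position, alt):
    """Return the sequence with the base at a 1-based position replaced by alt."""
    if not 1 <= position <= len(mature):
        raise ValueError("position outside the mature sequence")
    return mature[:position - 1] + alt + mature[position:]

# Made-up 22-nt sequence, used purely for illustration.
reference = "AUGCAUGCAUGCAUGCAUGCAU"
variant = apply_snp(reference, 4, "G")
print(seed_region(reference), seed_region(variant), in_seed(4))
```

Comparing the reference and variant seeds in this way is the starting point for deriving the two allele-specific target gene sets; the target prediction step itself is outside the scope of this sketch.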

Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

Jihye KIM; Ji-Sun KWON; Sangsoo KIM.

Genomics & Informatics ; : 135-141, 2013.

Article in English | WPRIM | ID: wpr-58523

ABSTRACT

Gene set analysis is a powerful tool for interpreting genome-wide association study results and is gaining popularity. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using the gene set analysis software GSA-SNP, identifying a set of GO terms significantly associated with each trait (corrected p < 0.05). Pairwise comparison of the traits in terms of the semantic similarity of their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey suggests that these traits may be regulated partly by common pathways that involve neuronal or nervous systems.
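
The pairwise comparison of traits by their GO term sets could be sketched roughly as follows. Note the substitution: the abstract refers to semantic similarity, whereas this example uses plain Jaccard overlap of term sets as a simplified stand-in. The trait names and term identifiers are hypothetical placeholders.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap of two term sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_similarity(go_sets):
    """Map each unordered trait pair (in sorted order) to its set similarity."""
    return {
        (t1, t2): jaccard(go_sets[t1], go_sets[t2])
        for t1, t2 in combinations(sorted(go_sets), 2)
    }

# Placeholder traits and term labels, not results from the study.
go_sets = {
    "trait_1": {"term_a", "term_b", "term_c"},
    "trait_2": {"term_b", "term_c", "term_d"},
    "trait_3": {"term_e"},
}
for pair, score in pairwise_similarity(go_sets).items():
    print(pair, score)
```

A GO-aware semantic measure would also credit related but non-identical terms, which set overlap alone cannot capture.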

Subjects

Genome-Wide Association Study, Genotype, Hydrogen-Ion Concentration, Korea, Molecular Epidemiology, Neurons, Single Nucleotide Polymorphism, Semantics
