Support vector data description for finding non-coding RNA gene / 生物医学工程学杂志
Journal of Biomedical Engineering ; (6): 779-784, 2010.
Article in Chinese | WPRIM | ID: wpr-230785
ABSTRACT
In computational molecular biology, detecting non-coding RNA genes among large numbers of unlabeled sequences remains a challenging problem. Machine learning and classification methods are generally employed to address it. However, only a limited number of positive training samples and unlabeled samples are available. Negative samples are difficult to define appropriately, yet conventional learning-then-classification methods require them. Most existing non-coding RNA gene finders generate random sequences as negative samples, and these may share some characteristics of the positive sample sequences. This introduces an artificial source of uncertainty and degrades performance. In this paper, Support Vector Data Description (SVDD) is used for learning and classification to detect non-coding RNA genes among large numbers of unlabeled sequences, and the k-means clustering algorithm is applied before SVDD training to address the high false-positive rate of SVDD results. The training samples (target samples) are experimentally validated non-coding RNA genes. Appropriate features were constructed by Principal Component Analysis (PCA). The effectiveness and performance of the method are demonstrated by tests on cases from the NONCODE database and the E. coli genome.
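The pipeline outlined in the abstract (PCA features, k-means clustering, then a one-class description trained only on target samples) might be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. It assumes scikit-learn and NumPy are available, and it substitutes scikit-learn's `OneClassSVM` with an RBF kernel, which is known to be equivalent to SVDD for that kernel. The dinucleotide feature map `kmer_features`, the helper names `fit_detector` and `predict`, the choice of one model per k-means cluster, and all parameter values are assumptions made for the example; the abstract does not specify them.

```python
from itertools import product

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

# The 16 RNA dinucleotides, used as a hypothetical raw feature set.
KMERS = ["".join(p) for p in product("ACGU", repeat=2)]


def kmer_features(seq):
    """Assumed feature map: dinucleotide frequencies of one sequence."""
    seq = seq.upper().replace("T", "U")
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 1):
        kmer = seq[i:i + 2]
        if kmer in counts:  # skip ambiguous bases such as N
            counts[kmer] += 1
    total = max(sum(counts.values()), 1)
    return np.array([counts[k] / total for k in KMERS])


def fit_detector(target_seqs, n_components=3, n_clusters=2, nu=0.1):
    """Fit PCA, cluster the targets, then train one model per cluster."""
    X = np.array([kmer_features(s) for s in target_seqs])
    pca = PCA(n_components=n_components).fit(X)
    Z = pca.transform(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(Z)
    # One tighter boundary per cluster instead of a single loose one
    # (an assumed reading of "k-means before SVDD training").
    models = [
        OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(Z[km.labels_ == c])
        for c in range(n_clusters)
    ]
    return pca, models


def predict(pca, models, seqs):
    """Return True for sequences accepted by at least one cluster model."""
    Z = pca.transform(np.array([kmer_features(s) for s in seqs]))
    votes = np.column_stack([m.predict(Z) == 1 for m in models])
    return votes.any(axis=1)
```

In this sketch an unlabeled sequence is flagged as a candidate if any per-cluster boundary accepts it. Fitting several tighter boundaries rather than one loose boundary is one plausible way clustering could reduce false positives, but the actual mechanism used in the paper is not stated in the abstract.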
Subject(s)
Full text: Available
Index: WPRIM (Western Pacific)
Main subject: Algorithms / Pattern Recognition, Automated / Cluster Analysis / RNA, Untranslated / Escherichia coli / Support Vector Machine / Genetics / Methods
Limits: Humans
Language: Chinese
Journal: Journal of Biomedical Engineering
Year: 2010
Type: Article
