Efficient Mining of Interesting Patterns in Large Biological Sequences

Md-Mamunur RASHID; Md-Rezaul KARIM; Byeong-Soo JEONG; Ho-Jin CHOI

Genomics & Informatics ; : 44-50, 2012.

Article in English | WPRIM | ID: wpr-155515

Responsible library: WPRO

ABSTRACT

Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. In most existing approaches, the number of occurrences is the main measure for determining whether a pattern is interesting. In computational biology, however, a pattern that is not frequent may still be highly informative if its actual support exceeds the prior expectation by a large margin. In this paper, we propose a new interestingness measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time.
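
The abstract does not define the proposed measure, so the following is only an illustrative sketch of the general idea of comparing a pattern's observed support with a prior expectation. It assumes, purely for illustration, that the prior expectation comes from a model in which bases are drawn independently with the sequence's own base frequencies. The function names (count_occurrences, expected_occurrences, surprise_ratio) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import prod


def count_occurrences(seq, pattern):
    """Count occurrences of pattern in seq, including overlapping ones."""
    k = len(pattern)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern)


def expected_occurrences(seq, pattern):
    """Expected count under an assumed independent-base background model.

    Each base is assumed to occur with its empirical frequency in seq;
    this is an illustrative assumption, not the paper's definition.
    """
    n = len(seq)
    freq = Counter(seq)
    # Probability of the pattern at one fixed position.
    p = prod(freq[base] / n for base in pattern)
    # Number of positions where the pattern could start.
    positions = max(n - len(pattern) + 1, 0)
    return p * positions


def surprise_ratio(seq, pattern):
    """Observed support divided by expected support (values > 1 are over-represented)."""
    expected = expected_occurrences(seq, pattern)
    if expected == 0:
        # The pattern cannot occur under the background model, so it was not observed.
        return 0.0
    return count_occurrences(seq, pattern) / expected


if __name__ == "__main__":
    dna = "ACGTACGTACGT"
    for pat in ("ACGT", "AA"):
        print(pat, count_occurrences(dna, pat), surprise_ratio(dna, pat))
```

Under this assumed background model, a pattern with only a few occurrences can still have a large ratio when its expected count is very small, which is the kind of case the abstract describes as informative despite being infrequent.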

Subject(s)

Base Sequence; Computational Biology; DNA; Mining

Keywords

DNA sequence; index-based method; information gain; pattern mining

Full text: 1 Index: WPRIM Main subject: DNA / Base Sequence / Computational Biology / Mining Language: En Journal: Genomics & Informatics Year: 2012 Document type: Article
