Efficient Mining of Interesting Patterns in Large Biological Sequences

Md-Mamunur RASHID; Md-Rezaul KARIM; Byeong-Soo JEONG; Ho-Jin CHOI

Genomics & Informatics ; : 44-50, 2012.

Article in English | WPRIM | ID: wpr-155515

ABSTRACT

Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. In most existing approaches, the number of occurrences is the main measure for determining whether a pattern is interesting. In computational biology, however, a pattern that is not frequent may still be very informative if its actual support exceeds the prior expectation by a large margin. In this paper, we propose a new interestingness measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time.
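To make the idea of "support exceeding prior expectation" concrete, the following is a minimal illustrative sketch, not the measure or index structure defined in the paper (the abstract does not specify either). It assumes a simple k-mer position index for counting, an independent-base background model for the expected count, and a log2 ratio of observed to expected as the score. All function names (`build_kmer_index`, `expected_count`, `surprise_score`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def build_kmer_index(sequence: str, k: int) -> dict:
    """Map every length-k substring to the list of its start positions.

    Assumption: a plain position index, used here only for illustration.
    """
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def expected_count(sequence: str, pattern: str) -> float:
    """Expected number of occurrences under an assumed background model
    in which bases are drawn independently with their observed frequencies."""
    n = len(sequence)
    base_freq = Counter(sequence)
    prob = 1.0
    for base in pattern:
        prob *= base_freq[base] / n
    return prob * (n - len(pattern) + 1)


def surprise_score(sequence: str, pattern: str) -> float:
    """log2(observed / expected); positive when the pattern occurs more
    often than the background model predicts. Hypothetical score."""
    index = build_kmer_index(sequence, len(pattern))
    observed = len(index.get(pattern, []))
    expected = expected_count(sequence, pattern)
    if observed == 0 or expected == 0:
        return 0.0
    return log2(observed / expected)


if __name__ == "__main__":
    seq = "ACGTACGTACGT"
    print(surprise_score(seq, "ACGT"))
```

Under this sketch, a pattern with only a handful of occurrences can still score highly when the background model predicts far fewer, which is the situation the abstract describes for infrequent but informative patterns.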

Subjects

Base Sequence; Computational Biology; DNA; Mining

DNA sequence; index-based method; information gain; pattern mining

Full text: Available
Index: WPRIM (Western Pacific)
Main subject: DNA / Base Sequence / Computational Biology / Mining
Language: English
Journal: Genomics & Informatics
Year of publication: 2012
Document type: Article
