CLAGen: a tool for clustering and annotating gene sequences using a suffix tree algorithm.

Han, Sang il; Lee, Sung Gun; Kim, Kyung-Hoon; Choi, Chung Jung; Kim, Young Han; Hwang, Kyu Suk.

Biosystems ; 84(3): 175-82, 2006 Jun.

Article in English | MEDLINE | ID: mdl-16384634

ABSTRACT

Most multiple gene sequence alignment methods rely on conventions for scoring a multiple alignment in pairwise fashion. As a result, the runtime of the alignment grows exponentially as the number of sequences increases. To address this problem, this paper presents a multiple sequence alignment method that uses a linear-time suffix tree algorithm to cluster similar sequences in a single pass, without pairwise alignment. The search for common subsequences generated cross-matching common subsequences, some of which were inexact matches, so a procedure for masking the inexact cross-matching pairs is proposed. In addition, BLAST was combined with the clustering tool to annotate the clusters generated by suffix tree clustering. The proposed method for clustering and annotating genes consists of the following steps: (1) construction of a suffix tree; (2) searching and overlapping common subsequences; (3) grouping subsequence pairs; (4) masking cross-matching pairs; (5) clustering gene sequences; (6) annotating gene clusters by BLAST search. The proposed system, CLAGen, was evaluated with 42 gene sequences from the TCA cycle (citrate cycle) of bacteria. The system generated 11 clusters and found the longest subsequences of each cluster, which are biologically significant.
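The idea of clustering sequences by shared subsequences, without aligning every pair, can be illustrated with a minimal Python sketch. This is not the CLAGen implementation. As a loudly stated simplification, it replaces the suffix tree with a hash index of fixed-length substrings, and it omits the overlapping, masking, and BLAST annotation steps. The function name `cluster_by_shared_substrings`, the parameter `k`, and the toy sequences are hypothetical.

```python
from collections import defaultdict


def cluster_by_shared_substrings(seqs, k=8):
    """Group sequences that share at least one exact substring of length k.

    Assumption: a k-mer hash index stands in for the suffix tree used in
    the paper; it only detects exact shared substrings of length k.
    """
    # Index every length-k substring by the names of sequences containing it.
    index = defaultdict(set)
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)

    # Union-find over sequence names; each name starts as its own cluster.
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge all sequences that contain the same substring.
    for names in index.values():
        ordered = sorted(names)
        for other in ordered[1:]:
            parent[find(other)] = find(ordered[0])

    # Collect the members of each cluster under its root.
    clusters = defaultdict(list)
    for name in seqs:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    toy = {
        "a": "ACGTACGTGG",
        "b": "TTACGTACGTCC",
        "c": "GGGGCCCCAAAA",
    }
    print(cluster_by_shared_substrings(toy, k=8))
```

In this toy input, sequences `a` and `b` share the substring `ACGTACGT` and end up in one cluster, while `c` stays alone. A suffix tree would additionally report shared substrings of any length in time linear in the total input size, which a fixed `k` does not capture.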

Subjects

Algorithms; Chromosome Mapping/methods; Multigene Family/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Software; Artificial Intelligence; Base Sequence; Documentation/methods; Molecular Sequence Data; Pattern Recognition, Automated/methods
