ABSTRACT
SUMMARY: ChIP-Seq data pose a new challenge for motif discovery. Such data typically consist of thousands of DNA segments with base-specific coverage values. We present a new version of our DNA motif discovery software, ChIPMunk, adapted for ChIP-Seq data. ChIPMunk is an iterative algorithm that combines greedy optimization with bootstrapping and uses coverage profiles as motif positional preferences. It does not require truncation of long DNA segments and is practical for processing up to tens of thousands of sequences. Comparison with traditional (MEME) and ChIP-Seq-oriented (HMS) motif discovery tools shows that ChIPMunk identifies the correct motifs with the same or better quality but runs dramatically faster.
AVAILABILITY AND IMPLEMENTATION: ChIPMunk is freely available within the ru_genetika Java package: http://line.imb.ac.ru/ChIPMunk. A web-based version is also available.
CONTACT: ivan.kulakovskiy@gmail.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
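The idea of using coverage profiles as positional preferences inside an iterative greedy search can be illustrated with a minimal sketch. This is not ChIPMunk's actual implementation: the scoring function (log-odds against a uniform background plus the log of mean coverage), the stopping rule, and the names `build_pwm`, `best_site`, and `refine` are all assumptions made for illustration, and the bootstrapping step is omitted.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Log-odds matrix (uniform background) from equal-length aligned sites."""
    k = len(sites[0])
    pwm = []
    for i in range(k):
        counts = {b: pseudocount for b in ALPHABET}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log((counts[b] / total) / 0.25) for b in ALPHABET})
    return pwm


def best_site(seq, coverage, pwm):
    """Start index that maximizes motif score plus log mean coverage.

    The coverage term is an assumed stand-in for a positional preference:
    windows under the coverage peak are favored over windows in the tails.
    """
    k = len(pwm)
    best_pos, best_score = 0, -math.inf
    for pos in range(len(seq) - k + 1):
        motif_score = sum(pwm[i][seq[pos + i]] for i in range(k))
        mean_cov = sum(coverage[pos:pos + k]) / k
        score = motif_score + math.log(mean_cov + 1e-9)
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos


def refine(seqs, coverages, seed_sites, n_iter=10):
    """Greedy loop: pick the best site per sequence, rebuild the matrix, repeat."""
    pwm = build_pwm(seed_sites)
    positions = None
    for _ in range(n_iter):
        new_positions = [best_site(s, c, pwm) for s, c in zip(seqs, coverages)]
        if new_positions == positions:  # site selection is stable: stop
            break
        positions = new_positions
        k = len(pwm)
        pwm = build_pwm([s[p:p + k] for s, p in zip(seqs, positions)])
    return positions, pwm
```

Here `seqs` is a list of DNA strings over A, C, G, T, `coverages` is a parallel list of per-base coverage values, and `seed_sites` is a list of equal-length starting words. Because whole segments are scanned and the coverage term only reweights positions, no truncation of long segments is needed in this sketch.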
Subjects
Chromatin Immunoprecipitation/methods; Sequence Analysis, DNA/methods; Base Sequence; Binding Sites; DNA/chemistry; DNA/metabolism; Databases, Factual
ABSTRACT
Micro- and minisatellites make up an essential part of low-complexity DNA and carry several important functions. A search for tandem repeats in the human genome with a repeat unit length of up to 70 bp, including repeats carrying a large number of nucleotide substitutions, was performed using the TaadeaSWAN program. For a considerable number of minisatellites with a repeat unit shorter than 25 nt, a shorter repeating motif can be distinguished within the repeat unit; this motif is often similar to minisatellite sequences that occur widely in the human genome. A model of the hierarchical origin of minisatellites in the human genome is proposed.
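The observation that a repeat unit can itself contain a shorter repeating motif can be illustrated with a small sketch. This is not the method used by the program named in the abstract: the mismatch-fraction criterion, the threshold value, and the function name `shortest_subperiod` are assumptions chosen for illustration only.

```python
from collections import Counter


def shortest_subperiod(unit, max_mismatch_frac=0.25):
    """Return (period, consensus) for the shortest approximate sub-period.

    A period p is accepted when the fraction of positions i with
    unit[i] != unit[i + p] does not exceed max_mismatch_frac.
    Returns None if no period up to half the unit length qualifies.
    """
    n = len(unit)
    for p in range(1, n // 2 + 1):
        mismatches = sum(1 for i in range(n - p) if unit[i] != unit[i + p])
        if mismatches / (n - p) <= max_mismatch_frac:
            # Consensus: most common base in each column i mod p.
            consensus = "".join(
                Counter(unit[j::p]).most_common(1)[0][0] for j in range(p)
            )
            return p, consensus
    return None


# A 12-nt unit built from four copies of CAG with one substitution.
print(shortest_subperiod("CAGCAGCAACAG"))  # (3, 'CAG')
```

Allowing a nonzero mismatch fraction is what lets the check tolerate nucleotide substitutions; with `max_mismatch_frac=0` only exact sub-repeats would be reported.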