1.
Bioinformatics ; 25(1): 22-9, 2009 Jan 01.
Article in English | MEDLINE | ID: mdl-19008249

ABSTRACT

MOTIVATION: Cys2His2 zinc finger (ZF) proteins represent the largest class of eukaryotic transcription factors. Their modular structure and well-conserved protein-DNA interface allow the development of computational approaches for predicting their DNA-binding preferences even when no binding sites are known for a particular protein. The 'canonical model' for ZF protein-DNA interaction consists of only four amino acid-nucleotide contacts per zinc finger domain.

RESULTS: We present an approach for predicting ZF binding based on support vector machines (SVMs). While most previous computational approaches have been based solely on examples of known ZF protein-DNA interactions, ours additionally incorporates information about protein-DNA pairs known to bind weakly or not at all. Moreover, SVMs with a linear kernel can naturally incorporate constraints about the relative binding affinities of protein-DNA pairs; this type of information has not been used previously in predicting ZF protein-DNA binding. Here, we build a high-quality literature-derived experimental database of ZF-DNA binding examples and utilize it to test both linear and polynomial kernels for predicting ZF protein-DNA binding on the basis of the canonical binding model. The polynomial SVM outperforms previously published prediction procedures as well as the linear SVM. This may indicate the presence of dependencies between contacts in the canonical binding model and suggests that modification of the underlying structural model may result in further improved performance in predicting ZF protein-DNA binding. Overall, this work demonstrates that methods incorporating information about non-binding and relative binding of protein-DNA pairs have great potential for effective prediction of protein-DNA interactions.

AVAILABILITY: An online tool for predicting ZF DNA binding is available at http://compbio.cs.princeton.edu/zf/.
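
The idea that a linear model can absorb binders, non-binders, and relative-affinity comparisons as constraints of one form can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot encoding of the four contacts, the subgradient training loop, all function names, and the toy contact pairs are assumptions made for illustration only, and no real binding data is used.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASES = "ACGT"
N_CONTACTS = 4  # the canonical model uses four contacts per finger

# Assumed encoding: one binary feature per (contact, amino acid, base) triple.
FEATURE_INDEX = {
    key: i
    for i, key in enumerate(product(range(N_CONTACTS), AMINO_ACIDS, BASES))
}


def encode(contacts):
    """Map four (amino_acid, base) pairs to the set of active feature indices."""
    return {FEATURE_INDEX[(i, aa, b)] for i, (aa, b) in enumerate(contacts)}


def score(weights, features):
    """Linear score: the sum of the weights of the active features."""
    return sum(weights[j] for j in features)


def train(binders, non_binders, stronger_than, epochs=50, lr=0.1, reg=0.01):
    """Hinge-loss subgradient training on constraints of a single form.

    Every constraint is a pair (pos, neg) asking for
    score(pos) - score(neg) >= 1:
      binder x          -> (x, empty)  i.e. score(x) >= 1
      non-binder x      -> (empty, x)  i.e. score(x) <= -1
      a binds better b  -> (a, b)      i.e. score(a) - score(b) >= 1
    """
    constraints = [(encode(x), set()) for x in binders]
    constraints += [(set(), encode(x)) for x in non_binders]
    constraints += [(encode(a), encode(b)) for a, b in stronger_than]

    weights = [0.0] * len(FEATURE_INDEX)
    for _ in range(epochs):
        for pos, neg in constraints:
            if score(weights, pos) - score(weights, neg) < 1.0:
                for j in pos:
                    weights[j] += lr
                for j in neg:
                    weights[j] -= lr
        # Simple L2 shrinkage once per epoch (an assumed regulariser).
        weights = [w * (1.0 - lr * reg) for w in weights]
    return weights


# Hypothetical toy examples, not taken from any database.
STRONG = [("R", "G"), ("D", "C"), ("E", "C"), ("R", "G")]
WEAKER = [("R", "G"), ("D", "C"), ("E", "A"), ("R", "G")]
NON_BINDER = [("R", "A"), ("D", "C"), ("E", "T"), ("R", "A")]

weights = train([STRONG], [NON_BINDER], [(STRONG, WEAKER)])
print(score(weights, encode(STRONG)), score(weights, encode(NON_BINDER)))
```

Because the score is a sum over individual contacts, this linear sketch cannot represent dependencies between contacts. A polynomial kernel would add products of contact features, which is one way to read the abstract's remark that the polynomial SVM's advantage may indicate such dependencies.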


Subjects
Cysteine/metabolism; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/metabolism; DNA/metabolism; Histidine/metabolism; Models, Biological; Zinc Fingers; Amino Acid Sequence; Base Sequence; Computational Biology; Databases, Nucleic Acid; Early Growth Response Protein 1/chemistry; Early Growth Response Protein 1/genetics; Oligonucleotide Array Sequence Analysis; Reproducibility of Results
2.
Bioinformatics ; 20(18): 3516-25, 2004 Dec 12.
Article in English | MEDLINE | ID: mdl-15297295

ABSTRACT

MOTIVATION: An important step in unravelling the transcriptional regulatory network of an organism is to identify, for each transcription factor, all of its DNA binding sites. Several approaches are commonly used in searching for a transcription factor's binding sites, including consensus sequences and position-specific scoring matrices. In addition, methods that compute the average number of nucleotide matches between a putative site and all known sites can be employed. Such basic approaches can all be naturally extended by incorporating pairwise nucleotide dependencies and per-position information content. In this paper, we evaluate the effectiveness of these basic approaches and their extensions in finding binding sites for a transcription factor of interest without erroneously identifying other genomic sequences.

RESULTS: In cross-validation testing on a dataset of Escherichia coli transcription factors and their binding sites, we show that there are statistically significant differences in how well various methods identify transcription factor binding sites. The use of per-position information content improves the performance of all basic approaches. Furthermore, including local pairwise nucleotide dependencies within binding site models results in statistically significant performance improvements for approaches based on nucleotide matches. Based on our analysis, the best results when searching for DNA binding sites of a particular transcription factor are obtained by methods that incorporate both information content and local pairwise correlations.

AVAILABILITY: The software is available at http://compbio.cs.princeton.edu/bindsites.
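
A position-specific scoring matrix and one possible way of folding in per-position information content can be sketched as follows. This is a minimal illustration under stated assumptions, not the software referenced above: the uniform background, the pseudocount, the choice to multiply each column's log-odds by that column's information content, the function names, and the toy sites are all hypothetical. Pairwise nucleotide dependencies are not modelled here.

```python
import math
from collections import Counter

BASES = "ACGT"


def column_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies from aligned, equal-length known sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    freqs = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        freqs.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return freqs


def information_content(freq, background=0.25):
    """Relative entropy of one column against a uniform background, in bits."""
    return sum(p * math.log2(p / background) for p in freq.values() if p > 0)


def pssm_score(candidate, freqs, background=0.25, weight_by_ic=False):
    """Sum of per-position log-odds, optionally weighted by column
    information content (an assumed weighting scheme)."""
    total = 0.0
    for base, freq in zip(candidate, freqs):
        log_odds = math.log2(freq[base] / background)
        weight = information_content(freq, background) if weight_by_ic else 1.0
        total += weight * log_odds
    return total


# Hypothetical aligned sites, for illustration only.
KNOWN_SITES = ["TTGACA", "TTGACA", "TTGATA", "TTTACA"]
freqs = column_frequencies(KNOWN_SITES)

for candidate in ("TTGACA", "ACGTAC"):
    print(candidate,
          pssm_score(candidate, freqs),
          pssm_score(candidate, freqs, weight_by_ic=True))
```

With the weighting enabled, strongly conserved columns contribute more to the score than variable ones, which is the intuition behind using per-position information content when ranking putative sites.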


Subjects
Algorithms; DNA, Bacterial/chemistry; DNA-Binding Proteins/chemistry; Escherichia coli/genetics; Escherichia coli/metabolism; Sequence Analysis, DNA/methods; Transcription Factors/chemistry; Benchmarking; Binding Sites; Protein Binding; Software; Software Validation