Results 1 - 6 of 6
1.
PLoS Comput Biol ; 14(11): e1006547, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30383764

ABSTRACT

Protein or DNA motifs are sequence regions of biological importance, and they are often highly conserved among homologous sequences. Generating multiple sequence alignments (MSAs) in which the conserved sequence motifs are correctly aligned remains difficult, because the contribution of these typically short fragments is overshadowed by the rest of the sequence. Here we extended the PRALINE multiple sequence alignment program with a novel motif-aware MSA algorithm to address this shortcoming. The method incorporates explicit information about the presence of externally provided sequence motifs, which is then used in the dynamic programming step by boosting the amino acid substitution matrix towards the motif. The strength of the boost is controlled by a parameter, α. Using a benchmark set of alignments, we confirm that a good compromise can be found that improves the matching of motif regions without significantly reducing the overall alignment quality. By estimating α on an unrelated set of reference alignments, we find that there is indeed a strong conservation signal for motifs. A number of typical but difficult MSA use cases are explored to exemplify the problems in correctly aligning functional sequence motifs and to show how the motif-aware alignment method can be employed to alleviate these problems.
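
The abstract does not give the exact form of the boost, so the following is only a minimal sketch of the general idea: a pairwise Needleman-Wunsch score in which a constant `alpha` is added whenever both residues being compared lie inside annotated motif positions. The function name `motif_aware_score`, the additive form of the boost, and the simple match/mismatch/gap scheme (used in place of a real substitution matrix) are assumptions made for illustration; this is not PRALINE's implementation.

```python
def motif_aware_score(seq_a, seq_b, motif_a, motif_b, alpha=2.0,
                      match=1.0, mismatch=-1.0, gap=-2.0):
    """Global alignment score with an additive motif boost (illustrative).

    motif_a and motif_b are sets of 0-based positions flagged as motif
    residues in seq_a and seq_b. The additive boost is an assumption;
    the abstract only says the substitution scores are boosted by alpha.
    """
    n, m = len(seq_a), len(seq_b)
    # dp[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            # Boost the substitution score when both positions are motif residues.
            if (i - 1) in motif_a and (j - 1) in motif_b:
                s += alpha
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # align the two residues
                           dp[i - 1][j] + gap,     # gap in seq_b
                           dp[i][j - 1] + gap)     # gap in seq_a
    return dp[n][m]
```

With `alpha=0` the function reduces to an ordinary global alignment score; larger values make the dynamic programming step increasingly prefer paths that pair motif positions with each other, which is the trade-off the abstract describes tuning against overall alignment quality.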


Subjects
Amino Acid Motifs , DNA/chemistry , Proteins/chemistry , Sequence Alignment/standards , Algorithms , Amino Acid Sequence , Conserved Sequence , HIV-1/chemistry , Sequence Homology, Amino Acid , env Gene Products, Human Immunodeficiency Virus/chemistry
2.
Methods Mol Biol ; 1525: 167-189, 2017.
Article in English | MEDLINE | ID: mdl-27896722

ABSTRACT

The increasing importance of Next Generation Sequencing (NGS) techniques has highlighted the key role of multiple sequence alignment (MSA) in comparative structure and function analysis of biological sequences. MSA often leads to fundamental biological insight into sequence-structure-function relationships of nucleotide or protein sequence families. Significant advances have been achieved in this field, and many useful tools have been developed for constructing alignments, although many biological and methodological issues are still open. This chapter first provides some background information and considerations associated with MSA techniques, concentrating on the alignment of protein sequences. Then, a practical overview of currently available methods and a description of their specific advantages and limitations are given, to serve as a helpful guide or starting point for researchers who aim to construct a reliable MSA.


Subjects
Proteins/chemistry , Sequence Alignment/methods , Algorithms , High-Throughput Nucleotide Sequencing/methods , Phylogeny , Proteins/genetics , Sequence Analysis, Protein , Software
3.
Nucleic Acids Res ; 44(8): e72, 2016 May 5.
Article in English | MEDLINE | ID: mdl-26721389

ABSTRACT

Eukaryotic gene expression is regulated by transcription factors (TFs) binding to promoters as well as distal enhancers. TFs recognize short but specific binding sites (TFBSs) that are located within the promoter and enhancer regions. Functionally relevant TFBSs are often highly conserved during evolution, leaving a strong phylogenetic signal. While multiple sequence alignment (MSA) is a potent tool to detect the phylogenetic signal, the current MSA implementations are optimized to align the maximum number of identical nucleotides. This approach might result in the omission of conserved motifs that contain interchangeable nucleotides, such as the ETS motif (IUPAC code: GGAW). Here, we introduce ConBind, a novel method to enhance alignment of short motifs, even if their mutual sequence similarity is only partial. ConBind improves the identification of conserved TFBSs by improving the alignment accuracy of TFBS families within orthologous DNA sequences. Functional validation of the Gfi1b + 13 enhancer reveals that ConBind identifies additional functionally important ETS binding sites that were missed by all other tested alignment tools. In addition to the analysis of known regulatory regions, our web tool is useful for the analysis of TFBSs in previously uncharacterized DNA regions identified through ChIP-sequencing.
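
To make the IUPAC notation concrete: in GGAW the code W stands for A or T, so GGAA and GGAT are both instances of the motif even though they are not identical strings. The sketch below expands an IUPAC nucleotide pattern into a regular expression and reports every occurrence on the forward strand. The helper names `iupac_to_regex` and `find_motif` and the example sequence are invented for illustration, and this is not how ConBind itself works; it only shows why identity-based matching can miss such sites.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Expand an IUPAC motif such as 'GGAW' into a regex such as 'GGA[AT]'."""
    return "".join(IUPAC[code] for code in motif.upper())

def find_motif(seq, motif):
    """Return (start, matched_text) for every forward-strand occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Hypothetical toy sequence: GGAA and GGAT match GGAW, GGAC does not.
hits = find_motif("TTGGAAGCGGATCCGGAC", "GGAW")
```

Scanning the reverse complement as well would be needed to find sites on the opposite strand; that step is omitted here to keep the sketch short.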


Subjects
Computational Biology/methods , DNA-Binding Proteins/metabolism , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , Sequence Alignment/methods , Transcription Factors/metabolism , Animals , Base Sequence , Binding Sites/genetics , Gene Expression Regulation/genetics , Humans , Sequence Analysis, DNA
4.
PLoS One ; 10(5): e0127431, 2015.
Article in English | MEDLINE | ID: mdl-25993129

ABSTRACT

Multiple Sequence Alignment (MSA) methods are typically benchmarked on sets of reference alignments. The quality of the alignment can then be represented by the sum-of-pairs (SP) or column (CS) scores, which measure the agreement between a reference and the corresponding query alignment. Both the SP and CS scores treat all mismatches between a query and reference alignment as equally bad; they do not take into account the separation, in the query alignment, between two amino acids that should have been matched according to the reference alignment. This is significant since the magnitude of alignment shifts is often of relevance in biological analyses, including homology modeling and MSA refinement/manual alignment editing. In this study we develop a new alignment benchmark scoring scheme, SPdist, that takes the degree of discordance of mismatches into account by measuring the sequence distance between mismatched residue pairs in the query alignment. Using this new score along with the standard SP score, we investigate the discriminatory behavior of the new score by assessing how well six different MSA methods perform with respect to BAliBASE reference alignments. The SP score and the SPdist score yield very similar outcomes when the reference and query alignments are close. However, for more divergent reference alignments the SPdist score is able to distinguish between methods that keep alignments approximately close to the reference and those exhibiting larger shifts. We observed that by using SPdist together with SP scoring we were able to better delineate the alignment quality difference between alternative MSA methods. With a case study we exemplify why it is important, from a biological perspective, to consider the separation of mismatches. The SPdist scoring scheme has been implemented in the VerAlign web server (http://www.ibi.vu.nl/programs/veralignwww/). The code for calculating the SPdist score is also available upon request.
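
The difference between counting mismatches and measuring how far off they are can be illustrated with a small sketch. `sp_score` below follows the usual definition of the sum-of-pairs score (the fraction of residue pairs aligned in the reference that are also aligned in the query). `mean_shift` is a deliberately simplified proxy, not the published SPdist formula, which the abstract does not specify: it averages the column offset, in the query, between residues that the reference places in the same column. All function names are hypothetical, and both alignments are assumed to contain the same sequences in the same order, with gaps written as `-`.

```python
from itertools import combinations

def residue_columns(row):
    """Column index of each residue (non-gap character) in an aligned row."""
    return [col for col, ch in enumerate(row) if ch != "-"]

def aligned_pairs(alignment):
    """Set of (seq_s, residue_i, seq_t, residue_j) pairs sharing a column."""
    pairs = set()
    counters = [0] * len(alignment)  # next residue index per sequence
    for col in range(len(alignment[0])):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, counters[s]))
                counters[s] += 1
        for (s, i), (t, j) in combinations(present, 2):
            pairs.add((s, i, t, j))
    return pairs

def sp_score(reference, query):
    """Fraction of reference residue pairs that are reproduced in the query."""
    ref = aligned_pairs(reference)
    return len(ref & aligned_pairs(query)) / len(ref)

def mean_shift(reference, query):
    """Mean column offset in the query between residues paired in the reference.

    Illustrative proxy only; 0.0 means every reference pair is reproduced.
    """
    cols = [residue_columns(row) for row in query]
    ref = aligned_pairs(reference)
    return sum(abs(cols[s][i] - cols[t][j]) for s, i, t, j in ref) / len(ref)

reference = ["ACGT", "ACGT"]
query_near = ["ACGT-", "-ACGT"]      # shifted by one column
query_far = ["ACGT---", "---ACGT"]   # shifted by three columns
```

Both toy queries receive the same SP score of zero, because no reference pair is reproduced exactly, yet the one-column shift is clearly closer to the reference than the three-column shift. A distance-aware measure separates the two cases, which is the behaviour the abstract attributes to SPdist.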


Subjects
Base Pair Mismatch , Sequence Alignment , Benchmarking
5.
Methods Mol Biol ; 1079: 245-62, 2014.
Article in English | MEDLINE | ID: mdl-24170407

ABSTRACT

Profile ALIgNmEnt (PRALINE) is a versatile multiple sequence alignment toolkit. In its main alignment protocol, PRALINE follows the global progressive alignment algorithm. It provides various alignment optimization strategies to address the different situations that call for protein multiple sequence alignment: global profile preprocessing, homology-extended alignment, secondary structure-guided alignment, and transmembrane-aware alignment. A number of combinations of these strategies are enabled as well. PRALINE is accessible via the online server http://www.ibi.vu.nl/programs/PRALINEwww/. The server offers extensive visualization options that aid the interpretation of the generated alignments, which can be written out in PDF format for publication purposes. PRALINE also allows the sequences in the alignment to be represented in a dendrogram to show their mutual relationships according to the alignment. The chapter ends with a discussion of various issues occurring in multiple sequence alignment.


Subjects
Computational Biology/methods , Sequence Alignment/methods , Software , Cell Membrane/metabolism , Internet , Protein Structure, Secondary , Proteins/chemistry , Proteins/metabolism , Sequence Homology
6.
Proc Natl Acad Sci U S A ; 109(28): 11342-7, 2012 Jul 10.
Article in English | MEDLINE | ID: mdl-22733768

ABSTRACT

Mycobacterial pathogens use specialized type VII secretion (T7S) systems to transport crucial virulence factors across their unusual cell envelope into infected host cells. These virulence factors lack classical secretion signals and the mechanism of substrate recognition is not well understood. Here we demonstrate that the model T7S substrates PE25/PPE41, which form a heterodimer, are targeted to the T7S pathway ESX-5 by a signal located in the C terminus of PE25. Site-directed mutagenesis of residues within this C terminus resulted in the identification of a highly conserved motif, i.e., YxxxD/E, which is required for secretion. This motif was also essential for the secretion of LipY, another ESX-5 substrate. Pathogenic mycobacteria have several different T7S systems and we identified a PE protein that is secreted by the ESX-1 system, which allowed us to compare substrate recognition of these two T7S systems. Surprisingly, this ESX-1 substrate contained a C-terminal signal functionally equivalent to that of PE25. Exchange of these C-terminal secretion signals between the PE proteins restored secretion, but each PE protein remained secreted via its own ESX secretion system, indicating that an additional signal(s) provides system specificity. Remarkably, the YxxxD/E motif was also present in and required for efficient secretion of the ESX-1 substrates CFP-10 and EspB. Therefore, our data show that the YxxxD/E motif is a general secretion signal that is present in all known mycobacterial T7S substrates or substrate complexes.
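
The YxxxD/E notation reads as a tyrosine, then any three residues, then an aspartate or a glutamate. As a rough sketch of how such a pattern could be searched for in a protein sequence, the code below uses a regular expression and can optionally restrict the search to a C-terminal window, since the abstract locates the signal in the C terminus. The function name `find_yxxxde`, the `c_terminal_window` parameter, and the example sequence are all invented for illustration; a pattern hit on its own says nothing about whether a protein is actually secreted.

```python
import re

# Y, any three residues, then D or E; the lookahead reports overlapping hits.
YXXXDE = re.compile(r"(?=(Y[A-Z]{3}[DE]))")

def find_yxxxde(protein, c_terminal_window=None):
    """Return (start, matched_text) for each YxxxD/E occurrence.

    Positions are 0-based and refer to the full sequence. When
    c_terminal_window is given, only the last that-many residues are searched.
    """
    seq = protein.upper()
    offset = 0
    if c_terminal_window is not None:
        offset = max(0, len(seq) - c_terminal_window)
    return [(offset + m.start(), m.group(1))
            for m in YXXXDE.finditer(seq[offset:])]

# Hypothetical toy sequence, not a real mycobacterial protein.
toy = "MSAYGKLDAAAYTTTE"
all_hits = find_yxxxde(toy)
c_term_hits = find_yxxxde(toy, c_terminal_window=6)
```

Restricting the window trades sensitivity for specificity: a short pattern such as this one is expected to occur by chance in many proteins, so the position of a hit matters as much as its presence.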


Subjects
Antigens, Bacterial/metabolism , Mycobacterium marinum/genetics , Mycobacterium tuberculosis/genetics , Mycobacterium/metabolism , Virulence Factors/metabolism , Amino Acid Motifs , Amino Acid Sequence , Bacterial Proteins/metabolism , Genome, Bacterial , Models, Biological , Molecular Sequence Data , Multigene Family , Mycobacterium marinum/metabolism , Mycobacterium tuberculosis/metabolism , Protein Structure, Tertiary , Secretory Pathway , Sequence Homology, Amino Acid