Results 1 - 20 of 22
1.
Bioinformatics ; 21(13): 2933-42, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15860560

ABSTRACT

MOTIVATION: Promoter analysis is an essential step toward identifying regulatory networks. A prerequisite for successful promoter analysis is the prediction of potential transcription factor binding sites (TFBS) with reasonable accuracy. The next steps in promoter analysis can be tackled only with reliable predictions, e.g. finding phylogenetically conserved patterns or identifying higher order combinations of sites in promoters of co-regulated genes. RESULTS: We present a new version of the program MatInspector that identifies TFBS in nucleotide sequences using a large library of weight matrices. By introducing a matrix family concept, optimized thresholds, and comparative analysis, the enhanced program produces concise results, avoiding redundant and false-positive matches. We describe a number of programs based on MatInspector allowing in-depth promoter analysis (DiAlignTF, FrameWorker) and targeted design of regulatory sequences (SequenceShaper).
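
As a rough illustration of what scanning a sequence with a weight matrix involves, the sketch below scores every window of a sequence against a small log-odds matrix and reports windows at or above a threshold. It is a generic textbook formulation, not MatInspector's actual scoring scheme; the toy counts, the pseudocount, the uniform background, and the threshold are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 3)))
    return hits

# Hypothetical counts for a 4-bp "TATA" site (invented, not from any real library).
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

hits = scan("GGTATACC", log_odds_matrix(TOY_COUNTS), threshold=5.0)
print(hits)
```

With this toy matrix only the window starting at position 2 (`TATA`) passes the threshold. Raising or lowering the threshold trades false positives against missed sites, which is the kind of tuning the abstract refers to as optimized thresholds.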


Subjects
Algorithms; Promoter Regions, Genetic/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Transcription Factors/genetics; Transcription, Genetic/genetics; Base Sequence; Binding Sites; Conserved Sequence; Molecular Sequence Data; Protein Binding; Sequence Homology, Nucleic Acid; Software
2.
In Silico Biol ; 2(1): S17-26, 2002.
Article in English | MEDLINE | ID: mdl-11808874

ABSTRACT

Transcriptional regulation depends on the binding of transcription factors to their corresponding binding sites. The response to cellular signals is often mediated by the cooperative binding of transcription factors to well defined regulatory modules consisting of at least two transcription factor binding sites. Such regulatory modules can be responsible for the common regulation of genes within a gene class or confer a common function to promoters belonging to different gene classes. We developed in silico models representing a common framework of potential regulatory sites specific for one promoter class (actins). We also generated models for two different functional promoter modules both of which confer responsiveness to tumor necrosis factor (TNF) and interferon (IFN) to a variety of promoters. All models exhibited high selectivity, e.g. the mammalian muscle actin promoter model produced no false negatives in a database search.


Subjects
Gene Expression Regulation; Genes; Regulatory Sequences, Nucleic Acid; Software; Actins/genetics; Animals; Binding Sites; Humans; Interferons/genetics; Muscle, Skeletal/physiology; Promoter Regions, Genetic; Transcription Factors/metabolism; Tumor Necrosis Factor-alpha/genetics
3.
Genome Res ; 11(3): 333-40, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11230158

ABSTRACT

The publication of the first almost complete sequence of a human chromosome (chromosome 22) is a major milestone in human genomics. Together with the sequence, an excellent annotation of genes was published which certainly will serve as an information resource for numerous future projects. We noted that the annotation did not cover regulatory regions; in particular, no promoter annotation has been provided. Here we present an analysis of the complete published chromosome 22 sequence for promoters. A recent breakthrough in specific in silico prediction of promoter regions enabled us to attempt large-scale prediction of promoter regions on chromosome 22. Scanning of sequence databases revealed only 20 experimentally verified promoters, of which 10 were correctly predicted by our approach. Nearly 40% of our 465 predicted promoter regions are supported by the currently available gene annotation. Promoter finding also provides a biologically meaningful method for "chromosomal scaffolding", by which long genomic sequences can be divided into segments starting with a gene. As one example, the combination of promoter region prediction with exon/intron structure predictions greatly enhances the specificity of de novo gene finding. The present study demonstrates that it is possible to identify promoters in silico on the chromosomal level with sufficient reliability for experimental planning and indicates that a wealth of information about regulatory regions can be extracted from current large-scale (megabase) sequencing projects. Results are available on-line at http://genomatix.gsf.de/chr22/.


Subjects
Chromosomes, Human, Pair 22/genetics; Computational Biology; Human Genome Project; Promoter Regions, Genetic/genetics; Sequence Analysis, DNA; Algorithms; Computational Biology/methods; Computational Biology/trends; Forecasting; Humans; Reproducibility of Results; Sequence Analysis, DNA/trends; Software Validation
4.
Bioinformatics ; 15(3): 180-6, 1999 Mar.
Article in English | MEDLINE | ID: mdl-10222404

ABSTRACT

MOTIVATION: Gene regulation often depends on functional modules which feature a detectable internal organization. Overall sequence similarity of these modules is often insufficient for detection by general search methods like FASTA or even Gapped BLAST. However, it is of interest to evaluate whether modules, often known from experimental analysis of single sequences, are present in other regulatory sequences. RESULTS: We developed a new method (FastM) which combines a search algorithm for individual transcription factor binding sites (MatInspector) with a distance correlation function. FastM allows fast definition of a model of correlated binding sites derived from as little as a single promoter or enhancer. ModelInspector results are suitable for evaluation of the significance of the model. We used FastM to define a model for the experimentally verified NFkappaB/IRF1 regulatory module from the major histocompatibility complex (MHC) class I HLA-B gene promoter. Analysis of a test set of sequences as well as database searches with this model showed excellent correlation of the model with the biological function of the module. These results could not be obtained by searches using FASTA or Gapped BLAST, which are based on sequence similarity. We were also able to demonstrate association of a hypothetical GRE-GRE module with viral sequences based on analysis of several GenBank sections with this module. AVAILABILITY: The WWW version of FastM is accessible at: http://www.gsf.de/cgi-bin/fastm.pl and http://genomatix.gsf.de/cgi-bin/fastm2/fastm.pl
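
The idea of combining individual site matches with a distance constraint can be sketched in a few lines. The example below is only an illustration of the general principle: it uses exact string matching in place of MatInspector's matrix search, and the motif strings and gap limits are hypothetical placeholders rather than the published NFkappaB/IRF1 model.

```python
def find_sites(sequence, motif):
    """Start positions of every exact occurrence of motif (overlaps allowed)."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions

def find_modules(sequence, first_motif, second_motif, min_gap, max_gap):
    """Pairs (p1, p2) where the second site starts min_gap..max_gap bases
    after the end of the first site."""
    modules = []
    second_positions = find_sites(sequence, second_motif)
    for p1 in find_sites(sequence, first_motif):
        end_of_first = p1 + len(first_motif)
        for p2 in second_positions:
            gap = p2 - end_of_first
            if min_gap <= gap <= max_gap:
                modules.append((p1, p2))
    return modules

# Hypothetical two-element module: "GGGA" followed by "TTTC" 2 to 6 bases later.
toy_sequence = "CCGGGAACGATTTCGG"
print(find_modules(toy_sequence, "GGGA", "TTTC", min_gap=2, max_gap=6))
```

In the toy sequence the first element starts at position 2 and the second at position 10, four bases after the first ends, so the pair is reported; tightening `max_gap` to 3 rejects it. A model of this kind can match sequences that share no overall similarity, which is why similarity-based searches would miss them.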


Subjects
Models, Genetic; Promoter Regions, Genetic; Sequence Alignment/methods; Software; Algorithms; Base Sequence; Binding Sites/genetics; DNA/genetics; Databases, Factual; HLA Antigens/genetics; Humans; Interferon-beta/genetics; Molecular Sequence Data; Sequence Alignment/statistics & numerical data; Sequence Homology, Nucleic Acid; Transcription Factors/metabolism; beta 2-Microglobulin/genetics
5.
Bioinformatics ; 14(3): 290-4, 1998.
Article in English | MEDLINE | ID: mdl-9614273

ABSTRACT

MOTIVATION: DIALIGN is a new method for pairwise as well as multiple alignment of nucleic acid and protein sequences. While standard alignment programs rely on comparing single residues and imposing gap penalties, DIALIGN constructs alignments by comparing whole segments of the sequences. No gap penalty is employed. This point of view is especially adequate if sequences are not globally related, but share only local similarities, as is the case in genomic DNA sequences and in many protein families. RESULTS: Using four different data sets, we show that DIALIGN is able correctly to align conserved motifs in protein sequences. Alignments produced by DIALIGN are compared systematically to the results of five other alignment programs. AVAILABILITY: DIALIGN is available to the scientific community free of charge for non-commercial use. Executables for various UNIX platforms including LINUX can be downloaded at http://www.gsf.de/biodv/dialign.html CONTACT: werner, morgenstern@gsf.de


Subjects
Computational Biology/methods; Sequence Alignment/methods; Algorithms; Amino Acid Sequence; Conserved Sequence; Molecular Sequence Data; Sequence Analysis/methods; Sequence Analysis, DNA/methods; Sequence Homology, Amino Acid; Software Validation
6.
In Silico Biol ; 1(1): 29-38, 1998.
Article in English | MEDLINE | ID: mdl-11471240

ABSTRACT

Tissue-specific gene expression is governed by enhancer and promoter sequences, which most probably determine specificity through their internal organization of transcription factor binding sites. In the case of muscle-specific gene expression, excellent compilations of sequence regions responsible for the tissue specificity are available. We took advantage of such a compilation in order to elucidate organizational features that are directly correlated with promoter specificity. We chose a systematic approach solely based on a sequence collection known to consist of specific regulatory regions, which can in principle be applied to every precompiled set of such sequences. We were able to show that these sequences contained a detectable subgroup (actin promoters) for which it was possible to construct a highly specific promoter model recognizing the majority of all known actin sequences. The model was robust with respect to different training sets, almost 100% specific and sensitive enough to be suitable for database searches. We believe this pilot study demonstrates the general applicability of our approach as well as the concept of modular promoter organization.


Subjects
Actins/genetics; Computer Simulation; Models, Genetic; Muscles/metabolism; Promoter Regions, Genetic; Animals; Binding Sites/genetics; Databases, Factual; Gene Expression; Humans; Phylogeny; Tissue Distribution; Transcription Factors/metabolism
7.
J Mol Biol ; 270(5): 674-87, 1997 Aug 01.
Article in English | MEDLINE | ID: mdl-9245596

ABSTRACT

Functional promoters are composed of individual modules (e.g. transcription factor binding sites, secondary structure elements, repeats) arranged in distinct patterns. Recognition of such patterns is essential for identification of promoters in non-coding sequences. However, this is difficult due to the absence of overall sequence similarity in promoters even if they are regulated in a similar way. We implemented simple formal representations of general features of regulatory regions into an algorithm capable of developing complex models reflecting both the element composition and the functional organization of individual elements (ModelGenerator). Though ModelGenerator requires a very simple initial model (e.g. two modules and their relative order) it will generate a much more sophisticated model by analysis of the training set of at least ten sequences. We show ModelGenerator to successfully model different retroviral long terminal repeat (LTR) classes (Lentivirus as well as avian and mammalian C-type) which contain functional promoters. Database searches with the program ModelInspector demonstrated the high specificity of these models and no apparent false negatives were detected. We also verified one match from GenBank to the mammalian C-type LTR model experimentally and showed this sequence to contain an active promoter. Thus, the concept of modular organization of functional regulatory DNA regions (e.g. promoters) could be successfully implemented into a set of computer tools which might be flexible and specific enough to be suitable for prospective analysis of new genomic DNA sequences.


Subjects
Algorithms; Databases, Factual; Promoter Regions, Genetic; Repetitive Sequences, Nucleic Acid; Retroviridae/genetics; Animals; Humans; Mammals; Models, Genetic; Transcription, Genetic; Tumor Cells, Cultured
8.
Genomics ; 43(1): 52-61, 1997 Jul 01.
Article in English | MEDLINE | ID: mdl-9226372

ABSTRACT

The current genome sequencing projects reveal megabases of unknown genomic sequences. About 1% of these sequences can be expected to be of retroviral origin. These are often severely deleted or mutated. Therefore, identification of the retroviral origin of these sequences can be very difficult due to the absence of convincing overall sequence similarity. There are also many copies of solo-LTRs (long terminal repeats) distributed throughout genomic sequences. LTR and envelope sequences in general are among the most divergent parts of the retroviral genome and thus especially hard to detect in mutated endogenous sequences. We took advantage of the fact that these retroviral sections contain short highly conserved sequence regions providing retroviral hallmarks even after loss of overall similarity. We defined several sequence elements and peptide motifs within LTR and Env sequences and used these elements to construct models for LTRs and Env proteins of mammalian C-type retroviruses. We then used this strategy to identify successfully the hitherto missing LTRs and an env-like region in the S71 human retroviral sequence. Our approach provides a new strategy for identifying remotely related retroviral sequences in genomic DNA (especially human DNA), of potential significance for the interpretation of genomic sequences obtained from the current large-scale sequencing projects.


Subjects
Genome, Human; Genome, Viral; Proviruses/genetics; Retroviridae/genetics; Amino Acid Sequence; Animals; Base Sequence; DNA/genetics; DNA/isolation & purification; DNA Primers/genetics; DNA, Viral/genetics; DNA, Viral/isolation & purification; Gammaretrovirus/genetics; Genes, env; Humans; Mice; Molecular Sequence Data; Polymerase Chain Reaction; Repetitive Sequences, Nucleic Acid; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid
11.
Comput Appl Biosci ; 13(1): 89-97, 1997 Feb.
Article in English | MEDLINE | ID: mdl-9088714

ABSTRACT

The detection of transcription control elements in DNA sequences became both more important and more complicated with the completion of the first full genome sequencing projects. Rapid evaluation of potential regulatory elements in large amounts of sequence data requires specific methods, preferably available as user-friendly computer programs. However, many more algorithms and methods have been published than programs are available, creating problems for scientists who try to select an appropriate method for their needs from the literature. The Internet provides worldwide and relatively easy access to computer software if the user knows where to look. One of the major problems remaining is how to find the appropriate software. We have compiled a guide detailing where software is available and what is to be expected in terms of interface and data compatibility with other programs. We also show results obtained with each program for several examples. The summarized features of each program should allow scientists to select quickly the method of their choice and inform them where to download the software.


Subjects
Genes, Regulator; Sequence Analysis, DNA/methods; Software; Algorithms; Computer Communication Networks; Databases, Factual; Evaluation Studies as Topic; Sequence Analysis, DNA/statistics & numerical data; User-Computer Interface
13.
Pac Symp Biocomput ; : 151-62, 1997.
Article in English | MEDLINE | ID: mdl-9390288

ABSTRACT

Transcriptional control regions are usually composed of a complex arrangement of individual transcriptional elements like protein binding sites. This modular structure allows generation of enormous functional diversity of regulatory regions with a limited set of individual elements. We implemented simple formal representations of these general features of regulatory regions into an algorithm capable of developing complex models reflecting both the element composition and the functional organization of individual elements. Our method (ModelGenerator) requires a training set of at least 10 sequences containing the regulatory regions to be modelled and a very simple initial model which may consist of just two characteristic transcription factor binding sites. We show the capability of our algorithm to expand the initial model solely by comparative sequence analysis, leading to complex, biologically meaningful models. A second program (ModelInspector) is capable of scanning new sequence data for matches to models defined by ModelGenerator. We show two models for retroviral transcriptional control regions to be highly specific. A search against GenBank using one of the models is shown to be free of false negatives and to produce fewer than 2 false positives/million nucleotides. Thus, our algorithms appear to be useful tools for the analysis of extremely long genomic sequences which are now becoming available as results of various genome sequencing projects.


Subjects
Computer Simulation; DNA/chemistry; DNA/genetics; Models, Molecular; Regulatory Sequences, Nucleic Acid; Transcription, Genetic; Base Sequence; Binding Sites; Consensus Sequence; DNA-Binding Proteins/metabolism; Gene Expression Regulation; Lentivirus/genetics; Molecular Sequence Data; Promoter Regions, Genetic; Repetitive Sequences, Nucleic Acid; Reproducibility of Results; Retroviridae/genetics; Sensitivity and Specificity; Software; TATA Box
14.
Virology ; 224(1): 256-67, 1996 Oct 01.
Article in English | MEDLINE | ID: mdl-8862420

ABSTRACT

Retroviruses are expressed under the control of viral control regions designated long terminal repeats (LTRs), which contain all signals for transcriptional initiation as well as transcriptional termination. However, retroviral LTRs from different species within a common genus, such as Lentivirus, do not show significant overall sequence homology. We compiled a model of the functional organization of 20 Lentivirus LTRs which we show to recognize all known Lentivirus LTRs. To this end we combined our previously published methods for identification of transcription elements with secondary structure element analysis in a novel modular approach. We deduced descriptions for three new Lentivirus-specific sequence elements present in most of the Lentivirus LTRs but absent in LTRs of other retrovirus families (B, C, D-type, BLV-HTLV, Spuma). Four of the 10 elements defined in our study were primate-specific. We were able to deduce a phylogeny based on our model which agrees in general with the phylogeny derived from the polymerase genes of these viruses. Our model indicated that more than 100 LTRs from the databases are of Lentivirus origin and can be clearly separated from all other LTR types (B, C, D, BLV-HTLV, Spuma). This selectivity appears to be a unique feature of our modular approach.


Subjects
Consensus Sequence; Lentivirus/genetics; Repetitive Sequences, Nucleic Acid; Animals; Base Sequence; DNA, Viral; Humans; Lentivirus/classification; Molecular Sequence Data; Phylogeny
15.
Comput Appl Biosci ; 12(1): 71-80, 1996 Feb.
Article in English | MEDLINE | ID: mdl-8670622

ABSTRACT

We present an algorithm to identify potential functional elements like protein binding sites in DNA sequences, solely from nucleotide sequence data. Prerequisites are a set of at least seven not closely related sequences with a common biological function which is correlated to one or more unknown sequence elements present in most but not necessarily all of the sequences. The algorithm is based on a search for n-tuples which occur at least in a minimum percentage of the sequences with no or one mismatch, which may be at any position of the tuple. In contrast to functional tuples, random tuples show no preferred pattern of mismatch locations within the tuple nor is the conservation extended beyond the tuple. Both features of functional tuples are used to eliminate random tuples. Selection is carried out by maximization of the information content first for the n-tuple, then for a region containing the tuple and finally for the complete binding site. Further matches are found in an additional selection step, using the ConsInd method previously described. The algorithm is capable of identifying and delimiting elements (e.g. protein binding sites) represented by single short cores (e.g. TATA box) in sets of unaligned sequences of about 500 nucleotides using no information other than the nucleotide sequences. Furthermore, we show its ability to identify multiple elements in a set of complete LTR sequences (more than 600 nucleotides per sequence).
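
The first step described above, collecting n-tuples that occur with at most one mismatch in a minimum percentage of the sequences, might look roughly like the following. This is a brute-force sketch of that step only; the later filtering of random tuples, the information-content maximization, and the ConsInd selection step are not reproduced, and the function names and toy sequences are invented for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def occurs_with_mismatch(candidate, sequence, max_mismatches=1):
    """True if some window of sequence matches candidate within max_mismatches."""
    n = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + n]) <= max_mismatches
        for i in range(len(sequence) - n + 1)
    )

def shared_tuples(sequences, n, min_fraction):
    """Map each n-tuple to the number of sequences containing it with at most
    one mismatch, keeping tuples found in at least min_fraction of sequences."""
    candidates = {
        seq[i:i + n] for seq in sequences for i in range(len(seq) - n + 1)
    }
    needed = min_fraction * len(sequences)
    result = {}
    for candidate in candidates:
        support = sum(occurs_with_mismatch(candidate, seq) for seq in sequences)
        if support >= needed:
            result[candidate] = support
    return result

# Toy data: two sequences carry a TATA-like element, the third does not.
toy_sequences = ["GCTATAAAGC", "CGTATATAGC", "GGCCGGCCGG"]
found = shared_tuples(toy_sequences, n=6, min_fraction=0.6)
print(found["TATAAA"])
```

Here `TATAAA` is supported by two of the three sequences (an exact match in the first, `TATATA` with one mismatch in the second), so it passes a 60% cutoff. Tuples drawn from the third sequence alone do not, which mirrors the abstract's point that the element need only be present in most, not all, of the sequences.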


Subjects
Algorithms; DNA/chemistry; DNA/genetics; Sequence Analysis, DNA/methods; Software; Base Sequence; Consensus Sequence; Conserved Sequence; Databases, Factual; Evaluation Studies as Topic; Molecular Sequence Data; Sequence Analysis, DNA/statistics & numerical data
16.
Nucleic Acids Res ; 23(23): 4878-84, 1995 Dec 11.
Article in English | MEDLINE | ID: mdl-8532532

ABSTRACT

The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
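
To make the contrast between IUPAC strings and nucleotide distribution matrices concrete, the sketch below derives both from the same set of aligned sites and computes a per-position information content. The abstract does not give the authors' weighting formula, so the standard Shannon measure in bits (2 minus the column entropy) is used here as an assumed stand-in, and the four aligned sites are invented.

```python
import math

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of observed bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G",
    frozenset("T"): "T", frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W", frozenset("GT"): "K",
    frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def column_counts(sites):
    """Per-position base counts for a list of equal-length aligned sites."""
    width = len(sites[0])
    return [
        {b: sum(site[i] == b for site in sites) for b in "ACGT"}
        for i in range(width)
    ]

def iupac_consensus(counts):
    """IUPAC string that records only which bases occur at each position."""
    return "".join(
        IUPAC[frozenset(b for b, c in column.items() if c > 0)]
        for column in counts
    )

def information_content(column):
    """Shannon information content of one column in bits (0 to 2)."""
    total = sum(column.values())
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in column.values() if c > 0
    )
    return 2.0 - entropy

# Hypothetical aligned binding sites, invented for this example.
toy_sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
counts = column_counts(toy_sites)
print(iupac_consensus(counts))
print([round(information_content(col), 2) for col in counts])
```

For these sites the IUPAC string is `TWTAWR`. The code `W` at the second position says only that A or T occurs, whereas the matrix keeps the 3:1 preference for A and assigns that position about 1.19 bits, against 2 bits for the invariant positions. That lost preference is the information the abstract says IUPAC strings necessarily remove.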


Subjects
Consensus Sequence; Sequence Analysis/methods; Software; Base Sequence; Conserved Sequence; Molecular Sequence Data
17.
J Comput Biol ; 1(3): 191-8, 1994.
Article in English | MEDLINE | ID: mdl-8790464

ABSTRACT

DNA sequences that are involved in the control of gene expression in eukaryotes have been collected in conjunction with the proteins binding to and acting through them (TRANSFAC data set). To make these data accessible, the TRANSFAC retrieval program (TRP) has been developed as a database management system which is based upon the network model. This database model possesses particular advantages for data management of a complex structure. The aim of TRP is to provide an easily handled statistical basis for a computational approach to transcription control.


Subjects
Database Management Systems; Databases, Factual; Regulatory Sequences, Nucleic Acid; Transcription Factors; Transcription, Genetic; Amino Acid Sequence; Base Sequence; User-Computer Interface
18.
Nucleic Acids Res ; 21(7): 1655-64, 1993 Apr 11.
Article in English | MEDLINE | ID: mdl-8479918

ABSTRACT

We present a method to determine the location and extent of protein binding regions in nucleic acids by computer-assisted analysis of sequence data. The program ConsIndex establishes a library of consensus descriptions based on sequence sets containing known regulatory elements. These defined consensus descriptions are used by the program ConsInspector to predict binding sites in new sequences. We show the programs to correctly determine the significant regions involved in transcriptional control of seven sequence elements. The internal profile of relative variability of individual nucleotide positions within these regions paralleled experimental profiles of biological significance. Consensus descriptions are determined by employing an anchored alignment scheme, the results of which are then evaluated by a novel method which is superior to cluster algorithms. The alignment procedure is able to include several closely related sequences without biasing the consensus description. Moreover, the algorithm detects additional elements on the basis of a moderate distance correlation and is capable of discriminating between real binding sites and false positive matches. The software is well suited to cope with the frequent phenomenon of optional elements present in a subset of functionally similar sequences, while taking maximal advantage of the existing sequence data base. Since it requires only a minimum of seven sequences for a single element, it is applicable to a wide range of binding sites.


Subjects
DNA-Binding Proteins/metabolism; Regulatory Sequences, Nucleic Acid; Sequence Alignment/methods; Software; Transcription Factors/metabolism; Algorithms; Amino Acid Sequence; Base Sequence; Binding Sites; Consensus Sequence; DNA-Binding Proteins/classification; Models, Genetic; Molecular Sequence Data; Pattern Recognition, Automated; Protein Binding; Transcription Factors/classification
19.
J Antimicrob Chemother ; 20(5): 729-34, 1987 Nov.
Article in English | MEDLINE | ID: mdl-3480885

ABSTRACT

The single dose pharmacokinetics of caffeine (220-230 mg per dose) were investigated in 12 healthy male volunteers before and during treatment with ofloxacin (200 mg bd), ciprofloxacin (250 mg bd) and enoxacin (400 mg bd) with a cross-over study design. None of the parameters: mean elimination half-life (T1/2el), Cmax, total body clearance (Cltot) and the volume of distribution (aVd) of caffeine were noticeably altered by administration of ofloxacin. Striking changes were observed, however, after administration of enoxacin: the T1/2el was prolonged by as much as 260%, the Cmax increased by 41%; the aVd was reduced by 20% and Cltot by 78% (mean values). Treatment with ciprofloxacin led to a prolongation of T1/2el by 15%, to a decrease of aVd by 25% and to a 33% decrease of Cltot. The results of this intra-individual comparison of caffeine pharmacokinetic data demonstrate that treatment with ciprofloxacin and enoxacin may have a significant inhibitory effect on caffeine elimination.
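
The reported changes can be cross-checked against the usual one-compartment relation between half-life, volume of distribution, and clearance, t1/2 = ln(2) x Vd / Cl. The sketch below assumes that relation holds and applies it to the mean percentage changes quoted in the abstract; since percentage changes of group means do not combine exactly, the result is a plausibility check rather than a reanalysis. The function name is made up for this example.

```python
def half_life_change(vd_change_pct, cl_change_pct):
    """Predicted percent change in half-life from percent changes in volume of
    distribution and clearance, assuming t1/2 = ln(2) * Vd / Cl."""
    ratio = (1 + vd_change_pct / 100) / (1 + cl_change_pct / 100)
    return (ratio - 1) * 100

# Mean changes quoted in the abstract (Vd change, clearance change).
enoxacin = half_life_change(vd_change_pct=-20, cl_change_pct=-78)
ciprofloxacin = half_life_change(vd_change_pct=-25, cl_change_pct=-33)
print(round(enoxacin, 1), round(ciprofloxacin, 1))
```

Under this assumption the predicted prolongation is about 264% for enoxacin and about 12% for ciprofloxacin, close to the 260% and 15% reported, so the three parameters in the abstract are mutually consistent.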


Subjects
Caffeine/pharmacokinetics; Quinolines/pharmacology; Adult; Anti-Infective Agents/pharmacology; Caffeine/blood; Chromatography, High Pressure Liquid; Ciprofloxacin/pharmacology; Drug Interactions; Enoxacin; Half-Life; Humans; Male; Naphthyridines/pharmacology; Ofloxacin; Oxazines/pharmacology