1.
Bioinformatics ; 16(9): 760-6, 2000 Sep.
Article in English | MEDLINE | ID: mdl-11108698

ABSTRACT

MOTIVATION: Database searching algorithms for proteins use scoring matrices based on average protein properties, and thus are dominated by globular proteins. However, since transmembrane regions of a protein are in a distinctly different environment than globular proteins, one would expect generalized substitution matrices to be inappropriate for transmembrane regions. RESULTS: We present the PHAT (predicted hydrophobic and transmembrane) matrix, which significantly outperforms generalized matrices and a previously published transmembrane matrix in searches with transmembrane queries. We conclude that a better matrix can be constructed by using background frequencies characteristic of the twilight zone, where low-scoring true positives have scores indistinguishable from high-scoring false positives, rather than the amino acid frequencies of the database. The PHAT matrix may help improve the accuracy of sequence alignments and evolutionary trees of membrane proteins.


Subjects
Algorithms; Computational Biology/methods; Membrane Proteins/genetics; Models, Theoretical; Sequence Alignment/methods; Amino Acid Sequence/genetics; Consensus Sequence/genetics; Databases, Factual; Predictive Value of Tests; Proteins/chemistry; Proteins/genetics; Reproducibility of Results; Sequence Homology, Amino Acid
2.
Electrophoresis ; 21(9): 1700-6, 2000 May.
Article in English | MEDLINE | ID: mdl-10870957

ABSTRACT

The most highly conserved regions of proteins can be represented as blocks of aligned sequence segments, typically with multiple blocks for a given protein family. The Blocks Database World Wide Web (http://blocks.fhcrc.org) and e-mail (blocks@blocks.fhcrc.org) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments. We describe features for detection of distant relationships using blocks. Blocks+ includes protein families from the PROSITE, Prints, Pfam-A, ProDom and Domo databases. Other features include searching Blocks+ with the BLIMPS and NCBI's IMPALA programs, sequence logos, phylogenetic trees, three-dimensional display of blocks on PDB structures, and a polymerase chain reaction (PCR) primer design strategy based on blocks.


Subjects
Databases, Factual; Proteins/analysis; Sequence Homology, Amino Acid; Amino Acid Sequence; Animals; DNA Primers; Humans; Molecular Sequence Data; Polymerase Chain Reaction/methods; Sequence Analysis, Protein
4.
Genome Res ; 10(4): 543-6, 2000 Apr.
Article in English | MEDLINE | ID: mdl-10779495

ABSTRACT

A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as query of the BLOCKS+ database of conserved regions of proteins. This led to functional assignments for more than one-third of the genes and two-thirds of the transposons. Considering the enormous size of the query, the fact that only two false-positive matches were reported emphasizes the high selectivity of protein family-based methods for gene finding. We used the search results to improve BLOCKS+ by identifying compositionally biased blocks. Our results confirm that protein family databases can be used effectively in automated sequence annotation efforts.


Subjects
Databases, Factual; Drosophila melanogaster/genetics; Genome; Software; Alcohol Dehydrogenase/genetics; Animals; Computational Biology; Drosophila melanogaster/enzymology; Genes, Insect/genetics; Sequence Homology, Nucleic Acid
5.
Nucleic Acids Res ; 28(1): 228-30, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592233

ABSTRACT

The Blocks Database WWW (http://blocks.fhcrc.org) and Email (blocks@blocks.fhcrc.org) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments, which represent conserved protein regions. Blocks+ nearly doubles the number of protein families included in the database by adding families from the Pfam-A, ProDom and Domo databases to those from PROSITE and PRINTS. Other new features include improved Block Searcher statistics, searching with NCBI's IMPALA program and 3D display of blocks on PDB structures.


Subjects
Databases, Factual; Proteins/chemistry; Amino Acid Sequence; Information Storage and Retrieval; Internet; Molecular Sequence Data; Sequence Homology, Amino Acid
6.
Bioinformatics ; 15(6): 471-9, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10383472

ABSTRACT

MOTIVATION: As databanks grow, sequence classification and prediction of function by searching protein family databases becomes increasingly valuable. The original Blocks Database, which contains ungapped multiple alignments for families documented in Prosite, can be searched to classify new sequences. However, Prosite is incomplete, and families from other databases are now available to expand coverage of the Blocks Database. RESULTS: To take advantage of protein family information present in several existing compilations, we have used five databases to construct Blocks+, a unified database that is built on the PROTOMAT/BLOSUM scoring model and that can be searched using a single algorithm for consistent sequence classification. The LAMA blocks-versus-blocks searching program identifies overlapping protein families, making possible a non-redundant hierarchical compilation. Blocks+ consists of all blocks derived from PROSITE, blocks from Prints not present in PROSITE, blocks from Pfam-A not present in PROSITE or Prints, and so on for ProDom and Domo, for a total of 1995 protein families represented by 8909 blocks, doubling the coverage of the original Blocks Database. A challenge for any procedure aimed at non-redundancy is to retain related but distinct families while discarding those that are duplicates. We illustrate how using multiple compilations can minimize this potential problem by examining the SNF2 family of ATPases, which is detectably similar to distinct families of helicases and ATPases. AVAILABILITY: http://blocks.fhcrc.org/
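The hierarchical compilation described in this abstract (take everything from the first source, then from each later source only the families not already covered) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Blocks+ build procedure: families are modeled as sets of hypothetical sequence identifiers, and the `shares_member` test is a simple stand-in for the blocks-versus-blocks comparison that LAMA performs.

```python
def compile_nonredundant(sources, overlaps):
    """Merge family lists in priority order, skipping overlapping families.

    sources  -- list of (source_name, [family, ...]) in priority order
    overlaps -- function(family_a, family_b) -> bool (assumed overlap test)
    """
    kept = []
    for source_name, families in sources:
        # Compare only against families accepted from earlier sources.
        earlier = [family for _, family in kept]
        for family in families:
            if not any(overlaps(family, existing) for existing in earlier):
                kept.append((source_name, family))
    return kept


def shares_member(family_a, family_b):
    # Hypothetical stand-in for a LAMA-style comparison: two families
    # overlap if they have at least one sequence identifier in common.
    return bool(family_a & family_b)


# Toy data; the identifiers s1..s9 are invented for illustration.
sources = [
    ("PROSITE", [frozenset({"s1", "s2"}), frozenset({"s3"})]),
    ("Prints", [frozenset({"s2", "s9"}), frozenset({"s4"})]),
    ("Pfam-A", [frozenset({"s4", "s5"}), frozenset({"s6"})]),
]
compiled = compile_nonredundant(sources, shares_member)
kept_sources = [name for name, _ in compiled]
```

Comparing each family only against earlier sources, rather than against its own source, mirrors the wording "blocks from Prints not present in PROSITE". A stricter or looser overlap test changes how many related but distinct families survive, which is the trade-off the abstract raises for the SNF2 family.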


Subjects
Databases, Factual; Nuclear Proteins; Proteins/chemistry; Sequence Alignment/methods; Adenosine Triphosphatases/chemistry; Adenosine Triphosphatases/genetics; Algorithms; Amino Acid Sequence; Animals; Computational Biology; DNA Helicases; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; Humans; Molecular Sequence Data; Sequence Alignment/statistics & numerical data; Sequence Homology, Amino Acid; Software; Software Design; Transcription Factors/chemistry; Transcription Factors/genetics
7.
Nucleic Acids Res ; 27(1): 226-8, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847186

ABSTRACT

Blocks are ungapped multiple sequence alignments representing conserved protein regions, and the Blocks Database consists of blocks from documented protein families. World Wide Web (http://www.blocks.fhcrc.org) and Email (blocks@blocks.fhcrc.org) servers provide tools for homology searching and for analyzing protein family relationships. New enhancements include a multiple alignment processor that extends the use of these tools to imported multiple alignments of families not present in the database and a PCR primer designer that implements a new strategy for gene isolation.


Subjects
Databases, Factual; Proteins/classification; Sequence Alignment; Software; DNA Primers/genetics; Information Storage and Retrieval; Internet; Polymerase Chain Reaction/methods; Proteins/chemistry; Proteins/genetics; Sequence Homology, Amino Acid
8.
Nucleic Acids Res ; 26(7): 1628-35, 1998 Apr 01.
Article in English | MEDLINE | ID: mdl-9512532

ABSTRACT

We describe a new primer design strategy for PCR amplification of unknown targets that are related to multiply-aligned protein sequences. Each primer consists of a short 3' degenerate core region and a longer 5' consensus clamp region. Only 3-4 highly conserved amino acid residues are necessary for design of the core, which is stabilized by the clamp during annealing to template molecules. During later rounds of amplification, the non-degenerate clamp permits stable annealing to product molecules. We demonstrate the practical utility of this hybrid primer method by detection of diverse reverse transcriptase-like genes in a human genome, and by detection of C5 DNA methyltransferase homologs in various plant DNAs. In each case, amplified products were sufficiently pure to be cloned without gel fractionation. This COnsensus-DEgenerate Hybrid Oligonucleotide Primer (CODEHOP) strategy has been implemented as a computer program that is accessible over the World Wide Web (http://blocks.fhcrc.org/codehop.html) and is directly linked from the BlockMaker multiple sequence alignment site for hybrid primer prediction beginning with a set of related protein sequences.
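The primer layout in this abstract (a non-degenerate 5' clamp followed by a short 3' degenerate core) can be illustrated with a small sketch. It is not the CODEHOP program: the codon table is deliberately partial, the function names are invented for this example, and the clamp simply takes the first codon listed for each residue as a placeholder, whereas a real design would choose clamp codons from the aligned sequences or a codon usage table.

```python
from math import prod

# Partial codon table (standard genetic code), limited to residues whose
# codons differ only at the third position so that a position-wise
# degenerate codon matches exactly the codons for that residue.
CODONS = {
    "W": ["TGG"],
    "M": ["ATG"],
    "D": ["GAT", "GAC"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}

# IUPAC nucleotide ambiguity codes keyed by the set of bases they denote.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_core(residues):
    """Encode each residue as one degenerate codon (3' core)."""
    core = []
    for aa in residues:
        for pos in range(3):
            bases = frozenset(codon[pos] for codon in CODONS[aa])
            core.append(IUPAC[bases])
    return "".join(core)


def degeneracy(residues):
    """Number of distinct sequences the degenerate core represents."""
    return prod(len(CODONS[aa]) for aa in residues)


def consensus_clamp(residues):
    """Non-degenerate 5' clamp; first listed codon is a placeholder choice."""
    return "".join(CODONS[aa][0] for aa in residues)


def codehop_primer(clamp_residues, core_residues):
    """Hybrid primer written 5' to 3': clamp followed by degenerate core."""
    return consensus_clamp(clamp_residues) + degenerate_core(core_residues)


primer = codehop_primer("GAEF", "WDKM")
core_degeneracy = degeneracy("WDKM")
```

Keeping the degenerate core to three or four conserved residues is what keeps `core_degeneracy` small; the clamp adds length for stable annealing without multiplying the number of primer species in the pool.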


Subjects
DNA Modification Methylases/chemistry; DNA Primers; Evolution, Molecular; Phylogeny; RNA-Directed DNA Polymerase/chemistry; Amino Acid Sequence; Animals; Arthritis, Rheumatoid/genetics; Base Sequence; Codon; Computer Communication Networks; Consensus Sequence; Conserved Sequence; DNA Modification Methylases/genetics; Humans; Molecular Sequence Data; Nucleic Acid Hybridization; Polymerase Chain Reaction/methods; RNA-Directed DNA Polymerase/genetics; Sarcoma, Kaposi/genetics; Sequence Alignment; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid; Software
10.
Nucleic Acids Res ; 26(1): 309-12, 1998 Jan 01.
Article in English | MEDLINE | ID: mdl-9399861

ABSTRACT

The Blocks Database World Wide Web (http://www.blocks.fhcrc.org) and Email (blocks@blocks.fhcrc.org) servers provide tools for the detection and analysis of protein homology based on alignment blocks representing conserved regions of proteins. During the past year, searching has been augmented by supplementation of the Blocks Database with blocks from the Prints Database, for a total of 4754 blocks from 1163 families. Blocks from both the Blocks and Prints Databases and blocks that are constructed from sequences submitted to Block Maker can be used for blocks-versus-blocks searching of these databases with LAMA, and for viewing logos and bootstrap trees. Sensitive searches of up-to-date protein sequence databanks are carried out via direct links to the MAST server using position-specific scoring matrices and to the BLAST and PSI-BLAST servers using consensus-embedded sequence queries. Utilizing the trypsin family to evaluate performance, we illustrate the superiority of blocks-based tools over expert pairwise searching or Hidden Markov Models.


Subjects
Computer Communication Networks; Databases, Factual; Proteins/chemistry; Sequence Homology, Amino Acid; Amino Acid Sequence; Animals; Conserved Sequence; Humans
11.
Protein Sci ; 6(3): 698-705, 1997 Mar.
Article in English | MEDLINE | ID: mdl-9070452

ABSTRACT

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain.
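The consensus-embedding idea in this abstract can be shown on toy data: a representative sequence keeps its own residues outside the conserved region, while the region covered by a block is overwritten with the block's consensus. The sketch below is only an illustration; the function names, the toy sequence and block, and the majority-vote consensus rule are assumptions, and the PSSM-embedded variant that the abstract found best overall is not covered.

```python
from collections import Counter


def column_consensus(block):
    """Most frequent residue in each column of an ungapped block."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*block)
    )


def embed_consensus(sequence, start, block):
    """Replace the block-covered region of `sequence` with consensus residues.

    start -- 0-based offset of the block within the representative sequence
    """
    consensus = column_consensus(block)
    end = start + len(consensus)
    return sequence[:start] + consensus + sequence[end:]


# Toy representative sequence and a three-row block aligned at offset 3.
representative = "MKTAFLAKQR"
block = ["AFLA", "AYIA", "AYIA"]
query = embed_consensus(representative, 3, block)
```

The resulting `query` is still an ordinary single sequence, which is why this variant works with single-sequence search programs such as BLAST and FASTA.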


Subjects
Proteins/chemistry; Sequence Alignment; Algorithms; Amino Acid Sequence; Consensus Sequence; Evaluation Studies as Topic; Molecular Sequence Data
12.
Nucleic Acids Res ; 25(1): 222-5, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016540

ABSTRACT

The Blocks Database contains multiple alignments of conserved regions in protein families which can be searched by e-mail (blocks@blocks.fhcrc.org) and World Wide Web (http://blocks.fhcrc.org/) servers to classify protein and nucleotide sequences. Recent enhancements to the servers include: (i) improved calculation of position-specific scoring matrices from blocks; (ii) availability of the Prints protein fingerprint database for searching in Blocks format; (iii) a representative sequence biased towards the Blocks of a protein family; (iv) a tree constructed from the Blocks of a protein family; (v) links to related World Wide Web pages for a family; and (vi) the new Local Alignment of Multiple Alignments (LAMA) method to search a block against a database of blocks.


Subjects
Databases, Factual; Proteins/genetics; Sequence Alignment/methods; Amino Acid Sequence; Animals; Base Sequence; Computer Communication Networks; Humans; Molecular Sequence Data
13.
Ann Intern Med ; 124(11): 970-9, 1996 Jun 01.
Article in English | MEDLINE | ID: mdl-8624064

ABSTRACT

OBJECTIVE: To determine whether increasing age is associated with an increased risk for bleeding during warfarin treatment. DESIGN: Combined retrospective and prospective cohort studies. SETTING: 6 anticoagulation clinics. PATIENTS: 2376 patients receiving warfarin for various indications. MEASUREMENTS: Bleeding events categorized as minor (resulting in no costs or consequences), serious (requiring testing or treatment), life-threatening, or fatal. RESULTS: 812 first bleeding events (4 fatal, 33 life-threatening, 222 serious, and 553 minor) occurred during 3702 patient-years. Age was inversely related to the mean warfarin dose and dose-adjusted prothrombin time ratio. The unadjusted incidence of minor bleeding complications did not vary according to age group: 18.0 per 100 patient-years for patients younger than 50 years of age, 21.5 for patients 50 to 59 years of age, 24.0 for patients 60 to 69 years of age, 23.5 for patients 70 to 79 years of age, and 16.3 for patients 80 years of age and older. The unadjusted incidence of serious bleeding complications also did not vary according to age group: 9.3 per 100 patient-years for patients younger than 50 years of age, 7.1 for patients 50 to 59 years of age, 6.6 for patients 60 to 69 years of age, 5.1 for patients 70 to 79 years of age, and 4.4 for patients 80 years of age and older. The unadjusted incidence of life-threatening or fatal complications combined was significantly higher among the oldest patients: 0.75 per 100 patient-years for patients younger than 50 years of age, 0.97 for patients 50 to 59 years of age, 1.10 for patients 60 to 69 years of age, 0.68 for patients 70 to 79 years of age, and 3.38 for patients 80 years of age and older. Patients 80 years of age and older had a relative risk of 4.5 (95% CI, 1.3 to 15.6) compared with patients younger than 50 years of age. After adjustment for the intensity of anticoagulation therapy and the deviation in the prothrombin time ratio using Cox and Poisson regression, age was not generally associated with the occurrence of bleeding; relative risk estimates ranged from 0.99 to 1.03 per year of age (lower-bound 95% CI, 0.97 to 1.01; upper-bound 95% CI, 1.00 to 1.09). The single exception was life-threatening and fatal complications in patients 80 years of age or older (relative risk, 4.6 [CI, 1.2 to 18.1]). CONCLUSIONS: Age did not appear to be an important determinant of risk for bleeding in patients receiving warfarin, with the possible exception of age 80 years or older. The intensity of anticoagulation therapy and the deviation in the prothrombin time ratio were much stronger predictors of risk for bleeding.
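The rates in this abstract are expressed per 100 patient-years, and the unadjusted relative risk is the ratio of two such rates. The short sketch below reworks that arithmetic from the published figures only. It does not reproduce the confidence intervals or the Cox and Poisson adjustments, which need the patient-level data, and the function names are chosen for this example.

```python
def rate_per_100_patient_years(events, patient_years):
    """Crude incidence rate, expressed per 100 patient-years."""
    return 100.0 * events / patient_years


def relative_risk(rate_exposed, rate_reference):
    """Ratio of two incidence rates measured in the same units."""
    return rate_exposed / rate_reference


# Overall crude rate of first bleeding events of any severity,
# from 812 events over 3702 patient-years.
overall_rate = rate_per_100_patient_years(812, 3702)

# Life-threatening or fatal complications: 80 years and older (3.38)
# versus younger than 50 years (0.75), both per 100 patient-years.
unadjusted_rr = relative_risk(3.38, 0.75)
print(round(overall_rate, 1), round(unadjusted_rr, 1))
```

The ratio of the two published rates rounds to 4.5, matching the unadjusted relative risk reported for the oldest group.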


Subjects
Anticoagulants/adverse effects; Hemorrhage/chemically induced; Warfarin/adverse effects; Age Factors; Aged; Aged, 80 and over; Anticoagulants/administration & dosage; Female; Humans; Male; Middle Aged; Poisson Distribution; Prospective Studies; Prothrombin Time; Regression Analysis; Retrospective Studies; Risk Factors; Warfarin/administration & dosage
14.
Comput Appl Biosci ; 12(2): 135-43, 1996 Apr.
Article in English | MEDLINE | ID: mdl-8744776

ABSTRACT

Each column of amino acids in a multiple alignment of protein sequences can be represented as a vector of 20 amino acid counts. For alignment and searching applications, the count vector is an imperfect representation of a position, because the observed sequences are an incomplete sample of the full set of related sequences. One general solution to this problem is to model unobserved sequences by adding artificial 'pseudo-counts' to the observed counts. We introduce a simple method for computing pseudo-counts that combines the diversity observed in each alignment position with amino acid substitution probabilities. In extensive empirical tests, this position-based method out-performed other pseudo-count methods and was a substantial improvement over the traditional average score method used for constructing profiles.
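One way to picture pseudo-counts that combine column diversity with substitution probabilities is sketched below. The abstract does not give the formula, so everything specific here is an assumption made for illustration: the total pseudo-count is taken as a tunable `scale` times the number of distinct residues observed, it is distributed according to how likely each residue is to substitute for the observed ones, and the three-letter alphabet with its substitution probabilities is invented.

```python
def add_pseudo_counts(counts, subst_prob, scale=5.0):
    """Estimate residue probabilities for one alignment column.

    counts     -- dict residue -> observed count in the column
    subst_prob -- subst_prob[a][b] = assumed probability that residue a is
                  replaced by residue b; each row sums to 1
    scale      -- assumed multiplier linking diversity to total pseudo-counts
    """
    residues = list(subst_prob)
    total_observed = sum(counts.get(a, 0) for a in residues)
    # Diversity: how many different residues actually occur in the column.
    diversity = sum(1 for a in residues if counts.get(a, 0) > 0)
    total_pseudo = scale * diversity

    probabilities = {}
    for b in residues:
        # Share of the pseudo-counts given to b, weighted by what was observed.
        pseudo_b = total_pseudo * sum(
            (counts.get(a, 0) / total_observed) * subst_prob[a][b]
            for a in residues
        )
        probabilities[b] = (counts.get(b, 0) + pseudo_b) / (
            total_observed + total_pseudo
        )
    return probabilities


# Invented three-residue alphabet and substitution probabilities.
toy_subst = {
    "A": {"A": 0.80, "V": 0.15, "D": 0.05},
    "V": {"A": 0.20, "V": 0.75, "D": 0.05},
    "D": {"A": 0.10, "V": 0.10, "D": 0.80},
}
column_probs = add_pseudo_counts({"A": 4}, toy_subst)
```

In the toy column only A is observed, yet V receives more probability than D because V is the likelier substitute for A. A more diverse column would receive a larger total pseudo-count under this scheme.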


Subjects
Sequence Alignment/methods; Amino Acid Sequence; Computers; Databases, Factual; Evaluation Studies as Topic; Odds Ratio; Probability; Proteins/chemistry; Proteins/genetics; Sequence Alignment/statistics & numerical data
15.
Nucleic Acids Res ; 24(1): 197-200, 1996 Jan 01.
Article in English | MEDLINE | ID: mdl-8594578

ABSTRACT

The Blocks Database contains multiple alignments of conserved regions in protein families. The database can be searched by e-mail and World Wide Web (WWW) servers (http://blocks.fhcrc.org/help) to classify protein and nucleotide sequences.


Subjects
Databases, Factual; Proteins/chemistry; Amino Acid Sequence; Computer Communication Networks; Information Storage and Retrieval; Molecular Sequence Data; Proteins/genetics; Sequence Homology, Amino Acid
16.
Methods Enzymol ; 266: 88-105, 1996.
Article in English | MEDLINE | ID: mdl-8743679

ABSTRACT

Protein blocks consist of multiply aligned sequence segments without gaps that represent the most highly conserved regions of protein families. A database of blocks has been constructed by successive application of the fully automated PROTOMAT system to lists of protein family members obtained from Prosite documentation. Currently, Blocks 8.0 based on protein families documented in Prosite 12 consists of 2884 blocks representing 770 families. Searches of the Blocks Database are carried out using protein or DNA sequence queries, and results are returned with measures of significance for both single and multiple block hits. The database has also proved useful for derivation of amino acid substitution matrices (the Blosum series) and other sets of parameters. WWW and E-mail servers provide access to the database and associated functions, including a block maker for sequences provided by the user.


Subjects
Amino Acid Sequence; Base Sequence; DNA/chemistry; Databases, Factual; Oxidoreductases; Proteins/chemistry; Sequence Homology, Amino Acid; Computer Communication Networks; Conserved Sequence; Glutaredoxins; Molecular Sequence Data; Proteins/genetics; Saccharomyces cerevisiae; Software
17.
Gene ; 163(2): GC17-26, 1995 Oct 03.
Article in English | MEDLINE | ID: mdl-7590261

ABSTRACT

Protein blocks consist of multiply aligned sequence segments that correspond to the most highly conserved regions of protein families. Typically, a set of related proteins has more than one region in common and their relationship can be represented as a series of ungapped blocks separated by unaligned regions. Blockmaker is an automated system available by electronic mail (blockmaker@howard.fhcrc.org) and the World Wide Web (http://www.blocks.fhcrc.org/) that finds blocks in a group of related protein sequences submitted by the user. It adapts and extends existing algorithms to make them useful to biologists looking for conserved regions in a group of related protein sequences. Two sets of blocks are returned, one in which candidate blocks are detected using the MOTIF algorithm and the other using a Gibbs sampler algorithm that has been adapted for full automation. This use of two block-finding methods based on completely different principles provides a 'reality check,' whereby a block detected by both methods is considered to be correct. Resulting blocks can be displayed using the information-based 'sequence logo' method, adapted to incorporate sequence weights, which provides an intuitive visual description of both the residue and the conservation information at each position. Blocks generated by this system are useful in diverse applications, such as searching databases and designing degenerate PCR primers. As an example, blocks made from amino acid sequences related to Caenorhabditis elegans Tc1 transposase were used to search GenBank, revealing that several fish and amphibian genomic sequences harbor previously unreported Tc1 homologs.


Subjects
Amino Acid Sequence; Computer Graphics; Databases, Factual; Software Design; Transposases; Algorithms; Animals; Caenorhabditis elegans/enzymology; DNA-Binding Proteins/chemistry; Information Storage and Retrieval; Molecular Sequence Data; Nucleotidyltransferases/chemistry; Proteins/chemistry; Sequence Alignment
18.
J Mol Biol ; 243(4): 574-8, 1994 Nov 04.
Article in English | MEDLINE | ID: mdl-7966282

ABSTRACT

Sequence weighting methods have been used to reduce redundancy and emphasize diversity in multiple sequence alignment and searching applications. Each of these methods is based on a notion of distance between a sequence and an ancestral or generalized sequence. We describe a different approach, which bases weights on the diversity observed at each position in the alignment, rather than on a sequence distance measure. These position-based weights make minimal assumptions, are simple to compute, and perform well in comprehensive evaluations.
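Weights based on per-position diversity can be computed directly from the alignment columns. The sketch below uses a widely cited formulation of position-based weighting: at each column with k distinct residues, a sequence receives 1/(k × n), where n is the number of sequences sharing its residue there. The abstract itself does not spell out a formula, so treat this as an assumed reading rather than a transcription of the paper; the toy alignment is for illustration only.

```python
from collections import Counter


def position_based_weights(alignment):
    """Return one weight per sequence, normalized to sum to 1.

    alignment -- list of equal-length, ungapped sequences
    """
    weights = [0.0] * len(alignment)
    for column in zip(*alignment):
        residue_counts = Counter(column)
        k = len(residue_counts)  # number of distinct residues in the column
        for i, residue in enumerate(column):
            # Residues shared by many sequences contribute less per sequence.
            weights[i] += 1.0 / (k * residue_counts[residue])
    total = sum(weights)
    return [w / total for w in weights]


toy_alignment = ["GYVGS", "GFDGF", "GYDGF", "GYQGG"]
seq_weights = position_based_weights(toy_alignment)
```

Each column distributes a total of 1 across the sequences, so no pairwise distances or trees are needed. In the toy alignment the third sequence, which never carries a residue unique to it, ends up with the lowest weight.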


Subjects
Sequence Alignment; Amino Acid Sequence; Base Sequence; Computer Simulation; Databases, Factual; Genetic Variation; Molecular Sequence Data
19.
J Gen Intern Med ; 9(3): 131-9, 1994 Mar.
Article in English | MEDLINE | ID: mdl-8195911

ABSTRACT

OBJECTIVE: To evaluate a computerized scheduling model that employs nonlinear optimization to recommend optimal follow-up intervals for patients taking warfarin. DESIGN: Randomized trial. SETTING: 5 anticoagulation clinics. PATIENTS/PARTICIPANTS: 620 patients expected to receive warfarin for ≥ 6 weeks. INTERVENTIONS: Computer-generated recommendations for scheduling the next visit were presented to or withheld from practitioners. MEASUREMENTS AND MAIN RESULTS: The main outcome measures were the follow-up interval scheduled by the provider, the interval at which the patient actually returned to clinic, and the quality of anticoagulation control (computed as the absolute difference between the measured and target prothrombin times [PTRs] or international normalized ratios [INRs]). Follow-up intervals scheduled for the patients whose practitioners received computer-generated recommendations were significantly longer than those for control patients (mean, 4.4 vs 3.5 weeks, p < 0.001), despite the fact that the practitioners modified the suggested return interval by > 1 week on 40% of the visits. The interval at which the intervention group actually returned to clinic was also longer (mean, 4.4 vs 4.1 weeks, p < 0.05), even though the control patients tended to return at longer intervals than were scheduled by their practitioners. Control of anticoagulation was nearly the same among experimental and control patients. Life-threatening complications occurred in the care of three experimental patients and one control patient, while other serious complications occurred in the care of 16 experimental patients and 17 control patients. CONCLUSIONS: Recommendations based on nonlinear optimization prompted clinicians to schedule less frequent follow-up for patients taking warfarin, with no deterioration in anticoagulation control. This approach to scheduling can potentially reduce utilization while maintaining quality of care for patients who require long-term monitoring.


Subjects
Appointments and Schedules; Drug Therapy, Computer-Assisted; Monitoring, Physiologic/methods; Warfarin/therapeutic use; Continuity of Patient Care/standards; Female; Follow-Up Studies; Humans; Male; Middle Aged; Prothrombin Time
20.
Genomics ; 19(1): 97-107, 1994 Jan 01.
Article in English | MEDLINE | ID: mdl-8188249

ABSTRACT

The most highly conserved regions of proteins can be represented as "blocks" of locally aligned sequence segments. Previously, an automated system was introduced to generate a database of blocks that is searched for local similarities using a sequence query. Here, we describe a method for searching this database that can also reveal significant global similarities. Local and global alignments are scored independently, so they can be used in concert to infer homology. A set of 7082 diverse sequences not represented in the database provided queries for testing this approach. The resulting distributions of scores led to guidelines for interpretation of search data and to the classification of 289 uncatalogued sequences into known groups. Thirty-eight of these relationships appear to be new discoveries. We also show how searching a database of blocks can be used to detect repeated domains and to find distinct cross-family relationships that were missed in searches of sequence databases.


Subjects
Databases, Factual; Proteins/classification; Sequence Alignment; Sequence Homology, Amino Acid; Animals; DNA Helicases/chemistry; Mammals/genetics; Proteins/chemistry; Repetitive Sequences, Nucleic Acid; Saccharomyces cerevisiae/genetics; Software