1.
Bioinformatics ; 18(11): 1500-7, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12424122

ABSTRACT

MOTIVATION: No general theory guides the selection of gap penalties for local sequence alignment. We empirically determined the most effective gap penalties for protein sequence similarity searches with substitution matrices over a range of target evolutionary distances from 20 to 200 Point Accepted Mutations (PAMs). RESULTS: We embedded real and simulated homologs of protein sequences into a database and searched the database to determine the gap penalties that produced the best statistical significance for the distant homologs. The most effective penalty for the first residue in a gap (q+r) changes as a function of evolutionary distance, while the gap extension penalty for additional residues (r) does not. For these data, the optimal gap penalties for a given matrix scaled in 1/3 bit units (e.g. BLOSUM50, PAM200) are q=25-0.1 * (target PAM distance), r=5. Our results provide an empirical basis for selection of gap penalties and demonstrate how optimal gap penalties behave as a function of the target evolutionary distance of the substitution matrix. These gap penalties can improve expectation values by at least one order of magnitude when searching with short sequences, and improve the alignment of proteins containing short sequences repeated in tandem.
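The rule reported in the abstract above can be written as a small calculation. The following is a minimal sketch that assumes penalties are expressed as positive magnitudes in 1/3-bit units; the function names `gap_penalties` and `gap_cost` are illustrative and do not come from the paper or from any alignment package.

```python
def gap_penalties(target_pam: float) -> dict:
    """Apply the abstract's empirical rule: q = 25 - 0.1 * PAM, r = 5.

    Values are positive magnitudes in 1/3-bit units (an assumption about
    sign convention; alignment programs often take them as negatives).
    """
    if not 20 <= target_pam <= 200:
        # The abstract reports the rule only for 20-200 PAMs.
        raise ValueError("rule was determined for target distances of 20-200 PAMs")
    q = 25 - 0.1 * target_pam   # changes with evolutionary distance
    r = 5                       # extension penalty, constant
    return {"q": q, "r": r, "first_residue": q + r}


def gap_cost(length: int, target_pam: float) -> float:
    """Total cost of a gap of `length` residues: q + r * length."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    p = gap_penalties(target_pam)
    return p["q"] + p["r"] * length


if __name__ == "__main__":
    for pam in (20, 120, 200):
        print(pam, gap_penalties(pam))
```

With a one-residue gap, `gap_cost` equals the first-residue penalty q+r, which is the quantity the abstract says varies with evolutionary distance.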


Subjects
Databases, Protein; Evolution, Molecular; Genetic Variation/genetics; Sequence Alignment/methods; Sequence Analysis, Protein; Amino Acid Sequence; Computer Simulation; Molecular Sequence Data; Point Mutation/genetics; Quality Control; Sequence Deletion/genetics; Sequence Homology, Amino Acid
3.
Methods Mol Biol ; 132: 185-219, 2000.
Article in English | MEDLINE | ID: mdl-10547837

ABSTRACT

The FASTA3 and FASTA2 packages provide a flexible set of sequence-comparison programs that are particularly valuable because of their accurate statistical estimates and high-quality alignments. Traditionally, sequence similarity searches have sought to ask one question: "Is my query sequence homologous to anything in the database?" Both FASTA and BLAST can provide reliable answers to this question with their statistical estimates; if the expectation value E is < 0.001-0.01 and you are not doing hundreds of searches a day, the answer is probably yes. In general, the most effective search strategies follow these rules:

   1. Whenever possible, compare at the amino acid level, rather than the nucleotide level. Search first with protein sequences (blastp, fasta3, and ssearch3), then with translated DNA sequences (fastx, blastx), and only at the DNA level as a last resort (Table 5).
   2. Search the smallest database that is likely to contain the sequence of interest (but it must contain many unrelated sequences for accurate statistical estimates).
   3. Use sequence statistics, rather than percent identity or percent similarity, as your primary criterion for sequence homology.
   4. Check that the statistics are likely to be accurate by looking for the highest-scoring unrelated sequence, using prss3 to confirm the expectation, and searching with shuffled copies of the query sequence [randseq; searches with shuffled sequences should have E ≈ 1.0].
   5. Consider searches with different gap penalties and other scoring matrices. Searches with long query sequences against full-length sequence libraries will not change dramatically when BLOSUM62 is used instead of BLOSUM50 (20), or a gap penalty of -14/-2 is used in place of -12/-2. However, shallower or more stringent scoring matrices are more effective at uncovering relationships in partial sequences (3,18), and they can be used to sharpen dramatically the scope of the similarity search.

However, as illustrated in the last section, the E value is only the first step in characterizing a sequence relationship. Once one has confidence that the sequences are homologous, one should look at the sequence alignments and percent identities, particularly when searching with lower quality sequences. When sequence alignments are very short, the alignment should become more significant when a shallower scoring matrix is used, e.g., BLOSUM62 rather than BLOSUM50 (remember to change the gap penalties). Homology can be reliably inferred from statistically significant similarity. Whereas homology implies common three-dimensional structure, homology need not imply common function. Orthologous sequences usually have similar functions, but paralogous sequences often acquire very different functional roles. Motif databases, such as PROSITE (21), can provide evidence for the conservation of critical functional residues. However, motif identity in the absence of overall sequence similarity is not a reliable indicator of homology.


Subjects
Database Management Systems; Information Storage and Retrieval; Sequence Alignment/methods; Amino Acid Sequence; Evolution, Molecular; Molecular Sequence Data; Sequence Homology, Amino Acid
4.
J Mol Biol ; 291(4): 977-95, 1999 Aug 27.
Article in English | MEDLINE | ID: mdl-10452901

ABSTRACT

The relationship between sequence similarity and structural similarity has been examined in 36 protein families with five or more diverse members whose structures are known. The structural similarity within a family (as determined with the DALI structure comparison program) is linearly related to sequence similarity (as determined by a Smith-Waterman search of the protein sequences in the structure database). The correlation between structural similarity and sequence similarity is very high; 18 of the 36 families had linear correlation coefficients r ≥ 0.878, and only nine had correlation coefficients r

Subjects
Evolution, Molecular; Proteins/chemistry; Proteins/genetics; Amino Acid Sequence; Animals; Mutation; Protein Conformation; Protein Folding; Proteins/classification; Regression Analysis; Sequence Homology, Amino Acid
5.
Mol Biol Evol ; 16(6): 806-16, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10368958

ABSTRACT

We have developed a phylogenetic tree reconstruction method that detects and reports multiple topologically distant low-cost solutions. Our method is a generalization of the neighbor-joining method of Saitou and Nei and affords a more thorough sampling of the solution space by keeping track of multiple partial solutions during its execution. The scope of the solution space sampling is controlled by a pair of user-specified parameters--the total number of alternate solutions and the number of alternate solutions that are randomly selected--effecting a smooth trade-off between run time and solution quality and diversity. This method can discover topologically distinct low-cost solutions. In tests on biological and synthetic data sets using either the least-squares distance or minimum-evolution criterion, the method consistently performed as well as, or better than, both the neighbor-joining heuristic and the PHYLIP implementation of the Fitch-Margoliash distance measure. In addition, the method identified alternative tree topologies with costs within 1% or 2% of the best, but with topological distances of 9 or more partitions from the best solution (16 taxa); with 32 taxa, topologies were obtained 17 (least-squares) and 22 (minimum-evolution) partitions from the best topology when 200 partial solutions were retained. Thus, the method can find lower-cost tree topologies and near-best tree topologies that are significantly different from the best topology.
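To illustrate the idea of retaining several partial solutions, the sketch below shows only the pair-selection step of standard neighbor-joining (the Saitou and Nei Q criterion) and returns the k lowest-scoring candidate pairs instead of the single best one. It is not the authors' algorithm: the function name `nj_candidate_pairs`, the parameter `k`, and the example distance matrix are assumptions made for illustration, and the random selection of alternates described in the abstract is not modeled.

```python
import heapq


def nj_candidate_pairs(dist, k=1):
    """Return the k (Q, i, j) tuples with the lowest neighbor-joining Q value.

    Q(i, j) = (n - 2) * d(i, j) - sum_m d(i, m) - sum_m d(j, m)
    Standard neighbor-joining joins only the minimum-Q pair; keeping
    k > 1 candidates is one simple way to track alternative partial
    solutions (an illustrative choice, not the published method).
    """
    n = len(dist)
    if n < 3:
        raise ValueError("need at least three taxa")
    row_sums = [sum(row) for row in dist]
    scored = []
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            scored.append((q, i, j))
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    # Illustrative symmetric distance matrix for five taxa (a-e).
    d = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(nj_candidate_pairs(d, k=2))
```

Each retained pair would seed its own partial tree; how many to keep is the run-time versus diversity trade-off the abstract describes.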


Subjects
Genetic Techniques; Phylogeny; Algorithms; Evolution, Molecular; Models, Genetic
6.
Genome Res ; 9(4): 373-82, 1999 Apr.
Article in English | MEDLINE | ID: mdl-10207159

ABSTRACT

We have developed a rapid visual method for identifying novel members of gene families. Starting with an evolutionary tree, 20-50 protein query sequences for a gene family are selected from different branches of the tree. These query sequences are used to search the GenBank and expressed sequence tag (EST) DNA databases and their nightly updates using the tfastx3 or tfasty3 programs. The results of all 20-50 searches are collated and re-sorted to highlight EST or genomic sequences that share significant similarity with the query sequences. The statistical significance of each DNA/protein alignment is plotted, highlighting the portion of the query sequence that is present in the database sequence and the percent identity in the aligned region. The collated results for database sequences are linked using the WWW to the underlying scores and alignments; these links can also be used to perform additional searches to characterize the novel sequence further. With traditional "deep" scoring matrices (BLOSUM50) one can search for previously unrecognized families of large protein superfamilies. Alternatively, by using query sequences and EST libraries from the same species (e.g., human or mouse) together with "shallow" scoring matrices and filters that remove high-identity sequences, one can highlight new paralogs of previously described subfamilies. Using query sequences from the glutathione transferase superfamily, we identified two novel mammalian glutathione transferase families that were recognized previously only in plants. Using query sequences from known mammalian glutathione transferase subfamilies, we identified new candidate paralogs from the mouse class-mu, class-pi, and class-theta families.


Subjects
Data Display; Genes; Genetic Techniques; Multigene Family/genetics; Software; Amino Acid Sequence; Animals; Evolution, Molecular; Expressed Sequence Tags; Glutathione Transferase/genetics; Humans; Mice; Molecular Sequence Data; Sequence Homology, Amino Acid; User-Computer Interface
7.
Arch Biochem Biophys ; 361(1): 85-93, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9882431

ABSTRACT

The sequence and exon-intron structure of the human class mu GSTM3 glutathione transferase gene and its orientation with respect to the remainder of the human class mu GSTM gene cluster were determined. The GSTM3 gene is 2847 bp long and is thus considerably shorter than the other class mu genes in the cluster, which range in size from 5325 to 7212 bp. Outside the protein-coding region, the GSTM3 gene does not share significant sequence similarity with other class mu glutathione transferase genes. Identification of overlapping cosmid clones that span the region between GSTM5, the next nearest glutathione transferase gene, and GSTM3 showed that the two genes are about 20,000 bp apart. PCR primers developed from sequences 3'-downstream from the GSTM5 gene were used to identify clones containing the GSTM3 gene. Amplification with these primers showed that the orientation of the GSTM3 gene is 5'-GSTM5-3'-3'-GSTM3-5'. Long-range PCR reactions confirmed this orientation both in the GSTM-YAC2 YAC clone, which contains the five class mu glutathione transferase genes on chromosome 1, and in human DNA. This tail-to-tail orientation is consistent with an evolutionary model of class mu glutathione transferase divergence from a pair of tail-to-tail "M1-like" and "M3-like" class mu glutathione transferase genes that was present at the mammalian radiation to the current organization of multiple head-to-tail M1-like genes tail-to-tail with a single M3-like gene with distinct structural properties and expression patterns.


Subjects
Glutathione Transferase/genetics; Multigene Family/genetics; Physical Chromosome Mapping; Base Sequence; Chromosomes, Human, Pair 1/genetics; Cosmids; Genetic Linkage; Glutathione Transferase/chemistry; Humans; Molecular Sequence Data; Phylogeny; Restriction Mapping; Transcription, Genetic
8.
J Biol Chem ; 273(38): 24396-405, 1998 Sep 18.
Article in English | MEDLINE | ID: mdl-9733729

ABSTRACT

Microcystin-affinity chromatography was used to purify 15 protein phosphatase 1 (PP1)-binding proteins from the myofibrillar fraction of rabbit skeletal muscle. To reduce the time and amount of material required to identify these proteins, proteome analysis by mixed peptide sequencing was developed. Proteins are resolved by SDS-polyacrylamide gel electrophoresis, electroblotted to polyvinylidene fluoride membrane, and stained. Bands are sliced from the membrane, cleaved briefly with CnBr, and applied without further purification to an automated Edman sequencer. The mixed peptide sequences generated are sorted and matched against the GenBank using two new programs, FASTF and TFASTF. This technology offers a simple alternative to mass spectrometry for the subpicomolar identification of proteins in polyacrylamide gels. Using this technology, all 15 proteins recovered in PP-1C affinity chromatography were sequenced. One of the proteins, PP-1bp55, was homologous to human myosin phosphatase, MYPT2. A second, PP-1bp80, identified in the EST data bases, contained a putative PP-1C binding site and a nucleotide binding motif. Further affinity purification over ATP-Sepharose isolated PP-1bp80 in a quaternary complex with PP-1C and two other proteins, PP-1bp29 and human p20. Recombinant PP-1bp80 also bound PP-1C and suppressed its activity toward a variety of substrates, suggesting that the protein is a novel regulatory subunit of PP-1.


Subjects
Carrier Proteins/chemistry; Databases, Factual; Muscle, Skeletal/metabolism; Phosphoprotein Phosphatases/metabolism; Xenopus Proteins; Amino Acid Sequence; Animals; Calmodulin/chemistry; Carrier Proteins/isolation & purification; Carrier Proteins/metabolism; Chickens; Chromatography, Affinity; Humans; Macromolecular Substances; Mass Spectrometry; Membrane Proteins/chemistry; Mice; Molecular Sequence Data; Muscle, Skeletal/chemistry; Myofibrils/chemistry; Myofibrils/metabolism; Peptide Fragments/chemistry; Phosphoprotein Phosphatases/chemistry; Presenilins; Protein Phosphatase 1; Rabbits; Recombinant Proteins/chemistry; Recombinant Proteins/isolation & purification; Recombinant Proteins/metabolism; Sequence Alignment; Sequence Homology, Amino Acid
9.
J Biol Chem ; 273(6): 3517-27, 1998 Feb 06.
Article in English | MEDLINE | ID: mdl-9452477

ABSTRACT

A partial physical map has been constructed of the human class Mu glutathione S-transferase genes on chromosome 1p13.3. The glutathione S-transferase genes in this cluster are spaced about 20 kilobase pairs (kb) apart, and arranged as 5'-GSTM4-GSTM2-GSTM1-GSTM5-3'. This map has been used to localize the end points of the polymorphic GSTM1 deletion. The left repeated region is 5 kb downstream from the 3'-end of the GSTM2 gene and 5 kb upstream from the beginning of the GSTM1 gene; the right repeated region is 5 kb downstream from the 3'-end of the GSTM1 and 10 kb upstream from the 5'-end of the GSTM5 gene. The GSTM1-0 deletion produces a novel 7.4-kb HindIII fragment with the loss of 10.3- and 11.4-kb HindIII fragments. The same novel fragment was seen in 13 unrelated individuals (20 null alleles), suggesting that most GSTM1-0 deletions involve recombinations between the same two regions. We have cloned and sequenced the deletion junction that is produced at the GSTM1-null locus; the 5'- and 3'-flanking regions are more than 99% identical to each other and to the deletion junction sequence over 2.3 kb. Because of the high sequence identity between the left repeat, right repeat, and deletion junction regions, the crossing over cannot be localized within the 2.3-kb region. The 2.3-kb repeated region contains a reverse class IV Alu repetitive element near one end of the repeat.


Subjects
Gene Deletion; Glutathione Transferase/genetics; Multigene Family; Base Sequence; Chromosome Mapping; Chromosomes, Artificial, Yeast; Chromosomes, Human, Pair 1; Cloning, Molecular; DNA; Genetic Vectors; Heterozygote; Homozygote; Humans; Molecular Sequence Data; Recombinant Fusion Proteins/metabolism; Recombination, Genetic; Restriction Mapping; Sequence Homology, Nucleic Acid
10.
J Mol Biol ; 276(1): 71-84, 1998 Feb 13.
Article in English | MEDLINE | ID: mdl-9514730

ABSTRACT

The FASTA package of sequence comparison programs has been modified to provide accurate statistical estimates for local sequence similarity scores with gaps. These estimates are derived using the extreme value distribution from the mean and variance of the local similarity scores of unrelated sequences after the scores have been corrected for the expected effect of library sequence length. This approach allows accurate estimates to be calculated for both FASTA and Smith-Waterman similarity scores for protein/protein, DNA/DNA, and protein/translated-DNA comparisons. The accuracy of the statistical estimates is summarized for 54 protein families using FASTA and Smith-Waterman scores. Probability estimates calculated from the distribution of similarity scores are generally conservative, as are probabilities calculated using the Altschul-Gish lambda, kappa, and eta parameters. The performance of several alternative methods for correcting similarity scores for library-sequence length was evaluated using 54 protein superfamilies from the PIR39 database and 110 protein families from the Prosite/SwissProt rel. 34 database. Both regression-scaled and Altschul-Gish scaled scores perform significantly better than unscaled Smith-Waterman or FASTA similarity scores. When the Prosite/SwissProt test set is used, regression-scaled scores perform slightly better; when the PIR database is used, Altschul-Gish scaled scores perform best. Thus, length-corrected similarity scores improve the sensitivity of database searches. Statistical parameters that are derived from the distribution of similarity scores from the thousands of unrelated sequences typically encountered in a database search provide accurate estimates of statistical significance that can be used to infer sequence homology.
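The step from a mean and variance to a significance estimate can be sketched as follows. This is a minimal illustration, assuming the scores have already been corrected for library sequence length and that the standardized score follows an extreme value (Gumbel) distribution with mean 0 and variance 1. The function names are hypothetical, and the details of the FASTA implementation (regression fitting, outlier handling) are not reproduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def z_score(score, mean_unrelated, sd_unrelated):
    """Standardize a length-corrected score against unrelated-sequence scores."""
    if sd_unrelated <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean_unrelated) / sd_unrelated


def evd_pvalue(z):
    """P(Z >= z) for a Gumbel variable with mean 0 and variance 1.

    The Gumbel scale is sqrt(6)/pi and the location is -gamma * scale,
    which gives the CDF exp(-exp(-z * pi / sqrt(6) - gamma)).
    """
    inner = math.exp(-math.pi / math.sqrt(6.0) * z - EULER_GAMMA)
    return -math.expm1(-inner)  # 1 - exp(-inner), stable for tiny values


def expectation(z, library_size):
    """Expected number of unrelated sequences scoring at least this well."""
    return library_size * evd_pvalue(z)


if __name__ == "__main__":
    z = z_score(score=180.0, mean_unrelated=40.0, sd_unrelated=12.0)
    print(z, evd_pvalue(z), expectation(z, library_size=50_000))
```

The expectation grows linearly with library size, which is why the same alignment score is less significant in a larger database.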


Subjects
Sequence Homology; Software; Animals; Databases, Factual; Evaluation Studies as Topic; Humans; Mice; Regression Analysis; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid
11.
Genomics ; 46(1): 24-36, 1997 Nov 15.
Article in English | MEDLINE | ID: mdl-9403055

ABSTRACT

The FASTA package of sequence comparison programs has been expanded to include FASTX and FASTY, which compare a DNA sequence to a protein sequence database, translating the DNA sequence in three frames and aligning the translated DNA sequence to each sequence in the protein database, allowing gaps and frameshifts. Also new are TFASTX and TFASTY, which compare a protein sequence to a DNA sequence database, translating each sequence in the DNA database in six frames and scoring alignments with gaps and frameshifts. FASTX and TFASTX allow only frameshifts between codons, while FASTY and TFASTY allow substitutions or frameshifts within a codon. We examined the performance of FASTX and FASTY using different gap-opening, gap-extension, frameshift, and nucleotide substitution penalties. In general, FASTX and FASTY perform equivalently when query sequences contain 0-10% errors. We also evaluated the statistical estimates reported by FASTX and FASTY. These estimates are quite accurate, except when an out-of-frame translation produces a low-complexity protein sequence. We used FASTX to scan the Mycoplasma genitalium, Haemophilus influenzae, and Methanococcus jannaschii genomes for unidentified or misidentified protein-coding genes. We found at least 9 new protein-coding genes in the three genomes and at least 35 genes with potentially incorrect boundaries.
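The translation step that precedes the alignment can be sketched in a few lines. The code below is an illustration only: it uses the standard genetic code, translates a DNA sequence in three forward frames (as described for FASTX and FASTY) or six frames (as described for TFASTX and TFASTY), and does not attempt the frameshift-aware alignment itself. All function names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def translate(dna, frame=0):
    """Translate one forward frame (0, 1, or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(dna):
    """Reverse-complement a DNA string; unrecognized symbols become 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(dna.upper()))


def three_frames(dna):
    """Forward translations only, as for a DNA query against a protein library."""
    return [translate(dna, f) for f in range(3)]


def six_frames(dna):
    """Forward and reverse-strand translations, as for a DNA library entry."""
    return three_frames(dna) + three_frames(reverse_complement(dna))


if __name__ == "__main__":
    print(six_frames("ATGGCCTAA"))
```

Because a single insertion or deletion moves the true reading frame from one of these translations to another, an aligner that cannot cross frames loses the rest of the match, which is the problem the frameshift penalties address.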


Subjects
Amino Acid Sequence; Base Sequence; DNA/genetics; Proteins/genetics; Sequence Alignment/methods; Software; Animals; Databases, Factual; Genes/genetics; Genes, Bacterial/genetics; Mice; Molecular Sequence Data; Open Reading Frames/genetics
12.
J Comput Biol ; 4(3): 339-49, 1997.
Article in English | MEDLINE | ID: mdl-9278064

ABSTRACT

We develop several algorithms for the problem of aligning a DNA sequence with a protein sequence. Our methods account for frameshift errors, but not for introns in the DNA sequence. Thus, they are particularly appropriate for comparing a cDNA sequence that suffers from sequencing errors with an amino acid sequence or a protein sequence database. We describe algorithms for computing optimal alignments for several definitions of DNA-protein alignment, verify sufficient conditions for equivalence of certain definitions, describe techniques for efficient implementation, and discuss experience with these ideas in a new release of the FASTA suite of database-searching programs.


Subjects
Algorithms; DNA/chemistry; Proteins/chemistry; Sequence Alignment/methods; Databases, Factual; Sequence Analysis, DNA; Software
14.
Methods Enzymol ; 266: 227-58, 1996.
Article in English | MEDLINE | ID: mdl-8743688

ABSTRACT

Although there are several different comparison programs available (e.g., BLASTP, FASTA, SSEARCH, and BLITZ) that can be used with different scoring systems (e.g., PAM120, PAM250, BLOSUM50, BLOSUM62) and different databases (e.g., PIR, SWISS-PROT, GenPept), the following search protocol should identify homologous sequences whenever they can be found.

   1. Always compare protein sequences if the genes encode proteins. Protein sequence comparison will typically double the evolutionary lookback time over DNA sequence comparison.
   2. Search several sequence databases using a rapid sequence comparison program (e.g., BLASTP or FASTA, ktup = 2). Well-curated databases like PIR or SWISS-PROT tend to have fewer redundant sequences, which improves the statistical significance of a match, but they are less comprehensive and up-to-date than GenPept.
   3. If there is good agreement between the distribution of scores and the theoretical distribution, and the alignments do not include "simple sequence" domains, accept sequences with FASTA E() values or BLASTP P() values below 0.02 as homologous.
   4. If no library sequences are found with E values below 0.02, perform additional searches with FASTA, ktup = 1, or SSEARCH. If library sequences with E values less than 0.02 are found, the sequences are probably homologous, unless a low-complexity domain is aligned. However, sequences with E values from 0.02 to 10.0 may be homologous as well. To characterize these more distantly related sequences, select "marginal" library sequences and use them to search the databases. Additional family members should have E values less than 0.05.
   5. Homologous sequences share a common ancestor, and thus a common protein fold. Depending on the evolutionary distance and divergence path, two or more homologous sequences may have very few absolutely conserved residues. However, if homology has been inferred between A and B, between B and C, and between C and D, A and D must be homologous, even if they share no significant similarity.
   6. Sequences with marginal E values should also be tested using the PRSS program. Compare the query and library sequences using at least 200 (and preferably 1000) shuffles. Shuffles using a window (-w) of 10-20 are more stringent than a uniform shuffle. Use the E value after 1000 shuffles to confirm an inference of homology.
   7. Homologous sequences are usually similar over an entire sequence or domain, typically sharing 20-25% or greater identity for more than 200 residues. Matches that are more than 50% identical in a 20- to 40-amino acid region occur frequently by chance and do not indicate homology.

By following these steps, one will very rarely assert that two sequences are homologous when in fact they are not. However, these criteria are stringent; distantly related homologous sequences may fail to be detected because their similarity is not statistically significant. These tests are biased toward missing some distantly related sequences to avoid the possibility of misidentifying unrelated ones. In most database searches, the ratio of unrelated to related sequences is more than 4000:1 (e.g., 10 related and 40,000 unrelated sequences). Thus, one is more likely to mistakenly identify two sequences as related than to overlook a genuine relationship, and our conservative evaluation criteria reflect that bias.
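The difference between a uniform shuffle and a windowed shuffle, mentioned in step 6 of the protocol above, can be sketched as follows. This is an illustration under stated assumptions: the windowed version shuffles residues within consecutive non-overlapping windows, which preserves local composition, and it is not claimed to match how PRSS implements its `-w` option. The function names and the example sequence are made up.

```python
import random


def uniform_shuffle(seq, rng):
    """Shuffle all residues: preserves overall composition only."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def window_shuffle(seq, window, rng):
    """Shuffle residues within consecutive windows of `window` residues.

    Local composition (for example, a low-complexity stretch) stays in
    place, so the shuffled sequences are a harder null model.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        out.extend(chunk)
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical example
    print(uniform_shuffle(seq, rng))
    print(window_shuffle(seq, 10, rng))
```

Scores of the query against many such shuffled copies give the null distribution against which the real alignment score is judged.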


Subjects
Amino Acid Sequence; Databases, Factual; Proteins/chemistry; Proteins/genetics; Sequence Homology, Amino Acid; Software; Animals; Calmodulin/genetics; Drosophila; Glutathione Transferase/genetics; Humans; Isoenzymes/genetics; Mice; Molecular Sequence Data; Peptide Elongation Factor 1; Peptide Elongation Factors/genetics; Probability; Rats; Regression Analysis; Sensitivity and Specificity
15.
Protein Sci ; 4(6): 1145-60, 1995 Jun.
Article in English | MEDLINE | ID: mdl-7549879

ABSTRACT

We have compared commonly used sequence comparison algorithms, scoring matrices, and gap penalties using a method that identifies statistically significant differences in performance. Search sensitivity with either the Smith-Waterman algorithm or FASTA is significantly improved by using modern scoring matrices, such as BLOSUM45-55, and optimized gap penalties instead of the conventional PAM250 matrix. More dramatic improvement can be obtained by scaling similarity scores by the logarithm of the length of the library sequence (ln()-scaling). With the best modern scoring matrix (BLOSUM55 or JO93) and optimal gap penalties (-12 for the first residue in the gap and -2 for additional residues), Smith-Waterman and FASTA performed significantly better than BLASTP. With ln()-scaling and optimal scoring matrices (BLOSUM45 or Gonnet92) and gap penalties (-12, -1), the rigorous Smith-Waterman algorithm performs better than either BLASTP or FASTA, although with the Gonnet92 matrix the difference with FASTA was not significant. Ln()-scaling performed better than normalization based on other simple functions of library sequence length. Ln()-scaling also performed better than scores based on normalized variance, but the differences were not statistically significant for the BLOSUM50 and Gonnet92 matrices. Optimal scoring matrices and gap penalties are reported for Smith-Waterman and FASTA, using conventional or ln()-scaled similarity scores. Searches with no penalty for gap extension, or no penalty for gap opening, or an infinite penalty for gaps performed significantly worse than the best methods. Differences in performance between FASTA and Smith-Waterman were not significant when partial query sequences were used. However, the best performance with complete query sequences was obtained with the Smith-Waterman algorithm and ln()-scaling.
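A small example shows why length scaling can change the ranking of library sequences. The sketch takes the most literal reading of the abstract, dividing the raw score by the natural logarithm of the library sequence length. The exact normalization used in the paper is not given in the abstract, so this formula, the function names, and the example scores are all assumptions.

```python
import math


def ln_scaled(score, library_length):
    """Raw similarity score divided by ln(library sequence length).

    Assumed form of ln()-scaling; the published normalization may differ.
    """
    if library_length < 2:
        raise ValueError("library sequence length must be at least 2")
    return score / math.log(library_length)


def rank_hits(hits):
    """Sort (name, raw_score, length) hits by ln-scaled score, best first."""
    return sorted(hits, key=lambda h: ln_scaled(h[1], h[2]), reverse=True)


if __name__ == "__main__":
    # Hypothetical hits: a long sequence with a slightly higher raw score
    # and a short sequence with a slightly lower one.
    hits = [("long_protein", 120, 2000), ("short_protein", 100, 200)]
    for name, score, length in rank_hits(hits):
        print(name, score, round(ln_scaled(score, length), 2))
```

Long unrelated sequences tend to reach higher local scores by chance, so after scaling the shorter sequence ranks first in this example even though its raw score is lower.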


Subjects
Algorithms; Databases, Factual; Proteins/genetics; Sequence Alignment/methods; Sequence Homology, Amino Acid; Amino Acid Sequence; Amino Acids/chemistry; Evaluation Studies as Topic; Probability; Regression Analysis
16.
Article in English | MEDLINE | ID: mdl-7584448

ABSTRACT

We address the problem of primer selection in polymerase chain reaction (PCR) experiments. We prove that the problem of minimizing the number of primers required to amplify a set of DNA sequences is NP-complete, and show that even approximating solutions to this problem to within a constant factor times optimal is intractable. On the practical side, we give a simple branch-and-bound algorithm that solves the primers minimization problem within reasonable time for typical instances. We present an efficient approximation scheme for this problem, and prove that our heuristic always produces solutions no worse than a logarithmic factor times the optimal, this being the best approximation possible within polynomial time. Finally, we analyze a weighted variant, where the number of primers and the sum of their "costs" are optimized simultaneously. We conclude by presenting the empirical performance of our methods on biological data.
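The abstract does not name its heuristic, but the classic greedy algorithm for set cover is a well-known method with a logarithmic approximation guarantee, so it serves here as an illustration of the idea. The sketch makes a strong simplifying assumption: a primer "covers" a sequence if it occurs in it as an exact substring, which ignores strand complementarity, primer pairs, and melting temperature. The function name and the example data are hypothetical.

```python
def greedy_primer_cover(sequences, candidates):
    """Pick primers greedily until every sequence is covered.

    At each step, choose the candidate primer that covers the largest
    number of still-uncovered sequences (greedy set cover).
    """
    covers = {
        primer: {i for i, seq in enumerate(sequences) if primer in seq}
        for primer in candidates
    }
    uncovered = set(range(len(sequences)))
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda p: len(covers[p] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            raise ValueError("some sequences are not covered by any candidate primer")
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGG", "GGCATT", "CATTAC"]   # made-up targets
    primers = ["ACG", "CAT", "TAC", "GGC"]            # made-up candidates
    print(greedy_primer_cover(seqs, primers))
```

Greedy selection is fast but not always optimal; the branch-and-bound approach mentioned in the abstract trades run time for an exact minimum.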


Subjects
Algorithms; DNA Primers; DNA/chemistry; DNA/metabolism; Polymerase Chain Reaction; Software; Base Sequence; Models, Theoretical
17.
Protein Sci ; 3(3): 525-7, 1994 Mar.
Article in English | MEDLINE | ID: mdl-8019423

ABSTRACT

Although macrophage migration inhibitory factor (MIF) proteins conjugate glutathione, sequence analysis does not support their homology to other glutathione transferases. Glutathione transferases are not detected with MIF proteins in searches of protein sequence databases, and MIF proteins do not share significant sequence similarity with glutathione transferases. Homology cannot be demonstrated by multiple sequence alignment or evolutionary tree construction; such methods assume that the proteins being analyzed are homologous.


Subjects
Glutathione Transferase/genetics; Macrophage Migration-Inhibitory Factors/genetics; Animals; Biological Evolution; Databases, Factual; Humans; Intramolecular Oxidoreductases; Mice; Proteins/genetics; Rats; Sequence Alignment; Sequence Homology, Amino Acid
18.
Biochim Biophys Acta ; 1190(1): 189-92, 1994 Feb 23.
Article in English | MEDLINE | ID: mdl-8110815

ABSTRACT

A cDNA encoding a beta-subunit of the avian H+/K+-ATPase was cloned from a chicken stomach cDNA library, and its nucleotide sequence determined. A comparison between all the available sequence data for the beta-subunits of P-type ATPases reveals several evolutionarily conserved regions. Overall identity was 66% when compared with mammalian H+/K+-ATPase beta-subunits, 34% when compared with the Na+/K+-ATPase beta2-subunits, and 33% when compared with the Na+/K+-ATPase beta1-subunits.


Subjects
H(+)-K(+)-Exchanging ATPase/chemistry; Amino Acid Sequence; Animals; Base Sequence; Chickens; DNA, Complementary/analysis; Molecular Sequence Data; Sequence Alignment