1.
Proc Natl Acad Sci U S A ; 102(18): 6395-400, 2005 May 03.
Article in English | MEDLINE | ID: mdl-15851683

ABSTRACT

Biological sequences are composed of long strings of alphabetic letters rather than arrays of numerical values. Lack of a natural underlying metric for comparing such alphabetic data significantly inhibits sophisticated statistical analyses of sequences, modeling structural and functional aspects of proteins, and related problems. Herein, we use multivariate statistical analyses on almost 500 amino acid attributes to produce a small set of highly interpretable numeric patterns of amino acid variability. These high-dimensional attribute data are summarized by five multidimensional patterns of attribute covariation that reflect polarity, secondary structure, molecular volume, codon diversity, and electrostatic charge. Numerical scores for each amino acid then transform amino acid sequences for statistical analyses. Relationships between transformed data and amino acid substitution matrices show significant associations for polarity and codon diversity scores. Transformed alphabetic data are used in analysis of variance and discriminant analysis to study DNA binding in the basic helix-loop-helix proteins. The transformed scores offer a general solution for analyzing a wide variety of sequence analysis problems.
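The transformation step described in the abstract, replacing each letter of a sequence with a numeric score so that ordinary statistics apply, can be sketched as follows. This is a minimal illustration, not the authors' method: the paper's five factor scores are not reproduced in this record, so the sketch substitutes the Kyte-Doolittle hydropathy index, a published single-property scale. The function names `transform` and `column_means` are hypothetical.

```python
# Kyte-Doolittle hydropathy index, used ONLY as a stand-in score table.
# The five factor scores from the paper are not given in this record.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def transform(sequence, scores=HYDROPATHY):
    """Replace each residue letter with its numeric score (hypothetical helper)."""
    try:
        return [scores[residue] for residue in sequence.upper()]
    except KeyError as err:
        raise ValueError(f"no score for residue {err.args[0]!r}") from None


def column_means(aligned_sequences, scores=HYDROPATHY):
    """Mean score per position for equal-length, gap-free aligned sequences."""
    rows = [transform(seq, scores) for seq in aligned_sequences]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return [sum(column) / len(column) for column in zip(*rows)]


if __name__ == "__main__":
    print(transform("AKL"))
    print(column_means(["AKL", "ARL"]))
```

Once sequences are numeric vectors, per-position values can feed analyses such as the analysis of variance or discriminant analysis mentioned in the abstract. A multi-factor table would map each residue to a tuple of scores rather than a single number.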


Subject(s)
Amino Acid Sequence/genetics , Computational Biology/methods , Genetic Variation , Models, Genetic , Phylogeny , Statistics as Topic/methods , Analysis of Variance , Cluster Analysis , Codon/genetics , Discriminant Analysis , Multivariate Analysis , Protein Conformation , Static Electricity
2.
Bioinformatics ; 21(7): 853-9, 2005 Apr 01.
Article in English | MEDLINE | ID: mdl-15514001

ABSTRACT

SUMMARY: We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST-based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole-genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large-insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. MOTIVATION: Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read-pair information of large-insert libraries.


Subject(s)
Algorithms , Chromosome Mapping/methods , Computer Graphics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Base Sequence , Benchmarking/methods , Molecular Sequence Data