Bioinformatics ; 19(1): 30-6, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12499290

ABSTRACT

MOTIVATION: We propose representing individual positions in DNA sequences by virtual potentials generated by the other bases of the same sequence. This gives a compact representation of the neighbourhood of a base. The distribution of the virtual potentials over the whole sequence can be used as a representation of the entire sequence (SEQREP code). The code is flexible, has a length independent of the sequence size, does not require prior alignment, and is convenient for processing by neural networks or statistical techniques.

RESULTS: To evaluate its biological significance, the SEQREP code was used to train Kohonen self-organizing maps (SOMs) in two applications: (a) detection of Alu sequences, and (b) classification of sequences encoding the HIV-1 envelope glycoprotein (env) into subtypes A-G. The SOMs clustered sequences belonging to different classes into distinct regions. For independent test sets, very high rates of correct predictions were obtained (97% in the first application, 91% in the second). Possible areas of application of SEQREP codes include functional genomics, phylogenetic analysis, detection of repetitions, database retrieval, and automatic alignment.

AVAILABILITY: Software for representing sequences by SEQREP code and for training Kohonen SOMs is freely available from http://www.dq.fct.unl.pt/qoa/jas/seqrep.

SUPPLEMENTARY INFORMATION: Supplementary material is available at http://www.dq.fct.unl.pt/qoa/jas/seqrep/bioinf2002
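ILLUSTRATION: The abstract does not state the formula for the virtual potentials, so the following is only a minimal sketch of the general idea, not the authors' SEQREP implementation. It assumes a hypothetical per-base "charge" table (CHARGE), a hypothetical inverse-distance potential limited to a neighbourhood (window), and a hypothetical number of histogram bins (bins); none of these names or values come from the paper. What it does show is the property the abstract emphasises: the distribution of per-position potentials yields a code whose length does not depend on the sequence length and which needs no alignment.

```python
# Hypothetical per-base "charges"; illustrative values only, NOT from the paper.
CHARGE = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def virtual_potentials(seq, window=10):
    """Return one potential per position (assumed form, not the paper's).

    Each position i receives sum(charge(j) / |i - j|) over the other bases j
    that lie within `window` positions of i. Unknown symbols contribute 0.
    """
    seq = seq.upper()
    n = len(seq)
    potentials = []
    for i in range(n):
        total = 0.0
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i:
                total += CHARGE.get(seq[j], 0.0) / abs(i - j)
        potentials.append(total)
    return potentials


def seqrep_like_code(seq, bins=16, window=10):
    """Summarise a sequence as a fixed-length, normalised histogram.

    The histogram range is fixed in advance (0 up to the largest potential
    this toy model can produce), so codes from different sequences share
    the same bins and can be compared directly.
    """
    potentials = virtual_potentials(seq, window)
    if not potentials:
        return [0.0] * bins
    # Largest possible value: the highest charge at every neighbour position.
    upper = max(CHARGE.values()) * 2 * sum(1.0 / d for d in range(1, window + 1))
    counts = [0] * bins
    for p in potentials:
        index = min(bins - 1, int(p / upper * bins))
        counts[index] += 1
    return [c / len(potentials) for c in counts]


if __name__ == "__main__":
    for s in ("ACGTACGTAC", "ACGT" * 50):
        code = seqrep_like_code(s)
        print(len(s), len(code))
```

Vectors of this kind, all of equal length, could then be passed as input to a self-organizing map or another statistical method, which is the use the abstract describes for the real SEQREP code.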


Subject(s)
Algorithms; DNA/classification; DNA/genetics; Information Storage and Retrieval/methods; Sequence Analysis, DNA/methods; Alu Elements/genetics; Base Sequence; Databases, Nucleic Acid; Gene Products, env/genetics; HIV Envelope Protein gp120/genetics; HIV-1/genetics; Humans; Molecular Sequence Data; Neural Networks, Computer; Sequence Homology, Nucleic Acid