Int J Data Min Bioinform ; 13(2): 141-57, 2015.
Article in English | MEDLINE | ID: mdl-26547972

ABSTRACT

Analysing and classifying sequences based on similarities and differences is a mathematical problem of escalating relevance and importance in many scientific disciplines. One of the primary challenges in applying machine learning algorithms to sequential data, such as biological sequences, is the extraction and representation of significant features from the data. To address this problem, we have recently developed a representation, entitled Multi-Layered Vector Spaces (MLVS), which is a simple mathematical model that maps sequences into a set of MLVS. We demonstrate the usefulness of the model by applying it to the problem of identifying signal peptides. MLVS feature vectors are generated from a collection of protein sequences and the resulting vectors are used to create support vector machine classifiers. Experiments show that the MLVS-based classifiers are able to outperform or perform on par with several existing methods that are specifically designed for the purpose of identifying signal peptides.


Subject(s)
Algorithms; Databases, Protein; Peptides/chemistry; Protein Sorting Signals; Sequence Analysis, Protein/methods; Support Vector Machine; Amino Acid Sequence; Data Mining/methods; Molecular Sequence Data; Pattern Recognition, Automated/methods; Sequence Alignment/methods