1.
Article in English | MEDLINE | ID: mdl-38801693

ABSTRACT

A DNA motif is a pattern shared by similar fragments in DNA sequences that plays a key role in regulating gene expression, and DNA motif discovery has become a key research topic. Exact planted (l,d)-motif search (PMS) is one motif discovery approach: given t sequences, it aims to find all (l,d)-motifs, that is, motifs of length l that appear in at least qt sequences with at most d mismatches. The existing exact PMS algorithms are only suitable for small DNA sequence datasets. High-throughput sequencing technology generates vast amounts of DNA sequence data, which makes it challenging to solve exact PMS problems efficiently. Therefore, after analyzing the time complexity of the existing exact PMS algorithms, we propose PMmotif, an efficient exact PMS algorithm for large DNA sequence datasets. PMmotif finds (l,d)-motifs by searching the branches of the pattern tree that may contain (l,d)-motifs. Experiments show that the ratio of the running time of the leading existing PMS algorithms to that of PMmotif is between 14.83 and 58.94. In addition, for the first time, PMmotif can solve the (15,5) and (17,6) challenge problem instances on large DNA sequence datasets within 24 hours.
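
To make the (l,d)-motif definition above concrete, here is a minimal Python sketch of a naive checker. It is not PMmotif and does not use a pattern tree; it only tests the definition directly and enumerates all 4^l candidates, so it is practical only for very small l. The function names (`hamming`, `occurs_in`, `is_ld_motif`, `brute_force_motifs`) and the treatment of q as a fraction between 0 and 1 are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_in(motif, seq, d):
    """True if some window of seq of length len(motif) differs from motif in at most d positions."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def is_ld_motif(motif, seqs, d, q=1.0):
    """True if motif occurs, with at most d mismatches, in at least q * t of the t sequences.

    Assumption: q is a fraction in (0, 1]; q * t is compared as a float.
    """
    hits = sum(occurs_in(motif, s, d) for s in seqs)
    return hits >= q * len(seqs)


def brute_force_motifs(seqs, l, d, q=1.0):
    """Enumerate every length-l string over ACGT and keep those that satisfy the definition."""
    candidates = ("".join(p) for p in product("ACGT", repeat=l))
    return [m for m in candidates if is_ld_motif(m, seqs, d, q)]


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGA", "GGGACG"]
    print(brute_force_motifs(seqs, l=3, d=0))         # exact occurrences in all sequences
    print(brute_force_motifs(seqs, l=3, d=0, q=0.6))  # occurrences in at least 60% of sequences
```

The exhaustive enumeration illustrates why exact PMS becomes expensive as l and d grow, which is the cost that pruning-based approaches aim to reduce.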

2.
Article in English | MEDLINE | ID: mdl-35275822

ABSTRACT

A DNA motif is a sequence pattern shared by the DNA sequence segments that bind to a specific protein. Discovering motifs in a given DNA sequence dataset plays a vital role in studying gene expression regulation. As an important attribute of the DNA motif, the motif length directly affects the quality of the discovered motifs. Determining the motif length more accurately remains a difficult open problem. We propose a new motif length prediction scheme named MotifLen that uses supervised machine learning. First, a method of constructing sample data for predicting the motif length is proposed. Second, a deep learning model for motif length prediction is constructed based on a convolutional neural network. Third, methods are given for applying the proposed prediction model to a motif found by an existing motif discovery algorithm. The experimental results show that i) the prediction accuracy of MotifLen is more than 90% on the validation set and is significantly higher than that of the compared methods on real datasets, ii) MotifLen can successfully optimize the motifs found by the existing motif discovery algorithms, and iii) it can effectively improve the time performance of some existing motif discovery algorithms.


Subject(s)
Deep Learning , Nucleotide Motifs/genetics , Sequence Analysis, DNA/methods , Algorithms , Neural Networks, Computer