Results 1 - 3 of 3
1.
J Comput Biol ; 30(2): 117, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36656165
2.
BMC Bioinformatics ; 22(1): 51, 2021 Feb 06.
Article in English | MEDLINE | ID: mdl-33549041

ABSTRACT

BACKGROUND: An inverted repeat is a DNA sequence followed downstream by its reverse complement, potentially with a gap in the centre. Inverted repeats are found in both prokaryotic and eukaryotic genomes and they have been linked with countless possible functions. Many international consortia provide a comprehensive description of common genetic variation making alternative sequence representations, such as IUPAC encoding, necessary for leveraging the full potential of such broad variation datasets.

RESULTS: We present IUPACPAL, an exact tool for efficient identification of inverted repeats in IUPAC-encoded DNA sequences allowing also for potential mismatches and gaps in the inverted repeats.

CONCLUSION: Within the parameters that were tested, our experimental results show that IUPACPAL compares favourably to a similar application packaged with EMBOSS. We show that IUPACPAL identifies many previously unidentified inverted repeats when compared with EMBOSS, and that this is also performed with orders of magnitude improved speed.
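The definition of an inverted repeat given in the abstract above (a sequence followed downstream by its reverse complement, optionally with a central gap and a few mismatches) can be illustrated with a small check over IUPAC-encoded DNA. This is only a sketch and not IUPACPAL's algorithm: the function names `can_pair` and `is_inverted_repeat` are hypothetical, and the matching rule (two IUPAC symbols pair if any base encoded by one is complementary to any base encoded by the other) is an assumption made for illustration.

```python
# Standard IUPAC nucleotide codes mapped to the bases they encode.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def can_pair(x, y):
    """Assumed rule: x and y pair if some base of x complements some base of y."""
    return any(COMPLEMENT[b] in IUPAC[y] for b in IUPAC[x])


def is_inverted_repeat(seq, start, arm_len, gap=0, max_mismatches=0):
    """Check whether seq[start:] begins with a left arm of arm_len symbols,
    then gap symbols, then a right arm that is the reverse complement of the
    left arm, tolerating up to max_mismatches non-pairing positions."""
    right_end = start + arm_len + gap + arm_len  # exclusive end of right arm
    if start < 0 or arm_len <= 0 or right_end > len(seq):
        return False
    mismatches = 0
    for i in range(arm_len):
        left = seq[start + i]            # walk the left arm forwards
        right = seq[right_end - 1 - i]   # walk the right arm backwards
        if not can_pair(left, right):
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


print(is_inverted_repeat("GAATTC", 0, 3))             # True
print(is_inverted_repeat("GAANNTTC", 0, 3, gap=2))    # True
print(is_inverted_repeat("RAATTY", 0, 3))             # True under the assumed rule
```

A scan over all start positions, arm lengths and gap sizes would find every inverted repeat, but at a cost far above what a dedicated tool would accept; the check is meant only to make the definition concrete.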


Subject(s)
Genome , Prokaryotic Cells , Repetitive Sequences, Nucleic Acid , Base Sequence , Inverted Repeat Sequences , Repetitive Sequences, Nucleic Acid/genetics
3.
Bioinformatics ; 36(12): 3687-3692, 2020 Jun 01.
Article in English | MEDLINE | ID: mdl-32246826

ABSTRACT

MOTIVATION: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e. with up to e mismatches.

RESULTS: We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position.

AVAILABILITY AND IMPLEMENTATION: GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
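The (k, e)-mappability defined in the abstract above (for each position, the reciprocal of the number of approximate occurrences of its k-mer with up to e mismatches) can be made concrete with a naive computation. This is a quadratic brute-force sketch for a single forward-strand sequence, not GenMap's method; the function names `hamming_within` and `mappability` are hypothetical.

```python
def hamming_within(a, b, e):
    """Return True if equal-length strings a and b differ in at most e positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > e:
                return False
    return True


def mappability(text, k, e):
    """Naive (k, e)-mappability: for each start position i, 1 / (number of
    positions j whose k-mer is within e mismatches of the k-mer at i).
    The k-mer at i always matches itself, so the count is at least 1."""
    n = len(text) - k + 1
    if k <= 0 or n <= 0:
        return []
    kmers = [text[i:i + k] for i in range(n)]
    result = []
    for query in kmers:
        count = sum(1 for other in kmers if hamming_within(query, other, e))
        result.append(1.0 / count)
    return result


print(mappability("ACGTACGA", 4, 0))  # [1.0, 1.0, 1.0, 1.0, 1.0]
print(mappability("ACGTACGA", 4, 1))  # [0.5, 1.0, 1.0, 1.0, 0.5]
```

With e = 0 every 4-mer in the example is unique, so each position has mappability 1. With e = 1 the first and last 4-mers (ACGT and ACGA) count as occurrences of each other, which halves their values.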


Subject(s)
Genome , Software , Algorithms , Genomics , Sequence Analysis, DNA