Methodology for in silico mining of microsatellite polymorphic loci
Martínez Ortiz, Carlos M; Rivero Bandínez, Alejandro.
  • Martínez Ortiz, Carlos M; University of Medical Sciences, ICPB Victoria de Girón. Department of Biochemistry. Havana. CU
  • Rivero Bandínez, Alejandro; University of Medical Sciences, ICPB Victoria de Girón. Department of Biochemistry. Havana. CU
Rev. cuba. inform. méd; 11(1), Jan.-Jun. 2019. graf
Article in English | LILACS, CUMED | ID: biblio-1093304
ABSTRACT
Variable number of tandem repeat (VNTR) polymorphisms are genetic markers used in areas of genomics such as evolutionary, epidemiological and population genetics studies. The growth of genomic sequences in data banks and the development of computational tools for bioinformatics allow these markers to be mined without experimental methods, extending the analysis to non-model organisms of medical or economic importance. Because these sequences have low complexity and a large number of candidates appears when one or several genomes are inspected at scale, it becomes difficult to process the volume of data generated and to detect polymorphisms in candidate markers by visual inspection. A methodology and its algorithmic details are described, implemented in a software pipeline that allows the fast and reliable identification of polymorphic simple sequence repeat (SSR) loci. Overall processing is done by chaining the programs MIDAS and BLAST and the PSSR-Extractor script. The inputs are directory paths containing multiple sequence files in FASTA or GBFF format, and the outputs are the SSRs, database accession codes, positions in the genome, number of repeats and the degree of polymorphism expressed as range of variation, allele frequency, number of alleles and polymorphic information content (PIC). An optional script, SSRMerge, allows the identification of unique (non-redundant) loci in a set of processed genome sequences that are taxonomically closely related. Twenty-three complete genomes (RefSeq from NCBI) belonging to various isolates of Mycobacterium tuberculosis were processed; 4433 SSRs were detected, from which 414 non-redundant loci within the species were extracted. Polymorphisms for these SSRs were mined from the BLAST server outputs, and several measures reflecting the variation at these loci are reported (AU)
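The polymorphism measures listed in the abstract can be illustrated with a small Python sketch. It is not the PSSR-Extractor code: the function name `polymorphism_measures` and the input layout (one repeat count per genome for a single locus) are assumptions made for illustration, and PIC is computed with the commonly used Botstein et al. (1980) formula, which the abstract does not confirm as the one the authors applied.

```python
from collections import Counter
from itertools import combinations


def polymorphism_measures(repeat_counts):
    """Summarise one SSR locus from the repeat count seen in each genome.

    Hypothetical helper: `repeat_counts` is a list such as [5, 5, 6, 6],
    one entry per genome in which the locus was found.
    """
    if not repeat_counts:
        raise ValueError("at least one observation is required")

    counts = Counter(repeat_counts)
    total = len(repeat_counts)

    # Allele frequency: share of genomes carrying each repeat count.
    freqs = {allele: n / total for allele, n in sorted(counts.items())}

    # Range of variation: smallest and largest repeat count observed.
    variation_range = (min(counts), max(counts))

    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    squares = [p * p for p in freqs.values()]
    cross_terms = sum(2 * a * b for a, b in combinations(squares, 2))
    pic = 1 - sum(squares) - cross_terms

    return {
        "allele_number": len(counts),
        "range": variation_range,
        "allele_frequency": freqs,
        "pic": pic,
    }
```

For a locus with two equally frequent alleles, for example `polymorphism_measures([5, 5, 6, 6])`, this gives a PIC of 0.375; a locus with the same repeat count in every genome gives a PIC of 0.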
RESUMEN (translated from Spanish)
Variable number of tandem repeat (VNTR) polymorphisms are genetic markers used in areas of genomics such as evolutionary, epidemiological and population genetics studies. Genomic sequence banks and computational tools such as BLAST allow these markers to be mined without experimental methods, extending the approach to non-model organisms of medical or economic importance. Because of the low complexity of these sequences and the number of candidates that appear when a genome is inspected at scale, it becomes difficult to process the volume of data generated and to detect polymorphisms in the candidate markers by visual inspection. A methodology and several software tools are presented that allow the fast and reliable identification and extraction of polymorphic SSR loci. Processing is done by chaining the programs MIDAS and BLAST and the PSSR-Extractor script. The inputs are directory paths containing multiple sequence files in FASTA or GBFF format, and the outputs are the SSRs, GenBank accession codes, positions in the genome, number of repeats and the degree of polymorphism expressed as range of variation, allele frequency, number of alleles and polymorphic information content (PIC). An optional script, SSRMerge, allows the identification of unique (non-redundant) loci at the species level, at the genus level, or more generally across whichever set of sequences is to be processed. Twenty-three complete genomes (RefSeq from NCBI) belonging to various isolates of Mycobacterium tuberculosis were processed. A total of 4433 SSRs were detected, from which 414 non-redundant loci within the species were extracted. After mining polymorphisms in the BLAST server outputs for these SSRs, measures reflecting the variation at these loci are reported (AU)
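As a rough illustration of the first pipeline stage (FASTA input, SSR motif, genome position and repeat number as output), the following Python sketch scans sequences for perfect tandem repeats with a regular expression. It is a stand-in, not the MIDAS algorithm: the function names, the motif length limit of 6 bp and the minimum of 3 repeats are all assumptions chosen for the example.

```python
import re


def parse_fasta(text):
    """Return {record_id: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the record identifier.
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def find_ssrs(sequence, min_repeats=3, max_motif=6):
    """List perfect tandem repeats as dicts with motif, start and repeats.

    Assumed thresholds: motifs of 1..max_motif bases repeated at least
    min_repeats times (min_repeats must be 2 or more).
    Positions are 1-based.
    """
    # The lazy quantifier makes the regex try the shortest motif first,
    # so "AAAAAA" is reported as motif "A" rather than "AA" or "AAA".
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        hits.append({
            "motif": motif,
            "start": match.start() + 1,
            "repeats": len(match.group(0)) // len(motif),
        })
    return hits
```

For example, `find_ssrs(parse_fasta(">seq1 demo\nGGACAC\nACACTT\n")["seq1"])` reports a single locus with motif `AC`, starting at position 3, with 4 repeats. Imperfect repeats, GBFF input and the BLAST comparison step are outside the scope of this sketch.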
Full text: Available
Index: LILACS (Americas)
Main subject: Algorithms / Software / Genetic Markers / Data Mining
Type of study: Prognostic study
Limits: Humans
Language: English
Journal: Rev. cuba. inform. méd
Journal subject: Medical Informatics / Health Services
Year: 2019
Type: Article
Affiliation country: Cuba
Institution/Affiliation country: University of Medical Sciences, ICPB Victoria de Girón/CU