Protocol to analyze the bacterial pangenome using PAN2HGENE software.

de Sá, Pablo Henrique Caracciolo Gomes; Castro Alves, Jorianne Thyeska; Veras, Adonney Allan de Oliveira.

STAR Protoc ; 3(2): 101327, 2022 06 17.

Article in English | MEDLINE | ID: mdl-35479110

ABSTRACT

PAN2HGENE is a computational tool that enables two main analyses. First, it can identify gene products absent from the original prokaryotic genome sequence. Second, it enables automated comparative analysis of both complete and draft genomes. All analyses are performed through a simple and intuitive graphical user interface, without the need for long and complex command lines. For complete details on the use and execution of this protocol, please refer to Silva de Oliveira (2021).

Subject(s)

Bacteria , Software , Genome , Prokaryotic Cells

PAN2HGENE-tool for comparative analysis and identifying new gene products.

Silva de Oliveira, Mônica; Thyeska Castro Alves, Jorianne; Henrique Caracciolo Gomes de Sá, Pablo; Veras, Adonney Allan de Oliveira.

PLoS One ; 16(5): e0252414, 2021.

Article in English | MEDLINE | ID: mdl-34048479

ABSTRACT

Advances in next-generation sequencing (NGS) platforms have had a positive impact on biological research, leading to the development of numerous omics approaches, including genomics, transcriptomics, metagenomics, and pangenomics. These analyses provide insights into the gene contents of various organisms. However, understanding the evolutionary processes of these genes requires comparative analysis, which is also an important tool for annotation. Using comparative analysis, it is possible to infer the functions of gene contents and identify orthologous and paralogous genes via their homology. Although several comparative analysis tools currently exist, most of them are limited to complete genomes. PAN2HGENE addresses this limitation: it is a computational tool that identifies gene products missing from the original genome sequence and performs automated comparative analysis for both complete and draft genomes. In this study, PAN2HGENE was used to identify new products, which altered the behavior of the alpha value in the pangenome without altering the original genomic sequence. Our findings indicate that this tool represents an efficient alternative for comparative analysis, with a simple and intuitive graphical interface. PAN2HGENE has been uploaded to SourceForge and is available at https://sourceforge.net/projects/pan2hgene-software.

Subject(s)

Computational Biology/methods , Software , Genomics/methods , High-Throughput Nucleotide Sequencing , Metagenomics , Transcriptome

CODON-Software to manual curation of prokaryotic genomes.

Merlin, Bruno; Castro Alves, Jorianne Thyeska; de Sá, Pablo Henrique Caracciolo Gomes; de Oliveira, Mônica Silva; Dias, Larissa Maranhão; da Silva Moia, Gislenne; Cardoso Dos Santos, Victória; Veras, Adonney Allan de Oliveira.

PLoS Comput Biol ; 17(3): e1008797, 2021 03.

Article in English | MEDLINE | ID: mdl-33788829

ABSTRACT

Genome annotation conceptually consists of inferring and assigning biological information to gene products. Over the years, numerous pipelines and computational tools have been developed to automate this task and help researchers gain knowledge about the target genes under study. However, even with these technological advances, manual annotation or manual curation remains necessary to verify and enrich the information attributed to the gene products. Despite being called the gold standard process for depositing data in a biological database, manual curation requires significant time and effort from researchers, who sometimes have to parse through numerous products in various public databases. To assist with this problem, we present CODON, a tool for manual curation of genomic data that is capable of performing the prediction and annotation process. This software uses a finite state machine in the prediction process and automatically annotates products based on information obtained from the UniProt database. CODON is equipped with a simple and intuitive graphical interface that assists in manual curation, enabling the user to make decisions about the analysis based on the identity, the length of the alignment, and the name of the organism in which the product obtained a match. Further, visual analysis of all matches found in the database is possible, which significantly aids the curation task because the user has at their disposal all the information available for a given product. An analysis performed on eleven organisms was used to test the efficiency of this tool by comparing the results of prediction and annotation through CODON with those from the NCBI and RAST platforms.
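
The abstract does not describe the states or transitions of the finite state machine that CODON uses. As an illustration of the general idea only, the following Python sketch scans one reading frame with a hypothetical two-state machine (SEARCH and IN_ORF). The function name find_orfs, the state names, the min_codons parameter, and the restriction to ATG as the only start codon are all assumptions made for this example, not details taken from CODON.

```python
START = "ATG"                  # assumption: only the canonical start codon
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons


def find_orfs(seq, frame=0, min_codons=2):
    """Scan one forward reading frame with a two-state machine.

    Returns (start, end) index pairs; end is exclusive and the span
    includes the stop codon. Hypothetical helper, not CODON's algorithm.
    """
    seq = seq.upper()
    state = "SEARCH"   # SEARCH: looking for a start codon
    orf_start = None
    orfs = []
    for pos in range(frame, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        if state == "SEARCH":
            if codon == START:
                state = "IN_ORF"   # IN_ORF: reading until a stop codon
                orf_start = pos
        elif codon in STOPS:
            # Count the codons before the stop, start codon included.
            if (pos - orf_start) // 3 >= min_codons:
                orfs.append((orf_start, pos + 3))
            state = "SEARCH"
    return orfs


if __name__ == "__main__":
    print(find_orfs("ATGAAATTTTAGCCC"))
```

A real predictor would also scan the other five reading frames and handle alternative start codons; this sketch leaves both out to keep the state transitions visible.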

Subject(s)

Bacteria/genetics , Genomics/methods , Molecular Sequence Annotation/methods , Software , Databases, Genetic , User-Computer Interface

NGSReadsTreatment - A Cuckoo Filter-based Tool for Removing Duplicate Reads in NGS Data.

Gaia, Antonio Sérgio Cruz; de Sá, Pablo Henrique Caracciolo Gomes; de Oliveira, Mônica Silva; Veras, Adonney Allan de Oliveira.

Sci Rep ; 9(1): 11681, 2019 08 12.

Article in English | MEDLINE | ID: mdl-31406180

ABSTRACT

Next-generation sequencing (NGS) platforms provide a major approach to obtaining millions of short reads from samples. NGS has been used in a wide range of analyses, such as determining genome sequences, analyzing evolutionary processes, identifying gene expression, and resolving metagenomic analyses. Usually, the quality of NGS data impacts the final study conclusions. Moreover, quality assessment is generally considered the first step in data analysis, ensuring that only reliable reads are used in further studies. In NGS platforms, the presence of duplicated reads (redundancy), which are usually introduced during library sequencing, is a major issue. These might have a serious impact on research applications, as redundancies in reads can lead to difficulties in subsequent analysis (e.g., de novo genome assembly). Herein, we present NGSReadsTreatment, a computational tool for the removal of duplicated reads in paired-end or single-end datasets. NGSReadsTreatment can handle reads from any platform with the same or different sequence lengths. Using the probabilistic data structure Cuckoo Filter, redundant reads are identified and removed by comparing the reads with one another. Thus, no prerequisite is required beyond the set of reads. NGSReadsTreatment was compared with other redundancy removal tools in analyzing different sets of reads. The results demonstrated that NGSReadsTreatment was better than the other tools in both the number of redundancies removed and the use of computational memory for all analyses performed. Available at https://sourceforge.net/projects/ngsreadstreatment/ .
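
To illustrate how a cuckoo filter can flag a read that has already been seen, the following is a minimal Python sketch. It is not the NGSReadsTreatment implementation: the class CuckooFilter, the function deduplicate, the 16-bit fingerprint, the bucket parameters, and the choice of BLAKE2b as the hash are all assumptions made for this example. Because the structure is probabilistic, a false positive can occasionally discard a read that is not a true duplicate.

```python
import hashlib
import random


class CuckooFilter:
    """Small cuckoo filter storing 16-bit fingerprints (illustrative only)."""

    def __init__(self, num_buckets=1 << 16, bucket_size=4, max_kicks=500, seed=0):
        # num_buckets must be a power of two so the XOR step is reversible.
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def _fingerprint(self, item):
        return (self._hash(b"fp" + item) & 0xFFFF) or 1  # never zero

    def _alt_index(self, index, fp):
        # Partial-key cuckoo hashing: the other bucket depends only on
        # the current bucket index and the fingerprint.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def contains(self, item):
        fp = self._fingerprint(item)
        i1 = self._hash(item) % self.num_buckets
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def insert(self, item):
        fp = self._fingerprint(item)
        i1 = self._hash(item) % self.num_buckets
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is considered full


def deduplicate(reads):
    """Keep the first occurrence of each read sequence, in input order."""
    seen = CuckooFilter()
    unique = []
    for read in reads:
        key = read.encode("ascii")
        if seen.contains(key):
            continue  # probably a duplicate (false positives are possible)
        if not seen.insert(key):
            raise RuntimeError("cuckoo filter is full; use more buckets")
        unique.append(read)
    return unique


if __name__ == "__main__":
    print(deduplicate(["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]))
```

The filter stores only short fingerprints rather than the reads themselves, which is what keeps memory use low; the sketch needs no reference genome or other input beyond the reads, in line with the abstract's statement that no prerequisite is required beyond the set of reads.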

Subject(s)

Algorithms , DNA, Bacterial/genetics , DNA, Fungal/genetics , Sequence Analysis, DNA/statistics & numerical data , Software , Arcobacter/genetics , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Internet , Mycobacterium tuberculosis/genetics

ImproveAssembly - Tool for identifying new gene products and improving genome assembly.

Veras, Adonney Allan de Oliveira; Merlin, Bruno; de Sá, Pablo Henrique Caracciolo Gomes.

PLoS One ; 13(10): e0206000, 2018.

Article in English | MEDLINE | ID: mdl-30365512

ABSTRACT

The availability of biological information in public databases has increased exponentially. To ensure the accuracy of this information, researchers have adopted several methods and refinements to avoid the dissemination of incorrect information; for example, several automated tools are available for annotation processes. However, manual curation ensures and enriches biological information. Additionally, the genomic finishing process is complex, resulting in increased deposition of draft genomes. This introduces bias in other omics analyses because incomplete genomic content is used. This is also observed for complete genomes. For example, genomes generated by reference assembly may not include new products in the new sequence, or errors or bias can occur during the assembly process. Thus, we developed ImproveAssembly, a tool capable of identifying new products missing from genomic sequences, which can be used for complete and draft genomes. The identified products can improve the annotation of complete and draft genomes while significantly reducing the bias when the information is used in other omics analyses.

Subject(s)

Genome , Sequence Analysis, DNA/methods , Software , Escherichia coli/genetics , Genetic Loci , Reproducibility of Results , Workflow

Draft genome sequence of Psychrobacter sp. ENNN9_III, a strain isolated from water in a polluted temperate estuarine system (Ria de Aveiro, Portugal).

Gomes, Jaqueline Conceição Meireles; Azevedo, Juliana Simão Nina de; Veras, Adonney Allan de Oliveira; Alves, Jorianne Thyeska Castro; Henriques, Isabel; Correia, António; Silva, Artur Luiz da Costa da; Carneiro, Adriana Ribeiro.

Genom Data ; 8: 21-4, 2016 Jun.

Article in English | MEDLINE | ID: mdl-27114904

ABSTRACT

The genus Psychrobacter includes Gram-negative coccobacilli that are non-pigmented, oxidase-positive, non-motile, psychrophilic or psychrotolerant, and halotolerant. Psychrobacter strain ENNN9_III was isolated from water in a polluted temperate estuarine system contaminated with hydrocarbons and heavy metals. The genome has a G + C content of 42.7%, 2618 open reading frames (ORFs), three copies of the rRNA operon, and 29 tRNA genes. Twenty-five sequences related to the degradation of aromatic compounds were predicted, as well as numerous genes related to resistance to metals or metal(loid)s. The genome sequence of Psychrobacter strain ENNN9_III provides the groundwork for further elucidation of the mechanisms of metal resistance and aromatic compound degradation. Future studies are needed to confirm the usefulness of this strain for bioremediation purposes.
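
A G + C content such as the one reported above is the percentage of G and C bases among the unambiguous bases of a sequence. The following Python sketch shows the arithmetic; the function name gc_content and the decision to ignore ambiguous symbols such as N are assumptions made for this example, and the abstract does not state which software or convention was used for the reported value.

```python
def gc_content(seq):
    """Return the G + C percentage of a nucleotide sequence.

    Only A, C, G and T are counted; ambiguous symbols such as N are
    ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    print(round(gc_content("ATGCGCNNAT"), 1))
```

For a draft genome split across several contigs, the same function can be applied to the concatenation of all contig sequences.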

AutoAssemblyD: a graphical user interface system for several genome assemblers.

Veras, Adonney Allan de Oliveira; de Sá, Pablo Henrique Caracciolo Gomes; Azevedo, Vasco; Silva, Artur; Ramos, Rommel Thiago Jucá.

Bioinformation ; 9(16): 840-1, 2013.

Article in English | MEDLINE | ID: mdl-24143057

ABSTRACT

Next-generation sequencing technologies have increased the amount of biological data generated. Thus, bioinformatics has become important because new methods and algorithms are necessary to manipulate and process such data. However, certain challenges have emerged, such as genome assembly using short reads and high-throughput platforms. In this context, several algorithms have been developed, such as Velvet, Abyss, Euler-SR, Mira, Edna, Maq, SHRiMP, Newbler, ALLPATHS, Bowtie and BWA. However, most such assemblers do not have a graphical interface, which makes their use difficult for users without computing experience given the complexity of the assembler syntax. Thus, to make the operation of such assemblers accessible to users without a computing background, we developed AutoAssemblyD, which is a graphical tool for genome assembly submission and remote management by multiple assemblers through XML templates. AVAILABILITY: AutoAssemblyD is freely available at https://sourceforge.net/projects/autoassemblyd. It requires Sun JDK 6 or higher.
