1.
Acta Pharmaceutica Sinica B ; (6): 374-382, 2020.
Article in English | WPRIM | ID: wpr-787622

ABSTRACT

Background: () (2n = 2x = 16) is a genus of flowering plants belonging to the Gelsemiaceae family. Method: Here, a high-quality genome assembly was produced using the Oxford Nanopore Technologies (ONT) platform and high-throughput chromosome conformation capture (Hi-C) techniques. Results: A total of 56.11 Gb of raw ONT reads were generated on the GridION X5 platform (6.23 Gb per cell). After filtering, 53.45 Gb of clean reads were obtained, giving 160× coverage depth. A genome assembly of 335.13 Mb, close to the 338 Mb estimated by k-mer analysis, was generated with a contig N50 of 10.23 Mb. The vast majority (99.2%) of the assembled sequence was anchored onto 8 pseudo-chromosomes. Genome completeness was then evaluated, and 1338 of the 1440 conserved genes (92.9%) were found in the assembly. Genome annotation revealed that 43.16% of the genome is composed of repetitive elements and 23.9% is composed of long terminal repeat elements. We predicted 26,768 protein-coding genes, of which 84.56% were functionally annotated. Conclusion: The genomic sequences of could be a valuable source for comparative genomic analysis in the Gelsemiaceae family and will be useful for understanding the phylogenetic relationships of the indole alkaloid metabolism.
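
The headline figures in this abstract can be checked with simple arithmetic. The Python sketch below recomputes the coverage depth and the completeness percentage from the reported numbers and shows the usual definition of a contig N50. It assumes 1 Gb = 1000 Mb; the function names are illustrative and are not taken from the paper.

```python
def coverage_depth(clean_bases_gb, genome_size_mb):
    """Mean coverage = sequenced bases / genome size (assumes 1 Gb = 1000 Mb)."""
    return clean_bases_gb * 1000.0 / genome_size_mb


def completeness_percent(genes_found, genes_total):
    """Share of conserved genes recovered in the assembly, in percent."""
    return 100.0 * genes_found / genes_total


def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    # Values reported in the abstract: 53.45 Gb clean reads, 335.13 Mb assembly.
    print(coverage_depth(53.45, 335.13))     # a little under 160
    print(completeness_percent(1338, 1440))  # about 92.9
```

With the reported values the coverage comes out slightly below 160, consistent with the rounded figure in the abstract.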

2.
J Genet ; 2019 Aug; 98: 1-12
Article | IMSEAR | ID: sea-215404

ABSTRACT

Camelus dromedarius has played a pivotal role in both the culture and the way of life of the Arabian peninsula, particularly in arid regions where other domestic animals cannot easily be kept. Although the mitochondrial genomes of several camelid species have recently been sequenced, wider phylogenetic studies are yet to be performed. Conserved gene elements, a rapid evolutionary rate, and rare recombination make the mitochondrial genome a useful molecular marker for phylogenetic studies of closely related species. Here we carried out a comparative analysis of previously sequenced mitochondrial genomes of camelids, with an emphasis on C. dromedarius, and report several notable findings. First, the arrangement of mitochondrial genes in C. dromedarius is similar to that of the other camelids. Second, multiple sequence alignment of intergenic regions shows up to 90% similarity across different kinds of camels, reaching 99% among dromedary camels. Third, we successfully identified the three domains (termination-associated sequence, conserved domain and conserved sequence block) of the control region structure. The phylogenetic tree analysis showed that C. dromedarius mitogenomes were significantly clustered in the same clade with the Lama pacos mitogenome. These findings will enhance our understanding of the nucleotide composition and molecular evolution of the mitogenomes of the genus Camelus, and provide more data for comparative mitogenomics in the family Camelidae.
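
The abstract does not state which similarity measure was used on the aligned intergenic regions. As a minimal sketch, assuming a plain pairwise percent identity over aligned columns, the calculation could look like the Python below; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns with identical residues.

    Columns where both sequences have a gap are skipped; a gap facing a
    residue counts as a mismatch. This is one common convention, not
    necessarily the one used in the study.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: five of six columns match.
    print(percent_identity("ACGT-A", "ACGTTA"))
```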

3.
Genomics, Proteomics & Bioinformatics ; (4): 305-310, 2019.
Article in English | WPRIM | ID: wpr-772935

ABSTRACT

Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames, start sites, splice sites, and related structural features. The source of these inconsistencies is often traced back to integration across text file formats designed to describe long read alignments and predicted gene structures. In addition, the majority of gene prediction frameworks do not provide robust downstream filtering to remove problematic gene annotations, nor do they represent these annotations in a format consistent with current file standards. These frameworks also lack consideration for functional attributes, such as the presence or absence of protein domains that can be used for gene model validation. To provide oversight to the increasing number of published genome annotations, we present a software package, the Gene Filtering, Analysis, and Conversion (gFACs), to filter, analyze, and convert predicted gene models and alignments. The software operates across a wide range of alignment, analysis, and gene prediction files with a flexible framework for defining gene models with reliable structural and functional attributes. gFACs supports common downstream applications, including genome browsers, and generates extensive details on the filtering process, including distributions that can be visualized to further assess the proposed gene space. gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.

4.
Genomics, Proteomics & Bioinformatics ; (4): 373-381, 2018.
Article in English | WPRIM | ID: wpr-772966

ABSTRACT

The rapid development of high-throughput sequencing technologies has led to a dramatic decrease in the cost and time required for de novo genome sequencing or genome resequencing projects, with new genome sequences released every week. Among such projects, the plethora of updated genome assemblies creates a need for version-dependent annotation files and other compatible public datasets for downstream analysis. To handle these tasks in an efficient manner, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update. RGAAT can detect sequence variants with comparable precision, specificity, and sensitivity to GATK and with higher precision and specificity than Freebayes and SAMtools on four DNA-seq datasets tested in this study. RGAAT can also identify sequence variants based on cross-cultivar or cross-version genomic alignments. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking into account the true allele frequency. Finally, RGAAT generates a coordinate conversion file between the reference and query genomes using sequence variants and supports annotation file transfer. Compared to the rapid annotation transfer tool (RATT), RGAAT displays better performance characteristics for annotation transfer between different genome assemblies, strains, and species. In addition, RGAAT can be used for genome modification, genome comparison, and coordinate conversion. RGAAT is available at https://sourceforge.net/projects/rgaat/ and https://github.com/wushyer/RGAAT_v2 at no cost.
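
To illustrate the idea of coordinate conversion from sequence variants, the Python sketch below shifts a reference position by the net length change of the insertions and deletions that precede it. This is a generic illustration under simplifying assumptions (1-based positions, non-overlapping variants); it is not RGAAT's algorithm or file format, and `build_converter` is a hypothetical name.

```python
def build_converter(variants):
    """Return a function mapping reference positions to query positions.

    variants: iterable of (ref_pos, ref_allele, alt_allele) tuples with
    1-based positions and non-overlapping reference spans (an assumption).
    """
    ordered = sorted(variants)

    def convert(ref_pos):
        offset = 0
        for pos, ref, alt in ordered:
            end = pos + len(ref) - 1
            if ref_pos > end:
                # Variant lies entirely upstream: apply its length change.
                offset += len(alt) - len(ref)
            elif ref_pos >= pos:
                # Inside a variant: substitutions map directly, while
                # positions inside indels are reported as unmappable.
                return ref_pos + offset if len(ref) == len(alt) else None
            else:
                break
        return ref_pos + offset

    return convert


if __name__ == "__main__":
    convert = build_converter([(5, "A", "ATT"), (20, "ACGT", "A")])
    print(convert(10), convert(30), convert(22))
```

Returning `None` inside an indel is a conservative choice; a real tool would need an explicit policy for such positions.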


Subject(s)
Humans , Genome , Genomics , High-Throughput Nucleotide Sequencing , Methods , Reference Standards , Reference Standards , Sequence Analysis, DNA , Methods , Reference Standards , Software
5.
Malaysian Journal of Microbiology ; : 273-279, 2017.
Article in English | WPRIM | ID: wpr-629107

ABSTRACT

Aims: Our previous study demonstrated that Klebsiella IIEMP-3 associated with tempeh was genetically different from medical isolates. In addition to the whole genome sequence of Klebsiella IIEMP-3, the draft genome sequence of another isolate, IWJB-6, was employed for comparison. In this study, the virulence genes and unique genes of both Klebsiella isolates were compared using in silico and in vitro analysis. Methodology and results: The whole genomes of Klebsiella IIEMP-3 and IWJB-6 were annotated to investigate virulence factors. Klebsiella IIEMP-3 and IWJB-6 were obtained from tempeh producers in Bogor, West Java, Indonesia. Genome sequences were analyzed with the BLAST Ring Image Generator (BRIG) software. The results showed that none of the samples, including isolates IIEMP-3 and IWJB-6, harbored rmpA, a DNA sequence for a K. pneumoniae virulence factor. Conclusion, significance and impact of study: Klebsiella could be found in almost all tempeh samples from Indonesia and could be harmless to humans due to the absence of rmpA and other virulence-associated genes. This study showed that the IIEMP-3 and IWJB-6 isolates were more closely related to K. variicola. However, K. variicola At22 harbored the sdsA gene, which is lacking in the two tempeh isolates. Combined with PCR analysis for specific genes, our study suggested that isolates from Indonesian tempeh were closely related to K. variicola, and they are proposed to be designated as K. variicola subsp. tempehensis.

6.
Chinese Journal of Medical Library and Information Science ; (12): 15-19, 2017.
Article in Chinese | WPRIM | ID: wpr-511112

ABSTRACT

Co-citations of highly cited papers on gene annotation indexed in Web of Science were analyzed by cluster analysis, using clustering software, after a word matrix of the source literature and the highly cited papers had been constructed. The analysis showed that the application of text mining to gene annotation includes the use of authorized tools, the development of text mining tools and algorithms, and the verification of text mining tools.

7.
Chinese Journal of Biotechnology ; (12): 1791-1801, 2017.
Article in Chinese | WPRIM | ID: wpr-243671

ABSTRACT

High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex biological systems and reveals the modularity of tissue and cellular networks. Correlation networks are increasingly used in bioinformatics applications. The weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. We therefore systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced the principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation research. Then, we described the application of WGCNA to newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedical research.
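
As a minimal sketch of the core idea behind a weighted co-expression network, the Python below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, from a toy expression table. The power beta is a user-chosen parameter (the value 6 here is only illustrative), the gene names and values are made up, and the sketch omits later WGCNA steps such as topological overlap and module detection.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned adjacency |r|**beta for every pair of distinct genes.

    expression: dict mapping gene name -> list of expression values,
    one value per sample, all lists of the same length.
    """
    genes = sorted(expression)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expression[g], expression[h])
            adjacency[(g, h)] = abs(r) ** beta
    return adjacency


if __name__ == "__main__":
    toy = {
        "g1": [1, 2, 3, 4],
        "g2": [2, 4, 6, 8],    # perfectly correlated with g1
        "g3": [1, -1, 1, -1],  # weakly correlated with g1
    }
    for pair, weight in soft_threshold_adjacency(toy).items():
        print(pair, weight)
```

Raising the correlation to a power keeps strong correlations near 1 while pushing weak ones toward 0, which is what lets clusters of highly correlated genes stand out.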

8.
Annals of Dentistry ; : 17-26, 2014.
Article in English | WPRIM | ID: wpr-732013

ABSTRACT

The gram-positive, mesophilic and non-motile coccus Streptococcus gordonii is an important causative agent of infective endocarditis (IE). This pioneer species of dental plaque also causes bacteraemia in immune-suppressed patients. In this study, we analysed the genome of a representative strain, Streptococcus gordonii SK12, that was originally isolated from the oral cavity. To gain a better understanding of the biology, virulence and phylogeny of this potentially pathogenic organism, high-throughput Illumina HiSeq technology and different bioinformatics approaches were used. Genome assembly of SK12 was performed using CLC Genomic Workbench 5.1.5, while RAST annotation revealed the key genomic features. The assembled draft genome of Streptococcus gordonii SK12 consists of 27 contigs, with a genome size of 2,145,851 bp and a G+C content of 40.63%. Phylogenetic inferences have confirmed that SK12 is closely related to the widely studied strain Streptococcus gordonii Challis. Interestingly, we predicted 118 potential virulence genes in the SK12 genome which may contribute to bacterial pathogenicity in infective endocarditis. We also discovered an intact prophage which might have been recently integrated into the SK12 genome. Examination of genes present in genomic islands revealed that this oral strain might have the potential to acquire new phenotypes/traits, including a strong defence system, bacitracin resistance and collateral detergent sensitivity. This detailed analysis of S. gordonii SK12 further improves our understanding of the genetic make-up of S. gordonii as a whole and may help to elucidate how this species is able to transition between living as an oral commensal and potentially causing the life-threatening condition infective endocarditis.
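
Summary figures such as contig count, genome size and G+C content follow directly from the assembled sequences. The Python sketch below shows one way to compute them from a list of contig strings; the function name and toy contigs are hypothetical, and G+C is taken over unambiguous A/C/G/T bases only, which is an assumption about the convention rather than the method used in the study.

```python
def assembly_summary(contigs):
    """Return (contig count, total length in bp, G+C percent).

    G+C percent is computed over A/C/G/T only, so ambiguous bases such
    as N count toward genome size but not toward the G+C ratio.
    """
    total_length = 0
    gc = 0
    acgt = 0
    for contig in contigs:
        sequence = contig.upper()
        total_length += len(sequence)
        gc += sequence.count("G") + sequence.count("C")
        acgt += sum(sequence.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no unambiguous bases found")
    return len(contigs), total_length, 100.0 * gc / acgt


if __name__ == "__main__":
    # Toy contigs, not real S. gordonii sequence.
    print(assembly_summary(["GGCC", "ATAT", "GCAT"]))
```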

9.
Genet. mol. biol ; 35(1): 149-152, 2012. graf, tab
Article in English | LILACS | ID: lil-617006

ABSTRACT

The Xylella fastidiosa comparative genomic database is a scientific resource that aims to provide a user-friendly interface for accessing high-quality, manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high-quality genomic annotation, the functional and comparative genomic analysis, and the identification and mapping of prophage-like elements. It is available at http://www.xylella.lncc.br.


Subject(s)
Genome , Genomics , Interspersed Repetitive Sequences , Xylella
10.
Genomics & Informatics ; : 182-187, 2006.
Article in English | WPRIM | ID: wpr-91149

ABSTRACT

6-Phosphogluconolactonase (6PGL) is one of the key enzymes in the ubiquitous pathways of central carbon metabolism, but bacterial 6PGL had long been known as a missing enzyme even after complete bacterial genome sequence information became available. Although recent experimental characterization suggests that there are two types of 6PGLs (DevB and YbhE), their phylogenetic distribution is severely biased. Here we show that proteins in the COG group previously described as 3-carboxymuconate cyclase (COG2706) are actually the YbhE-type 6PGLs, which are widely distributed in Proteobacteria and Firmicutes. This case exemplifies how an erroneous functional description of a member of a reference database commonly used in transitive genome annotation can cause systematic problems in the prediction of genes, even those with universal cellular functions.


Subject(s)
Bias , Carbon , Computer Simulation , Genome , Genome, Bacterial , Metabolism , Pentose Phosphate Pathway , Proteobacteria