1.
Genomics Proteomics Bioinformatics ; 16(5): 373-381, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30583062

ABSTRACT

The rapid development of high-throughput sequencing technologies has led to a dramatic decrease in the money and time required for de novo genome sequencing or genome resequencing projects, with new genome sequences released every week. Among such projects, the plethora of updated genome assemblies creates a need for version-dependent annotation files and other compatible public datasets for downstream analysis. To handle these tasks in an efficient manner, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update. RGAAT can detect sequence variants with precision, specificity, and sensitivity comparable to GATK, and with higher precision and specificity than Freebayes and SAMtools, on the four DNA-seq datasets tested in this study. RGAAT can also identify sequence variants based on cross-cultivar or cross-version genomic alignments. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking into account the true allele frequency. Finally, RGAAT generates a coordinate conversion file between the reference and query genomes using sequence variants and supports annotation file transfer. Compared to the rapid annotation transfer tool (RATT), RGAAT displays better performance characteristics for annotation transfer between different genome assemblies, strains, and species. In addition, RGAAT can be used for genome modification, genome comparison, and coordinate conversion. RGAAT is available at https://sourceforge.net/projects/rgaat/ and https://github.com/wushyer/RGAAT_v2 at no cost.
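
As an illustration of how a coordinate conversion between a reference and a query genome can be derived from sequence variants, the following is a minimal Python sketch. It is not RGAAT's implementation: the function names build_offsets and ref_to_query and the VCF-style (position, reference allele, alternate allele) tuples are assumptions made for this example, and positions that fall inside an indel are not treated specially.

```python
from bisect import bisect_right

def build_offsets(variants):
    """Summarise indels as cumulative coordinate shifts.

    variants: iterable of (ref_pos, ref_allele, alt_allele) tuples,
    1-based and VCF-style (hypothetical layout for this sketch).
    Returns two parallel lists: the last reference base covered by each
    indel, and the cumulative shift that applies after that base.
    """
    ends, offsets = [], []
    shift = 0
    for pos, ref, alt in sorted(variants):
        delta = len(alt) - len(ref)
        if delta == 0:
            continue  # substitutions do not move downstream coordinates
        shift += delta
        ends.append(pos + len(ref) - 1)
        offsets.append(shift)
    return ends, offsets

def ref_to_query(pos, ends, offsets):
    """Map a 1-based reference position to the query coordinate."""
    # Count the indels that end strictly before the requested position.
    i = bisect_right(ends, pos - 1)
    return pos + (offsets[i - 1] if i else 0)

# Example: a 2-base insertion at position 5 and a 2-base deletion at 10.
ends, offsets = build_offsets([(5, "A", "ATT"), (10, "GCC", "G")])
print(ref_to_query(3, ends, offsets))   # upstream of both indels
print(ref_to_query(6, ends, offsets))   # after the insertion
print(ref_to_query(13, ends, offsets))  # after both indels
```

An annotation feature could then be transferred by converting its start and end positions independently, provided neither falls inside an indel.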


Subject(s)
Genome , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , Genomics , High-Throughput Nucleotide Sequencing/standards , Humans , Reference Standards , Sequence Analysis, DNA/standards
2.
J Comput Biol ; 25(5): 509-516, 2018 May.
Article in English | MEDLINE | ID: mdl-29641228

ABSTRACT

RNA editing is a post-transcriptional or cotranscriptional process that changes the sequence of the precursor transcript by substitutions, insertions, or deletions. Almost all land plants undergo RNA editing in organelles (plastids and mitochondria). Although several software tools have been developed to identify RNA editing events, it remains a great challenge to distinguish true RNA editing events from genome variation, sequencing errors, and other factors. Here we introduce REDO, a comprehensive application tool for identifying RNA editing events in plant organelles based on variant call format files from RNA-sequencing data. REDO is a suite of Perl scripts that illustrate a range of attributes of RNA editing events in figures and tables. REDO can also detect RNA editing events in multiple samples simultaneously and identify loci whose RNA editing proportions differ significantly. Compared with similar tools, such as REDItools, REDO runs faster, with higher accuracy and specificity, at the cost of slightly lower sensitivity. Moreover, REDO annotates each RNA editing site in RNAs, whereas REDItools reports only possible RNA editing sites in the genome, which requires additional steps to obtain RNA editing profiles for RNAs. Overall, REDO can identify potential RNA editing sites easily and provides several functions such as detailed annotations, statistics, figures, and detection of significantly different proportions of RNA editing sites among samples.
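
To make the idea of a differential editing proportion concrete, the sketch below compares the proportion of edited reads at one site between two samples. The abstract does not state which statistical test REDO uses, so a two-proportion z-test is used here purely as a stand-in; the function names and the read counts in the example are hypothetical.

```python
import math

def editing_proportion(edited_reads, total_reads):
    """Fraction of reads carrying the edited base at a site."""
    return edited_reads / total_reads if total_reads else 0.0

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of a two-proportion z-test.

    x1, x2: edited read counts; n1, n2: total read depths (both > 0).
    This is a stand-in test for illustration, not REDO's method.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # both samples are entirely edited or entirely unedited
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical read counts at a single candidate editing site.
print(editing_proportion(90, 100), editing_proportion(10, 100))
print(two_proportion_p_value(90, 100, 10, 100) < 0.05)
```

With low read depth an exact test would usually be preferable to this normal approximation.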


Subject(s)
Genetic Variation , Organelles/genetics , Plant Proteins/genetics , RNA Editing , RNA, Plant/genetics , Software , Arabidopsis/genetics , Cocos/genetics , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods
3.
Methods Mol Biol ; 1638: 339-351, 2017.
Article in English | MEDLINE | ID: mdl-28755233

ABSTRACT

MicroRNAs (miRNAs) are small endogenous noncoding RNAs. Plant miRNAs are known to play important regulatory roles in homeostasis, stress response, and diverse developmental processes. Here, we describe the identification of miRNAs in date palm (Phoenix dactylifera L.) based on genome sequences and transcriptomic data acquired across multiple stages of fruit development; the identified miRNAs include 238 conserved plant miRNAs and 276 novel P. dactylifera-specific miRNAs.


Subject(s)
Fruit/genetics , MicroRNAs/genetics , Phoeniceae/genetics , RNA, Plant/genetics , Base Sequence , Conserved Sequence/genetics , Gene Expression Regulation, Plant/genetics , Genes, Plant/genetics , Genome, Plant/genetics , High-Throughput Nucleotide Sequencing/methods , Transcriptome/genetics
4.
BMC Bioinformatics ; 18(1): 320, 2017 Jun 28.
Article in English | MEDLINE | ID: mdl-28659141

ABSTRACT

BACKGROUND: Precise and efficient exon recognition and splicing by the spliceosome is key to generating mature mRNAs. About one third to one half of disease-related mutations affect RNA splicing. The software PVAAS has been developed to identify variants associated with aberrant splicing by directly using RNA-seq data. However, it is based on the assumption that an annotated splicing site represents normal splicing, which is not in fact true. RESULTS: We developed ISVASE, a tool for specifically identifying sequence variants associated with splicing events (SVASE) by using RNA-seq data. Compared with PVAAS, our tool has several advantages: multi-pass stringent rule-dependent filters and statistical filters, use of split reads only, independent sequence variant identification in each part of a splicing junction, sequence variant detection for both known and novel splicing events, additional detection of exon-exon junction shift events if known splicing events are provided, splicing signal evaluation, support for known DNA mutation and/or RNA editing data, higher precision and consistency, and short running time. Using a realistic RNA-seq dataset, we performed a case study to illustrate the functionality and effectiveness of our method. Moreover, the SVASE output can be used for downstream analysis such as splicing regulatory element study and sequence variant functional analysis. CONCLUSIONS: ISVASE is useful for researchers interested in sequence variants (DNA mutation and/or RNA editing) associated with splicing events. The package is freely available at https://sourceforge.net/projects/isvase/.


Subject(s)
RNA Splicing , RNA/chemistry , User-Computer Interface , Base Sequence , Humans , Internet , RNA/genetics , RNA Editing
5.
PLoS One ; 11(10): e0163990, 2016.
Article in English | MEDLINE | ID: mdl-27736909

ABSTRACT

Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in the tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653 bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that chloroplast (cp)-derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower than that of the nuclear genome and the neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provide the second complete mt genome sequence in the family Arecaceae, essential for further investigations on the mitochondrial biology of seed plants.
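
Two of the summary statistics quoted above, GC content and the transition/transversion (Ti/Tv) ratio, can be computed as in the following minimal Python sketch. The function names and the toy inputs are assumptions for illustration and are not taken from the study. Under the simplest model in which all twelve single-base substitutions are equally likely, four are transitions and eight are transversions, giving an expected Ti/Tv of 0.5.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / called if called else 0.0

def ti_tv_ratio(snvs):
    """Transition/transversion ratio for (ref, alt) single-base pairs.

    Returns None when no transversion is observed, because the ratio
    is then undefined.
    """
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt or ref not in BASES or alt not in BASES:
            continue  # skip non-variants and ambiguous bases
        same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
        if same_class:
            ti += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ti / tv if tv else None

# Toy inputs, not data from the coconut mt genome.
print(gc_content("GGCCAATT"))
print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```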


Subject(s)
Cocos/genetics , Genome, Mitochondrial , Genome, Plant , DNA, Plant/genetics , Phylogeny , Plant Proteins/genetics , Pseudogenes , RNA, Plant/genetics , Sequence Analysis, DNA , Transcriptome
6.
Gene ; 576(1 Pt 3): 560-70, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26551299

ABSTRACT

Recently, RNA-seq has become a widely used technology for transcriptome profiling due to its single-base accuracy and high throughput. In this study, we applied a computational approach to an integrated RNA-seq dataset across 15 normal mouse tissues, and consequently identified 8408 house-keeping (HK) genes and 2581 tissue-specific (TS) genes in the UCSC RefGene annotation. Apart from some basic genomic features, we also performed expression, function, and pathway analysis with clustering, DAVID, and Ingenuity Pathway Analysis, indicating the physiological connections (tissues) and diverse biological roles of HK genes (fundamental processes) and TS genes (tissue-corresponding processes). Moreover, we used RT-PCR to test 18 candidate HK genes and finally identified a novel list of highly stable internal control genes: Ywhae, Ddb1, Eif4h, etc. In summary, this study provides a new HK gene and TS gene resource for further genetic and evolutionary research and helps us better understand morphogenesis and biological diversity in the mouse.
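
The abstract does not specify how HK and TS genes were distinguished. As one commonly used way to quantify how evenly a gene is expressed across tissues, the sketch below computes the tissue specificity index tau, which is 0 for perfectly uniform expression and 1 for expression confined to a single tissue. This is a swapped-in illustration, not the method of the study; the function name and the expression values are hypothetical.

```python
def tau_index(expression):
    """Tissue specificity index tau for one gene.

    expression: non-negative expression values, one per tissue.
    Returns None if fewer than two tissues are given or the gene is
    not expressed in any tissue.
    """
    n = len(expression)
    peak = max(expression) if expression else 0
    if n < 2 or peak <= 0:
        return None
    # Average shortfall of each tissue relative to the peak tissue.
    return sum(1 - x / peak for x in expression) / (n - 1)

# Hypothetical expression profiles across three tissues.
print(tau_index([5, 5, 5]))  # uniform, house-keeping-like profile
print(tau_index([0, 0, 9]))  # single-tissue, tissue-specific profile
```

Any cutoff on tau used to call a gene HK or TS would be a further assumption and would need to be justified against the data.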


Subject(s)
Genes, Essential , Sequence Analysis, RNA , Animals , Gene Expression Regulation , Mice