ABSTRACT
Yerba mate (Ilex paraguariensis A. St.-Hil.) is an important subtropical tree crop cultivated on 326,000 ha in Argentina, Brazil and Paraguay, with a total production of more than 1,000,000 t. Sequence information for yerba mate is severely limited. The NCBI GenBank lacks an EST database for yerba mate and holds only 80 DNA sequences, mostly uncharacterized. In this scenario, in order to elucidate the yerba mate gene landscape by means of NGS, we explored and discovered a vast collection of I. paraguariensis transcripts. Total RNA from I. paraguariensis was sequenced on an Illumina HiSeq-2000, yielding 72,031,388 paired-end 100 bp sequences. High-quality reads were de novo assembled into 44,907 transcripts encompassing 40 million bases with an estimated coverage of 180X. Multiple sequence analysis allowed us to predict that yerba mate contains ~32,355 genes and 12,551 gene variants or isoforms. We identified and categorized members of more than 100 metabolic pathways. Overall, we have identified ~1,000 putative transcription factors, genes involved in heat and oxidative stress, pathogen response, as well as disease resistance and hormone response. We have also identified, based on sequence homology searches, novel transcripts related to osmotic, drought, salinity and cold stress, senescence and early flowering. We have also pinpointed several members of the gene silencing pathway, and characterized the silencing effector Argonaute1. We predicted a diverse supply of putative microRNA precursors involved in developmental processes. We present here the first draft of the transcribed genomes of the yerba mate chloroplast and mitochondrion. The putative sequence and predicted structure of the caffeine synthase of yerba mate are presented. Moreover, we provide a collection of over 10,800 SSRs accessible to the scientific community interested in yerba mate genetic improvement.
This contribution broadly expands the limited knowledge of yerba mate genes, and is presented as the first genomic resource of this important crop.
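The 180X coverage figure quoted above can be reproduced with simple arithmetic. The sketch below is only an illustration, not the authors' method: it assumes that all 72,031,388 sequences are individual 100 bp reads, that every read contributed to the assembly, and that "40 million bases" means exactly 40,000,000 bp. The function name `estimated_coverage` is hypothetical.

```python
def estimated_coverage(n_reads: int, read_length_bp: int, assembly_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by assembly size."""
    return n_reads * read_length_bp / assembly_bp


# Figures taken from the abstract; the assembly size is rounded there (assumption).
coverage = estimated_coverage(
    n_reads=72_031_388,      # reported number of 100 bp sequences
    read_length_bp=100,      # read length in bp
    assembly_bp=40_000_000,  # "40 million bases" taken as an exact value
)
print(f"{coverage:.0f}X")
```

Under these assumptions the result rounds to the reported 180X; quality filtering of reads before assembly would lower the effective depth somewhat.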
Subjects
Gene Expression Profiling , Genes, Plant/genetics , High-Throughput Nucleotide Sequencing , Ilex paraguariensis/genetics , Chlorogenic Acid/metabolism , DNA Transposable Elements/genetics , DNA, Intergenic/genetics , Genomics , Ilex paraguariensis/enzymology , Methyltransferases/genetics , Microsatellite Repeats/genetics , Molecular Sequence Annotation , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sequence Analysis, RNA
ABSTRACT
We present the first report of a virus infecting the subtropical tree crop yerba mate (Ilex paraguariensis St. Hil.). Total RNA purification, followed by next-generation sequencing, transcript assembly and annotation, resulted in the identification of a new endornavirus species infecting yerba mate. The complete sequence of the linear dsRNA viral genome is 13,954 nt long, contains a single 13,743 nt ORF, and presents a 149 nt 5'UTR and a 61 nt 3'UTR. The predicted ORF encodes a 4,581 aa polypeptide with a UDP-glucose glycosyltransferase domain, a capsular polysaccharide synthesis protein domain, and an RNA-dependent RNA polymerase domain. The name yerba mate endornavirus is proposed for the identified virus. Given the intriguing peculiarities of this virus family and the complete lack of literature on yerba mate viruses, we expect the information reported here to help draw new and needed attention to this important topic and crop.