ABSTRACT
Trichoderma is recognized as a prolific producer of nonribosomal peptides (NRPs) known as peptaibols, which have remarkable biological properties, such as antimicrobial and anticancer activities, as well as the ability to promote systemic resistance in plants against pathogens. In this study, the sequencing of 11-, 14- and 15-res peptaibols produced by a marine strain of Trichoderma isolated from the ascidian Botrylloides giganteus was performed via liquid chromatography coupled to high-resolution tandem mass spectrometry (LC-MS/MS). Identification, based on multilocus phylogeny, revealed that our isolate belongs to the species T. endophyticum, which has never been reported in marine environments. Through genome sequencing and genome mining, 53 biosynthetic gene clusters (BGCs) were identified as being related to bioactive natural products, including two NRP-synthetases: one responsible for the biosynthesis of 11- and 14-res peptaibols, and another for the biosynthesis of 15-res. Substrate prediction, based on phylogeny of the adenylation domains in combination with molecular networking, permitted extensive annotation of the mass spectra related to two new series of 15-res peptaibols, which are referred to herein as "endophytins". The analyses of synteny revealed that the origin of the 15-module peptaibol synthetase is related to 18, 19 and 20-module peptaibol synthetases, and suggests that the loss of modules may be a mechanism used by Trichoderma species for peptaibol diversification. This study demonstrates the importance of combining genome mining techniques, mass spectrometry analysis and molecular networks for the discovery of new natural products.
ABSTRACT
The genes coding for cytochrome P450 aromatase (cyp19a1a and cyp19a1b) and estrogen (E2) receptors (esr1, esr2a and esr2b) play a conserved role in ovarian differentiation and development among teleosts. Classically, the "gonad form" of aromatase, coded by cyp19a1a, is responsible for ovarian differentiation in genetic females via ligation and activation of the Esr, which mediates the endocrine and exocrine signaling to allow or block the establishment of the feminine phenotype. However, in neotropical species, studies on the molecular and endocrine processes involved in gonad differentiation as well as on the effects of sex modulators are recent and scarce. In this study, we combined in silico analysis, real-time quantitative PCR (qPCR) assay and quantification of E2 plasma levels of differentiating tambaqui (Colossoma macropomum) to unveil the roles of the paralogs cyp19a1a and cyp19a1b during sex differentiation. Although the synteny of each gene is highly conserved among characids, the genomic environment displays striking differences in comparison to model teleost species, with many rearrangements in cyp19a1a and cyp19a1b adjacencies and transposable element traces in both regulatory regions. The high dissimilarity (DI) of SF-1 binding motifs in the cyp19a1a (DI = 10.06 to 14.90 %) and cyp19a1b (DI = 8.41 to 13.50 %) regulatory regions may reflect an alternative pathway in tambaqui. Indeed, while low transcription of cyp19a1a was detected prior to sex differentiation, the expression of cyp19a1b and esr2a presented a large variation at this phase, which could be associated with sex-specific differential expression. Histological analysis revealed that anti-estradiol treatments did not affect gonadal sex ratios, although Fadrozole (50 mg kg⁻¹ of food) reduced E2 plasma levels (p < 0.005) as well as cyp19a1a transcription; and tamoxifen (200 mg kg⁻¹ of food) downregulated both cyp19a1a and cyp19a1b but did not influence E2 levels.
Altogether, our results provide new insights into the evolutionary fate of cyp19a1 paralogs in neotropical fish, which may have generated uncommon roles for the gonadal and brain forms of cyp19a1 genes and the unexpected lack of effect of endocrine disruptors on tambaqui sexual differentiation.
Subject(s)
Aromatase , Characiformes , Animals , Aromatase/genetics , Aromatase/metabolism , Characiformes/genetics , Female , Gonads/metabolism , Male , Phylogeny , Sex Differentiation/genetics
ABSTRACT
In comparative genomics, the study of synteny can be a powerful method for exploring genome rearrangements, inferring genomic ancestry, defining orthology relationships, determining gene and genome duplications, and inferring gene positional conservation patterns across taxa. In this chapter, we present a step-by-step protocol for microsynteny network (SynNet) analysis, as an alternative to traditional methods of synteny comparison, where nodes in the network represent protein-coding genes and edges represent the pairwise syntenic relationships. The SynNet pipeline consists of six main steps: (1) pairwise genome comparisons between all the genomes being analyzed, (2) detection of inter- and intrasynteny blocks, (3) generation of an entire synteny database (i.e., edgelist), (4) network clustering, (5) phylogenomic profiling of the gene family of interest, and (6) evolutionary inference. The SynNet approach facilitates the rapid analysis and visualization of synteny relationships (from specific genes, specific gene families up to all genes) across a large number of genomes.
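The network representation described in this abstract (genes as nodes, pairwise syntenic relationships as edges) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the SynNet pipeline itself: the edge list is invented toy data, gene identifiers are assumed to follow a hypothetical `genome|gene` format, and connected components stand in for whatever clustering method the protocol actually uses.

```python
from collections import defaultdict


def synteny_clusters(edgelist):
    """Group genes into clusters, here taken as connected components
    of the synteny network (a simplification of network clustering)."""
    parent = {}

    def find(gene):
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    # Each edge merges the components of its two endpoint genes.
    for gene_a, gene_b in edgelist:
        root_a, root_b = find(gene_a), find(gene_b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for gene in list(parent):
        clusters[find(gene)].add(gene)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


def phylogenomic_profile(cluster):
    """Count genes per genome, assuming hypothetical 'genome|gene' identifiers."""
    profile = defaultdict(int)
    for gene in cluster:
        profile[gene.split("|")[0]] += 1
    return dict(profile)


# Toy edge list (invented): each pair is one syntenic gene-to-gene relationship.
EDGES = [
    ("sp1|g1", "sp2|g7"),
    ("sp2|g7", "sp3|g4"),
    ("sp1|g9", "sp1|g12"),  # an intragenomic (within-species) syntenic pair
]

if __name__ == "__main__":
    for cluster in synteny_clusters(EDGES):
        print(cluster, phylogenomic_profile(cluster))
```

The per-genome counts returned by `phylogenomic_profile` correspond loosely to the phylogenomic profiling step: a cluster present in every genome suggests positional conservation, while a cluster restricted to one lineage suggests a lineage-specific genomic context.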
Subject(s)
Genome , Genomics , Evolution, Molecular , Genomics/methods , Phylogeny , Plants/genetics , Synteny
ABSTRACT
The identification of thalidomide-Cereblon-induced SALL4 degradation has brought new understanding of the differences in thalidomide embryopathy (TE) across species. Some questions regarding species variability, however, still remain. The aim of this study was to detect sequence divergences between species, affected or not by TE, and to evaluate the regulated gene co-expression in a murine model. Here, we performed a comparative analysis of proteins experimentally established as affected by thalidomide exposure, evaluating 14 species. The comparative analysis, regarding synteny, neighborhood, and protein conservation, was performed in 42 selected genes. Differential co-expression analysis was performed using a publicly available assay, GSE61306, which evaluated mouse embryonic stem cells (mESC) exposed to thalidomide. The comparative analyses revealed 20 genes in the upstream neighborhood of NOS3 that differ between the species that do and do not develop the classic TE phenotype. Considering protein sequence alignments, RECQL4, SALL4, CDH5, KDR, and NOS2 proteins had the largest number of variants reported in unaffected species. In co-expression analysis, Crbn was identified as a driver of the co-expression of other genes implicated in genetic, non-teratogenic, limb reduction defects (LRD), such as Tbx5, Esco2, Recql4, and Sall4; Crbn and Sall4 were shown to have a moderate co-expression correlation, which is affected after thalidomide exposure. Hence, even though the classic TE phenotype is not identified in mice, a deregulatory Crbn-induced mechanism is suggested in this animal. Functional studies are necessary, especially evaluating the genes responsible for LRD syndromes and their interaction with thalidomide-Cereblon.
ABSTRACT
Ctenoluciidae (Characiformes), a family of freshwater fishes, comprises 2 genera, Ctenolucius and Boulengerella, with 7 recognized species. Up to now, only species of the genus Boulengerella have been subjected to cytogenetic studies. Here, we investigated the karyotype and other cytogenetic features of pike characin, Ctenolucius hujeta, using conventional (Giemsa staining, C-banding, Ag-NOR staining) and molecular (rDNA, telomeric sequences, and fiber-FISH mapping) procedures. This species has a diploid chromosome number of 2n = 36, and a karyotype composed of 12m + 20sm + 4a and FN = 68, similar to that found in Boulengerella species. However, differences regarding the number and distribution of several chromosomal markers support a distinct generic status. Colocalization of the 18S and 5S rDNA genes is an exclusive characteristic of the C. hujeta genome, with an interspersed distribution in the chromosomal fiber, an unusual phenomenon among eukaryotes. Additionally, our results support the view that Ctenoluciidae and Lebiasinidae families are closely related.
Subject(s)
Characiformes/genetics , Chromosomes/genetics , Cytogenetic Analysis/methods , Karyotyping/methods , Animals , Characiformes/classification , Chromosome Banding , Diploidy , Evolution, Molecular , Female , Genome/genetics , In Situ Hybridization, Fluorescence/methods , Karyotype , Male , RNA, Ribosomal, 18S/genetics , RNA, Ribosomal, 5S/genetics , Telomere/genetics
ABSTRACT
The genome assembly of Anopheles darlingi consists of 2221 scaffolds (N50 = 115,072 bp) and has a size spanning 136.94 Mbp. This assembly represents one of the smallest genomes among Anopheles species. Anopheles darlingi genomic DNA fragments of ~37 Kb were cloned, end-sequenced, and used as probes for fluorescence in situ hybridization (FISH) with salivary gland polytene chromosomes. In total, we mapped nine DNA probes to scaffolds and autosomal arms. Comparative analysis of the An. darlingi scaffolds with homologous sequences of the Anopheles albimanus and Anopheles gambiae genomes identified chromosomal rearrangements among these species. Our results confirmed that physical mapping is a useful tool for anchoring genome assemblies to mosquito chromosomes.
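For readers unfamiliar with the N50 statistic quoted in this abstract, the following Python sketch shows how it is conventionally computed: N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name and the toy lengths are illustrative assumptions; the An. darlingi scaffold lengths themselves are not given in the source.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths given")


if __name__ == "__main__":
    # Toy lengths (invented): total 20, so 6 + 5 = 11 covers half the assembly.
    print(n50([2, 3, 4, 5, 6]))
```

A larger N50 for the same total assembly size indicates a more contiguous assembly, which is why the value is reported alongside the scaffold count.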
ABSTRACT
Comparative cytogenetic mapping is a powerful approach to gain insights into genome organization of orphan crops, lacking a whole sequenced genome. To investigate the cytogenomic evolution of important Vigna and Phaseolus beans, we built a BAC-FISH (fluorescent in situ hybridization of bacterial artificial chromosome) map of Vigna aconitifolia (Vac, subgenus Ceratotropis), species with no sequenced genome, and compared with V. unguiculata (Vu, subgenus Vigna) and Phaseolus vulgaris (Pv) maps. Seventeen Pv BACs, eight Vu BACs, and 5S and 35S rDNA probes were hybridized in situ on the 11 Vac chromosome pairs. Five Vac chromosomes (Vac6, Vac7, Vac9, Vac10, and Vac11) showed conserved macrosynteny and collinearity between V. unguiculata and P. vulgaris. On the other hand, we observed collinearity breaks, identified by pericentric inversions involving Vac2 (Vu2), Vac4 (Vu4), and Vac3 (Pv3). We also detected macrosynteny breaks of translocation type involving chromosomes 1 and 8 of V. aconitifolia and P. vulgaris; 2 and 3 of V. aconitifolia and P. vulgaris; and 1 and 5 of V. aconitifolia and V. unguiculata. Considering our data and previous BAC-FISH studies, six chromosomes (1, 2, 3, 4, 5, and 8) are involved in major karyotype divergences between genera and five (1, 2, 3, 4, and 5) between Vigna subgenera, including mechanisms such as duplications, inversions, and translocations. Macrosynteny breaks between Vigna and Phaseolus suggest that the major chromosomal rearrangements have occurred within the Vigna clade. Our cytogenomic comparisons bring new light on the degree of shared macrosynteny and mechanisms of karyotype diversification during Vigna and Phaseolus evolution.
Subject(s)
Cytogenetics , Genomics , Phaseolus/genetics , Vigna/genetics , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Chromosomes, Plant , Cytogenetics/methods , Genome, Plant , Genomics/methods , In Situ Hybridization, Fluorescence , Karyotype , Karyotyping
ABSTRACT
BACKGROUND: Most of our understanding of the social behavior and genomics of bees and other social insects is centered on the Western honey bee, Apis mellifera. The genus Apis, however, is a highly derived branch comprising less than a dozen species, four of which have been genomically characterized. In contrast, for the equally highly eusocial, yet taxonomically and biologically more diverse Meliponini, a full genome sequence was so far available for a single Melipona species only. We present here the genome sequence of Frieseomelitta varia, a stingless bee that has, as a peculiarity, a completely sterile worker caste. RESULTS: The assembly of 243,974,526 high quality Illumina reads resulted in a predicted assembled genome size of 275 Mb composed of 2173 scaffolds. A BUSCO analysis for the 10,526 predicted genes showed that these represent 96.6% of the expected hymenopteran orthologs. We also predicted 169,371 repetitive genomic components, 2083 putative transposable elements, and 1946 genes for non-coding RNAs, largely long non-coding RNAs. The mitochondrial genome comprises 15,144 bp, encoding 13 proteins, 22 tRNAs and 2 rRNAs. We observed considerable rearrangement in the mitochondrial gene order compared to other bees. For an in-depth analysis of genes related to social biology, we manually checked the annotations for 533 automatically predicted gene models, including 127 genes related to reproductive processes, 104 to development, and 174 immunity-related genes. We also performed specific searches for genes containing transcription factor domains and genes related to neurogenesis and chemosensory communication. CONCLUSIONS: The total genome size for F. varia is similar to the sequenced genomes of other bees. Using specific prediction methods, we identified a large number of repetitive genome components and long non-coding RNAs, which could provide the molecular basis for gene regulatory plasticity, including worker reproduction.
The remarkable reshuffling in gene order in the mitochondrial genome suggests that stingless bees may be a hotspot for mtDNA evolution. Hence, while being just the second stingless bee genome sequenced, we expect that subsequent targeting of a selected set of species from this diverse clade of highly eusocial bees will reveal relevant evolutionary signals and trends related to eusociality in these important pollinators.
Subject(s)
Bees/physiology , Cell Nucleus/genetics , Computational Biology/methods , Mitochondria/genetics , Animals , Bees/classification , Bees/genetics , Behavior, Animal , Gene Order , Genome Size , Genome, Mitochondrial , High-Throughput Nucleotide Sequencing , Interspersed Repetitive Sequences , RNA, Long Noncoding/genetics , Social Behavior , Whole Genome Sequencing
ABSTRACT
Schistosoma japonicum is a flatworm that causes schistosomiasis, a neglected tropical disease. S. japonicum RNA-Seq analyses have been previously reported in the literature on females and males obtained during sexual maturation from 14 to 28 days post-infection in mouse, resulting in the identification of protein-coding genes and pathways whose expression levels were related to sexual development. However, this work did not include an analysis of long non-coding RNAs (lncRNAs). Here, we applied a pipeline to identify and annotate lncRNAs in 66 publicly available S. japonicum RNA-Seq libraries from different life-cycle stages. We also performed co-expression analyses to find stage-specific lncRNAs possibly related to sexual maturation. We identified 12,291 S. japonicum expressed lncRNAs. Sequence similarity search and synteny conservation indicated that some 14% of S. japonicum intergenic lncRNAs have synteny conservation with S. mansoni intergenic lncRNAs. Co-expression analyses showed that lncRNAs and protein-coding genes in S. japonicum males and females have a dynamic co-expression throughout sexual maturation, showing differential expression between the sexes; the protein-coding genes were related to nervous system development, lipid and drug metabolism, and overall parasite survival. The co-expression pattern suggests that lncRNAs possibly regulate these processes or are regulated by the same activation program as that of protein-coding genes.
ABSTRACT
Eragrostis curvula (Schrad.) Nees (weeping lovegrass) is an apomictic species native to Southern Africa that is used as forage grass in semiarid regions of Argentina. Apomixis is a mechanism for clonal propagation through seeds that involves the avoidance of meiosis to generate an unreduced embryo sac (apomeiosis), parthenogenesis, and viable endosperm formation in a fertilization-dependent or -independent manner. Here, we constructed the first saturated linkage map of tetraploid E. curvula using both traditional (AFLP and SSR) and high-throughput molecular markers (GBS-SNP) and identified the locus controlling diplospory. We also identified putative regulatory regions affecting the expressivity of this trait and syntenic relationships with genomes of other grass species. We obtained a tetraploid mapping population from a cross between a fully sexual genotype (OTA-S) and a facultative apomictic individual of cv. Don Walter. Phenotypic characterization of F1 hybrids by cytoembryological analysis yielded a 1:1 ratio of apomictic vs. sexual plants (34:27, χ² = 0.37), which agrees with the model of inheritance of a single dominant genetic factor. The final number of markers was 1,114 for OTA-S and 2,019 for Don Walter. These markers were distributed into 40 linkage groups per parental genotype, which is consistent with the number of E. curvula chromosomes (containing 2 to 123 markers per linkage group). The total length of the OTA-S map was 1,335 cM, with an average marker density of 1.22 cM per marker. The Don Walter map was 1,976.2 cM, with an average marker density of 0.98 cM/marker. The locus responsible for diplospory was mapped on Don Walter linkage group 3, with 65 other markers. QTL analyses of the expressivity of diplospory in the F1 hybrids revealed the presence of two main QTLs, located 3.27 and 15 cM from the diplospory locus. Both QTLs explained 28.6% of phenotypic variation.
Syntenic analysis allowed us to establish the groups of homologs/homeologs for each linkage map. The genetic linkage map reported in this study, the first such map for E. curvula, is the most saturated map for the genus Eragrostis and one of the most saturated maps for a polyploid forage grass species.
ABSTRACT
The WFDC1 gene is frequently down-regulated or lost in prostate cancer, and the encoded protein, ps20, has been implicated in epithelial cell behaviour and angiogenesis. However, ps20 remains largely uncharacterised with respect to its structure and interacting partners. This study characterised the evolution, functionality and structural characteristics of WFDC1/ps20 using phylogenetic reconstruction and other computational approaches. Bayesian phylogenetic analyses suggested that ps20 appeared in a common ancestor of deuterostomes-protostomes. The rate of evolutionary change within the coding regions of vertebrate WFDC1 genes and the synteny conservation in mammals differed from that of other vertebrate clades, indicating a possible functional diversity of ps20 homologues. A gene set enrichment analysis of the genes around WFDC1 (conserved synteny) showed functional relationships between the WFDC1, CDH13, CRISPLD2, IRF8 and TFPI2 genes. The molecular evolution of ps20 has been driven by purifying selection, particularly in the segments corresponding to exons 3 and 4, which encode the most conserved regions of the protein. A co-evolution analysis showed that residues within these regions co-vary with each other during the evolution of ps20. These results show that the regions corresponding to exons 3 and 4 are ps20-specific structure-function modules. Homology modelling of the exon 2-encoded polypeptide and subsequent dynamics calculus using a Gaussian network model showed that residues with high conformational flexibility are part of a loop region involved in protein-protein recognition, given the similarity with other serine protease inhibitors. Residues C96, R94, L105, and C66 are critical for the integrity and functionality of this ps20 region.
Subject(s)
Evolution, Molecular , Models, Molecular , Phylogeny , Proteins , Humans , Protein Domains , Proteins/chemistry , Proteins/genetics , Structural Homology, Protein
ABSTRACT
Gracilariaceae has a worldwide distribution including numerous economically important species. We applied high-throughput sequencing to obtain organellar genomes (mitochondria and chloroplast) from 10 species of Gracilariaceae and, combined with published genomes, to infer phylogenies and compare genome architecture among species representing main lineages. We obtained similar topologies between chloroplast and mitochondrial genomes phylogenies. However, the chloroplast phylogeny was better resolved with full support. In this phylogeny, Melanthalia intermedia is sister to a monophyletic clade including Gracilaria and Gracilariopsis, which were both resolved as monophyletic genera. Mitochondrial and chloroplast genomes were highly conserved in gene synteny, and variation mainly occurred in regions where insertions of plasmid-derived sequences (PDS) were found. In mitochondrial genomes, PDS insertions were observed in two regions where the transcription direction changes: between the genes cob and trnL, and trnA and trnN. In chloroplast genomes, PDS insertions were in different positions, but generally found between psdD and rrs genes. Gracilariaceae is a good model system to study the impact of PDS in genome evolution due to the frequent presence of these insertions in organellar genomes. Furthermore, the bacterial leuC/leuD operon was found in chloroplast genomes of Gracilaria tenuistipitata, G. chilensis, and M. intermedia, and in extrachromosomal plasmid of G. vermiculophylla. Phylogenetic trees show two different origins of leuC/leuD: genes found in chloroplast and plasmid were placed with proteobacteria, and genes encoded in the nucleus were close to Viridiplantae and cyanobacteria.
Subject(s)
Evolution, Molecular , Genome, Chloroplast/genetics , Genome, Mitochondrial/genetics , Rhodophyta/genetics , Phylogeny , Sequence Analysis, DNA
ABSTRACT
The Transmembrane BAX Inhibitor Motif containing (TMBIM) superfamily, divided into BAX Inhibitor (BI) and Lifeguard (LFG) families, comprises a group of cytoprotective cell death regulators conserved in prokaryotes and eukaryotes. However, no research has focused on the evolution of this superfamily in plants. We identified 685 TMBIM proteins in 171 organisms from Archaea, Bacteria, and Eukarya, and provided a phylogenetic overview of the whole TMBIM superfamily. Then, we used orthology and synteny network analyses to further investigate the evolution and expansion of the BI and LFG families in 48 plants from diverse taxa. The plant BI family forms a single monophyletic group; however, monocot BI sequences transposed to another genomic context during evolution. The plant LFG family, which expanded through whole genome and tandem duplications, is subdivided into the LFG I, LFG IIA, and LFG IIB major phylogenetic groups, and retains synteny in angiosperms. Moreover, two orthologous groups (OGs) are shared between bryophytes and seed plants. Several other lineage-specific OGs are present in plants. This work clarifies the phylogenetic classification of the TMBIM superfamily across the three domains of life. Furthermore, it sheds new light on the evolution of the BI and LFG families in plants, providing a benchmark for future research.
Subject(s)
Evolution, Molecular , Genomics , Multigene Family , Phylogeny , Plant Proteins/genetics , Plants/genetics , Synteny/genetics , Amino Acid Motifs , Amino Acid Sequence , Archaea/metabolism , Bacteria/metabolism , Bryophyta/metabolism , Calcium Channels/metabolism , Conserved Sequence/genetics , Eukaryota/metabolism , Hydrogen-Ion Concentration , Plant Proteins/chemistry
ABSTRACT
Faecalibacterium prausnitzii is a commensal bacterium, ubiquitous in the gastrointestinal tracts of animals and humans. This species is a functionally important member of the microbiota and studies suggest it has an impact on the physiology and health of the host. F. prausnitzii is the only identified species in the genus Faecalibacterium, but a recent study clustered strains of this species in two different phylogroups. Here, we propose the existence of distinct species in this genus through the use of comparative genomics. Briefly, we performed analyses of 16S rRNA gene phylogeny, phylogenomics, whole genome Multi-Locus Sequence Typing (wgMLST), Average Nucleotide Identity (ANI), gene synteny, and pangenome to better elucidate the phylogenetic relationships among strains of Faecalibacterium. For this, we used 12 newly sequenced, assembled, and curated genomes of F. prausnitzii, which were isolated from feces of healthy volunteers from France and Australia, and combined these with published data from 5 strains downloaded from public databases. The phylogenetic analysis of the 16S rRNA sequences, together with the wgMLST profiles and a phylogenomic tree based on comparisons of genome similarity, all supported the clustering of Faecalibacterium strains in different genospecies. Additionally, the global analysis of gene synteny among all strains showed a highly fragmented profile, whereas the intra-cluster analyses revealed larger and more conserved collinear blocks. Finally, ANI analysis substantiated the presence of three distinct clusters-A, B, and C-composed of five, four, and four strains, respectively. The pangenome analysis of each cluster corroborated the classification of these clusters into three distinct species, each containing less variability than that found within the global pangenome of all strains. 
Here, we propose that comparison of pangenome subsets and their associated α values may be used as an alternative approach, together with ANI, in the in silico classification of new species. Altogether, our results provide evidence not only for the reconsideration of the phylogenetic and genomic relatedness among strains currently assigned to F. prausnitzii, but also the need for lineage (strain-based) differentiation of this taxon to better define how specific members might be associated with positive or negative host interactions.
ABSTRACT
BACKGROUND: Citrus breeding programs have many limitations associated with the species biology and physiology, requiring the incorporation of new biotechnological tools to provide new breeding possibilities. Diversity Arrays Technology (DArT) markers, combined with next-generation sequencing, have wide applicability in the construction of high-resolution genetic maps and in quantitative trait locus (QTL) mapping. This study aimed to construct an integrated genetic map using full-sib progeny derived from Murcott tangor and Pera sweet orange and DArTseq™ molecular markers and to perform QTL mapping of twelve fruit quality traits. A controlled Murcott x Pera crossing was conducted at the Citrus Germplasm Repository at the Sylvio Moreira Citrus Centre of the Agronomic Institute (IAC) located in Cordeirópolis, SP, in 1997. In 2012, 278 F1 individuals out of a family of 312 confirmed hybrid individuals were analyzed for fruit traits and genotyped using the DArTseq markers. Using OneMap software to obtain the integrated genetic map, we considered only the DArT loci that showed no segregation deviation. The likelihood ratio and the genomic information from the available Citrus sinensis L. Osbeck genome were used to determine the linkage groups (LGs). RESULTS: The resulting integrated map contained 661 markers in 13 LGs, with a genomic coverage of 2,774 cM and a mean density of 0.23 markers/cM. The groups were assigned to the nine Citrus haploid chromosomes; however, some of the chromosomes were represented by two LGs due to the lack of information for a single integration, as in cases where markers segregated in a 3:1 fashion.
A total of 19 QTLs were identified through composite interval mapping (CIM) of the 12 analyzed fruit characteristics: fruit diameter (cm), height (cm), height/diameter ratio, weight (g), rind thickness (cm), segments per fruit, total soluble solids (TSS, %), total titratable acidity (TTA, %), juice content (%), number of seeds, TSS/TTA ratio and number of fruits per box. The genomic sequence (pseudochromosomes) of C. sinensis was compared to the genetic map, and synteny was clearly identified. Further analysis of the map regions with the highest LOD scores enabled the identification of putative genes that could be associated with the fruit quality characteristics. CONCLUSION: An integrated linkage map of Murcott tangor and Pera sweet orange was established using DArTseq™ molecular markers, and it proved useful for QTL mapping of twelve fruit quality traits. The next-generation sequencing data allowed the comparison between the linkage map and the genomic sequence (pseudochromosomes) of C. sinensis and the identification of genes that may be responsible for phenotypic traits in Citrus. The obtained linkage map was used to assign sequences that had not been previously assigned to a position in the reference genome.
Subject(s)
Chromosome Mapping/methods , Citrus/genetics , Genetic Markers , Quantitative Trait Loci , Chromosomes, Plant/genetics , Citrus/classification , Fruit/genetics , High-Throughput Nucleotide Sequencing/methods , Lod Score , Phenotype , Plant Breeding , Sequence Analysis, DNA/methods , Software , Synteny
ABSTRACT
BACKGROUND: The absence of Argonaute genes in the fungal pathogen Cryptococcus gattii R265 and other VGII strains indicates that yeasts of this genotype cannot have a functional RNAi pathway, an evolutionarily conserved gene silencing mechanism performed by small RNAs. The success of the R265 strain as a pathogen that caused the Pacific Northwest and Vancouver Island outbreaks may imply that RNAi machinery loss could be beneficial under certain circumstances during evolution. As a result, a hypermutant phenotype would be created with high rates of genome retrotransposition, for instance. This study therefore aimed to evaluate in silico the effect of retrotransposons and their control mechanisms by small RNAs on genomic stability and synteny loss of C. gattii R265 through retrotransposon sequence comparison and orthology analysis with 16 other available C. gattii genomic sequences. RESULTS: Retrotransposon mining identified a higher sequence count in the VGI genotype compared to VGII, VGIII, and VGIV. However, despite the lower retrotransposon number, VGII exhibited increased synteny loss and genome rearrangement events. RNA-Seq analysis indicated highly expressed retrotransposons as well as sRNA production. CONCLUSIONS: Genome rearrangement and synteny loss may suggest a greater retrotransposon mobilization caused by RNAi pathway absence, but the effective presence of sRNAs that match retrotransposon sequences means that an alternative retrotransposon silencing mechanism could be active in genomic integrity maintenance of C. gattii VGII strains.
Subject(s)
Cryptococcus gattii/genetics , RNA, Small Interfering/genetics , Retroelements , Sequence Analysis, RNA/methods , Biological Evolution , Computer Simulation , Genotype , Phylogeny , RNA, Fungal/genetics , Sequence Deletion , Synteny
ABSTRACT
BACKGROUND: Transfer RNAs (tRNAs) are ubiquitous in all living organisms. They implement the genetic code so that most genomes contain distinct tRNAs for almost all 61 codons. They behave similarly to mobile elements and proliferate in genomes, spawning both local and non-local copies. Most tRNA families are therefore typically present as multicopy genes. The members of the individual tRNA families evolve under concerted or rapid birth-death evolution, so that paralogous copies maintain almost identical sequences over long evolutionary time-scales. To a good approximation these are functionally equivalent. Individual tRNA copies thus are evolutionarily unstable and easily turn into pseudogenes and disappear. This leads to a rapid turnover of tRNAs and often large differences in the tRNA complements of closely related species. Since tRNA paralogs are not distinguished by sequence, common methods cannot be used to establish orthology between tRNA genes. RESULTS: In this contribution we introduce a general framework to distinguish orthologs and paralogs in gene families that are subject to concerted evolution. It is based on the use of uniquely aligned adjacent sequence elements as anchors to establish syntenic conservation of sequence intervals. In practice, anchors and intervals can be extracted from genome-wide multiple sequence alignments. Syntenic clusters of concertedly evolving genes of different families can then be subdivided by list alignments, leading to usually small clusters of candidate co-orthologs. On the basis of recent advances in phylogenetic combinatorics, these candidate clusters can be further processed by cograph editing to recover their duplication histories. We developed a workflow that can be conceptualized as stepwise refinement of a graph of homologous genes. We apply this analysis strategy with different types of synteny anchors to investigate the evolution of tRNAs in primates and fruit flies. 
We identified a large number of tRNA remolding events concentrated at the tips of the phylogeny. With one notable exception, phylogenetically old tRNA remoldings do not change the isoacceptor class. CONCLUSIONS: Gene families evolving under concerted evolution are not amenable to classical phylogenetic analyses since paralogs maintain identical, species-specific sequences, precluding the estimation of correct gene trees from sequence differences. This leaves conservation of syntenic arrangements with respect to "anchor elements" that are not subject to concerted evolution as the only viable source of phylogenetic information. We have demonstrated here that a purely synteny-based analysis of tRNA gene histories is indeed feasible. Although the choice of synteny anchors influences the resolution, in particular when tight gene clusters are present, and the quality of sequence alignments, genome assemblies, and genome rearrangements limits the scope of the analysis, largely coherent results can be obtained for tRNAs. In particular, we conclude that a large fraction of the tRNAs are recent copies. This proliferation is compensated by rapid pseudogenization as exemplified by many very recent alloacceptor remoldings.
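The anchor idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function names, the data layout, and the toy coordinates are all assumptions made for illustration, and it covers only the first step (grouping genes by their flanking anchors), not the list alignment or cograph editing steps. It also assumes each anchor occurs exactly once per genome and that the same anchors are adjacent in every species.

```python
from bisect import bisect_right
from collections import defaultdict


def assign_intervals(anchors, genes):
    """Map each gene to the pair of anchors that flank it.

    anchors: list of (position, anchor_id), one entry per unique anchor.
    genes:   list of (position, gene_id).
    Returns {(left_anchor_id, right_anchor_id): [gene_id, ...]}; a side
    without a flanking anchor is recorded as None.
    """
    anchors = sorted(anchors)
    positions = [pos for pos, _ in anchors]
    intervals = defaultdict(list)
    for pos, gene_id in sorted(genes):
        i = bisect_right(positions, pos)
        left = anchors[i - 1][1] if i > 0 else None
        right = anchors[i][1] if i < len(anchors) else None
        intervals[(left, right)].append(gene_id)
    return dict(intervals)


def candidate_clusters(species_data):
    """Group genes from different species that share the same flanking anchors.

    species_data: {species: (anchors, genes)} (hypothetical layout).
    Returns {(anchor_a, anchor_b): {species: [gene_id, ...]}}, keeping only
    intervals that contain genes in at least two species.
    """
    clusters = defaultdict(dict)
    for species, (anchors, genes) in species_data.items():
        for key, members in assign_intervals(anchors, genes).items():
            if None in key:
                continue  # gene is not enclosed by anchors on both sides
            # Sort the anchor pair so that inverted intervals still match.
            unordered = tuple(sorted(key))
            clusters[unordered].setdefault(species, []).extend(members)
    return {k: v for k, v in clusters.items() if len(v) > 1}


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    toy = {
        "human": (
            [(100, "a1"), (500, "a2"), (900, "a3")],
            [(200, "h_tRNA1"), (300, "h_tRNA2"), (700, "h_tRNA3")],
        ),
        "chimp": (
            [(120, "a1"), (480, "a2"), (950, "a3")],
            [(250, "c_tRNA1"), (800, "c_tRNA2"), (1000, "c_tRNA3")],
        ),
    }
    for interval, members in candidate_clusters(toy).items():
        print(interval, members)
```

Each resulting cluster is only a set of candidate co-orthologs. In the toy data the human interval between a1 and a2 holds two tRNAs and the chimp interval one, which is the kind of ambiguity the later refinement steps in the abstract are meant to resolve.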
Subject(s)
Drosophila/genetics , Genome , Phylogeny , Primates/genetics , RNA, Transfer/genetics , Synteny , Animals , Base Sequence , Codon , Evolution, Molecular , Genetic Code , Multigene Family , Pseudogenes , Sequence Alignment , Sequence Homology, Nucleic Acid
ABSTRACT
Sequencing plant genomes is often challenging because of their complex architecture and high content of repetitive sequences. Sugarcane has one of the most complex genomes. It is highly polyploid, preserves intact homeologous chromosomes from its parental species and contains >55% repetitive sequences. Although bacterial artificial chromosome (BAC) libraries have emerged as an alternative for accessing the sugarcane genome, sequencing individual clones is laborious and expensive. Here, we present a strategy for sequencing and assembling reads produced from the DNA of pooled BAC clones. A set of 178 BAC clones, randomly sampled from the SP80-3280 sugarcane BAC library, was pooled and sequenced using the Illumina HiSeq2000 and PacBio platforms. A hybrid assembly strategy was used to generate 2,451 scaffolds comprising 19.2 Mb of assembled genome sequence. Scaffolds of ≥20 kb corresponded to 80% of the assembled sequences, and the full sequences of forty BACs were recovered in one or two contigs. Alignment of the BAC scaffolds with the chromosome sequences of sorghum showed a high degree of collinearity and gene order. The alignment of the BAC scaffolds to the 10 sorghum chromosomes suggests that the genome of the SP80-3280 sugarcane variety is ~19% contracted in relation to the sorghum genome. In conclusion, our data show that sequencing pools composed of high numbers of BAC clones may help to construct a reference scaffold map of the sugarcane genome.
ABSTRACT
Cytokine production for immunological processes is tightly regulated at the transcriptional and posttranscriptional levels. The NF-κB signaling pathway maintains immune homeostasis in the cell through the participation of molecules such as A20 (TNFAIP3), which is a key regulatory factor in the immune response, hematopoietic differentiation, and immunomodulation. Although A20 has been identified in mammals, and despite recent efforts to identify A20 members in other higher vertebrates, relatively little is known about the composition of this regulator in other classes of vertebrates, particularly for bovines. In this study, the genetic context of bovine A20 was explored and compared against homologous genes in the human, mouse, chicken, dog, and zebrafish chromosomes. Through in silico analysis, several regions of interest were found to be conserved even between phylogenetically distant species. Additionally, the deduced protein sequence of bovine A20 showed many domains conserved in humans and mice. Furthermore, all potential amino acid residues implicated in the active site of A20 were conserved. Finally, bovine A20 mRNA expression as mediated by the bovine viral diarrhea virus and poly (I:C) was evaluated. These analyses showed a strong fold increase in A20 expression following virus exposure, a phenomenon blocked by a pharmacological NF-κB inhibitor (BAY 11-7085). Interestingly, A20 mRNA had a half-life of only 32 min, likely due to adenylate- and uridylate-rich elements in the 3'-untranslated region. Collectively, these data identify bovine A20 as a regulator of immune marker expression. Finally, this is the first report to find the bovine viral diarrhea virus modulating bovine A20 activation through the NF-κB pathway.