1.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36857617

ABSTRACT

Advances in spatial transcriptomics extend the use of single-cell technologies to reveal the expression landscape of tissues with valuable spatial context. Here, we propose an unsupervised, manifold learning-based algorithm, Spatial Transcriptome based cEll typE cLustering (STEEL), which identifies domains from spatial transcriptomes by clustering beads that exhibit both highly similar gene expression profiles and close spatial distance, using a graph-based approach. Comprehensive evaluation of STEEL on spatial transcriptomic datasets from the 10X Visium platform demonstrates that it not only achieves a resolution high enough to characterize fine structures of the mouse brain but also enables the integration of multiple individually analyzed tissue slides into a larger one. STEEL outperforms previous methods in distinguishing different cell types/domains of various tissues on Slide-seq datasets, which feature higher bead density but lower transcript detection efficiency. Application of STEEL to spatial transcriptomes of early-stage mouse embryos (E9.5-E12.5) successfully delineates a progressive developmental landscape of tissues from the ectoderm, mesoderm and endoderm layers, and further profiles dynamic changes in cell differentiation in the heart and other organs. With the advancement of spatial transcriptome technologies, our method will be widely applicable to domain identification and gene expression atlas reconstruction.
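
The idea of grouping beads that are both spatially close and similar in expression can be illustrated with a small sketch. This is not the STEEL algorithm, whose details are not given in the abstract; it is a simplified stand-in that links two beads when they pass a distance cutoff and a cosine-similarity cutoff, then reports connected components as domains. The function names and the `max_dist` and `min_sim` thresholds are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def spatial_expression_domains(coords, expr, max_dist=1.5, min_sim=0.9):
    """Label beads by connected components of a combined graph.

    Illustrative only: an edge joins two beads when they are within
    max_dist of each other AND their expression profiles have cosine
    similarity of at least min_sim (both thresholds are assumptions).
    """
    n = len(coords)
    parent = list(range(n))  # union-find forest over bead indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        close = math.dist(coords[i], coords[j]) <= max_dist
        if close and cosine(expr[i], expr[j]) >= min_sim:
            parent[find(i)] = find(j)

    # Map each component root to a compact label in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

With four beads in a row where the first two and the last two share expression profiles, the sketch yields two domains; a bead with a matching profile that lies far away is placed in its own domain, which is the point of combining the two criteria.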


Subject(s)
Steel , Transcriptome , Animals , Mice , Gene Expression Profiling/methods , Cell Differentiation , Algorithms
2.
Plant Commun ; 4(1): 100422, 2023 01 09.
Article in English | MEDLINE | ID: mdl-35957520

ABSTRACT

Fabaceae is a large family of angiosperms with high biodiversity that contains a variety of economically important crops and model plants for the study of biological nitrogen fixation. Polyploidization events have been extensively studied in some Fabaceae plants, but the occurrence of new genes is still concealed, owing to a lack of genomic information on certain species of the basal clade of Fabaceae. Cercis chinensis (Cercidoideae) is one such species; it diverged earliest from Fabaceae and is essential for phylogenomic studies and new gene predictions in Fabaceae. To facilitate genomic studies on Fabaceae, we performed genome sequencing of C. chinensis and obtained a 352.84 Mb genome, which was further assembled into seven pseudochromosomes with 30 612 predicted protein-coding genes. Compared with other legume genomes, that of C. chinensis exhibits no lineage-specific polyploidization event. Further phylogenomic analyses of 22 legumes and 11 other angiosperms revealed that many gene families are lineage specific before and after the diversification of Fabaceae. Among them, dozens of genes are candidates for new genes that have evolved from intergenic regions and are thus regarded as de novo-originated genes. They differ significantly from established genes in coding sequence length, exon number, guanine-cytosine content, and expression patterns among tissues. Functional analysis revealed that many new genes are related to asparagine metabolism. This study represents an important advance in understanding the evolutionary pattern of new genes in legumes and provides a valuable resource for plant phylogenomic studies.


Subject(s)
Fabaceae , Fabaceae/genetics , Phylogeny , Chromosome Mapping , Base Sequence
3.
Bioinformatics ; 38(23): 5317-5321, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36218394

ABSTRACT

MOTIVATION: Whole-genome duplication events have been discovered throughout the evolution of eukaryotes, contributing to genome complexity and biodiversity and leaving traces in descendant organisms. Therefore, an accurate and rapid phylogenomic method is needed to identify the retained duplicated genes in various lineages across the target taxonomy. RESULTS: Here, we present Tree2GD, an integrated method that identifies large-scale gene duplication events by automatically performing multiple procedures, including sequence alignment, homolog recognition, gene tree/species tree reconciliation, Ks distribution of gene duplicates and synteny analyses. Application of Tree2GD to two datasets, 12 metazoan genomes and 68 angiosperms, successfully identifies all reported whole-genome duplication events exhibited by these species, demonstrating the effectiveness and efficiency of Tree2GD in phylogenomic analyses of large-scale gene duplications. AVAILABILITY AND IMPLEMENTATION: Tree2GD is written in Python and C++ and is available at https://github.com/Dee-chen/Tree2gd. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Eukaryota , Gene Duplication , Animals , Phylogeny , Synteny , Sequence Alignment
4.
Nucleic Acids Res ; 50(17): 9724-9737, 2022 09 23.
Article in English | MEDLINE | ID: mdl-36095130

ABSTRACT

The development of floral organs involves complex molecular mechanisms, including the co-regulation of many genes that function specifically and precisely in various tissues and developmental stages. Advances in spatial transcriptome technologies allow quantitative measurement of spatially localized gene abundance, making it possible to connect the complex process of flower organogenesis with genome-wide molecular phenotypes. Here, we apply the 10× Visium technology to study the formation of floral organs through development in an orchid plant, Phalaenopsis Big Chili. Cell types of early floral development, including inflorescence meristems, primordia of floral organs and identity-determined tissues, are recognized at high resolution based on the spatial expression distribution of thousands of genes. In addition, meristematic cells at the basal position of floral organs are found to function continuously through multiple developmental stages after organ initiation. In particular, the development of the anther, whose primordium starts from a single spot and gives rise to multiple differentiated cell types in later stages, including the pollinium and other vegetative tissues, is revealed by well-known MADS-box genes and many other downstream regulators. The spatial transcriptome analyses provide comprehensive information on gene activity for understanding the molecular architecture of flower organogenesis and for future genomic and genetic studies of specific cell types.


Subject(s)
MADS Domain Proteins , Orchidaceae , Flowers , Gene Expression Regulation, Plant , MADS Domain Proteins/genetics , Meristem/genetics , Meristem/metabolism , Orchidaceae/genetics , Plant Proteins/genetics
5.
Genomics Proteomics Bioinformatics ; 20(3): 524-535, 2022 06.
Article in English | MEDLINE | ID: mdl-33711466

ABSTRACT

Accurately identifying DNA polymorphisms can bridge the gap between phenotypes and genotypes and is essential for molecular marker assisted genetic studies. Genome complexities, including large-scale structural variations, bring great challenges to bioinformatic analysis for obtaining high-confidence genomic variants, as sequence differences between non-allelic loci of two or more genomes can be misinterpreted as polymorphisms. It is important to correctly filter out artificial variants to avoid false genotyping or estimation of allele frequencies. Here, we present an efficient and effective framework, inGAP-family, to discover, filter, and visualize DNA polymorphisms and structural variants (SVs) from alignment of short reads. Applying this method to polymorphism detection on real datasets shows that elimination of artificial variants greatly facilitates the precise identification of meiotic recombination points as well as causal mutations in mutant genomes or quantitative trait loci. In addition, inGAP-family provides a user-friendly graphical interface for detecting polymorphisms and SVs, further evaluating predicted variants and identifying mutations related to genotypes. It is accessible at https://sourceforge.net/projects/ingap-family/.


Subject(s)
Polymorphism, Genetic , Quantitative Trait Loci , Gene Frequency , Mutation , Genotype , Sequence Analysis, DNA , Polymorphism, Single Nucleotide
6.
Nucleic Acids Res ; 47(5): e30, 2019 03 18.
Article in English | MEDLINE | ID: mdl-30657979

ABSTRACT

Metagenomic studies, greatly promoted by the rapid development of next-generation sequencing (NGS) technologies, uncover the complex structures of microbial communities and their interactions with the environment. As the majority of microbes lack genome sequence information, it is essential to assemble prokaryotic genomes ab initio in order to retrieve complete coding genes from various metabolic pathways. The complex nature of microbial composition and the burden of handling vast amounts of metagenomic data pose great challenges to the development of effective and efficient bioinformatic tools. Here we present a protein assembler (MetaPA) that is based on de Bruijn graph searching on oligopeptide spaces and can be applied to both metagenomic and metatranscriptomic sequencing data. When public homologous protein sequences are used to guide the assembly procedure, MetaPA assembles 85% of total proteins into complete sequences with a high precision of 83% on real high-throughput sequencing datasets. Application of MetaPA to metatranscriptomic data successfully identifies the majority of actively transcribed genes validated in related studies. The results suggest that MetaPA has good potential in both metagenomic and metatranscriptomic studies for characterizing the composition and abundance of microbiota.
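
A de Bruijn graph over oligopeptides can be sketched in a few lines. The code below is a generic illustration, not MetaPA's implementation: it treats each peptide k-mer as an edge between its (k-1)-residue prefix and suffix, then extends a path while exactly one successor exists. The function names, the choice of `k=4`, and the toy fragments are all assumptions; the sketch ignores in-degree branching and sequencing errors.

```python
from collections import defaultdict


def build_peptide_graph(fragments, k=4):
    """Build a de Bruijn graph from peptide fragments.

    Nodes are (k-1)-residue oligopeptides; each k-mer found in a
    fragment adds an edge from its prefix to its suffix.
    """
    graph = defaultdict(set)
    for frag in fragments:
        for i in range(len(frag) - k + 1):
            kmer = frag[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_unambiguous(graph, start):
    """Walk forward from start while the current node has one successor."""
    path = start
    node = start
    seen = {start}  # guards against looping forever on a cycle
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:
            break
        path += nxt[-1]  # each step contributes one new residue
        seen.add(nxt)
        node = nxt
    return path
```

For example, the overlapping toy fragments `"MKTAYI"`, `"TAYIAK"` and `"YIAKQR"` share 3-residue overlaps, so walking from `"MKT"` reconstructs the single sequence `"MKTAYIAKQR"`.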


Subject(s)
Algorithms , Amino Acids/genetics , Genes/genetics , High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Transcriptome/genetics , Datasets as Topic , Feces/microbiology , Gastrointestinal Microbiome/genetics , Humans , Microbiota/genetics
7.
Mol Plant ; 11(3): 414-428, 2018 03 05.
Article in English | MEDLINE | ID: mdl-29317285

ABSTRACT

Gene duplications provide evolutionary potentials for generating novel functions, while polyploidization or whole genome duplication (WGD) doubles the chromosomes initially and results in hundreds to thousands of retained duplicates. WGDs are strongly supported by evidence commonly found in many species-rich lineages of eukaryotes, and thus are considered as a major driving force in species diversification. We performed comparative genomic and phylogenomic analyses of 59 public genomes/transcriptomes and 46 newly sequenced transcriptomes covering major lineages of angiosperms to detect large-scale gene duplication events by surveying tens of thousands of gene family trees. These analyses confirmed most of the previously reported WGDs and provided strong evidence for novel ones in many lineages. The detected WGDs supported a model of exponential gene loss during evolution with an estimated half-life of approximately 21.6 million years, and were correlated with both the emergence of lineages with high degrees of diversification and periods of global climate changes. The new datasets and analyses detected many novel WGDs widely spread during angiosperm evolution, uncovered preferential retention of gene functions in essential cellular metabolisms, and provided clues for the roles of WGD in promoting angiosperm radiation and enhancing their adaptation to environmental changes.
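
The reported half-life can be turned into a retained fraction with the standard exponential-decay relation. The sketch below assumes the simplest form of such a model, in which the fraction of duplicates remaining after t million years is 0.5^(t / 21.6); the abstract does not state the exact parameterization used, so the function names and this form are assumptions.

```python
import math

# Approximate half-life from the abstract, in million years (My).
HALF_LIFE_MY = 21.6


def retained_fraction(t_my, half_life_my=HALF_LIFE_MY):
    """Fraction of WGD-derived duplicates retained after t_my million years,
    under a simple exponential-loss model (an assumed parameterization)."""
    return 0.5 ** (t_my / half_life_my)


def decay_rate(half_life_my=HALF_LIFE_MY):
    """Equivalent per-My loss rate: lambda = ln(2) / half-life."""
    return math.log(2) / half_life_my
```

Under this assumption, half of the duplicates remain after 21.6 My and a quarter after 43.2 My, corresponding to a loss rate of about 0.032 per million years.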


Subject(s)
Gene Duplication/genetics , Genome, Plant/genetics , Magnoliopsida/genetics , Evolution, Molecular , Phylogeny , Polyploidy
8.
Proc Natl Acad Sci U S A ; 111(27): 10007-12, 2014 Jul 08.
Article in English | MEDLINE | ID: mdl-24958856

ABSTRACT

DNA polymorphisms are important markers in genetic analyses and are increasingly detected by using genome resequencing. However, the presence of repetitive sequences and structural variants can lead to false positives in the identification of polymorphic alleles. Here, we describe an analysis strategy that minimizes false positives in allelic detection and present analyses of recently published resequencing data from Arabidopsis meiotic products and individual humans. Our analysis enables the accurate detection of sequencing errors, small insertions and deletions (indels), and structural variants, including large reciprocal indels and copy number variants, from comparisons between the resequenced and reference genomes. We offer an alternative interpretation of the sequencing data of meiotic products, including the number and type of recombination events, to illustrate the potential for mistakes in single-nucleotide polymorphism calling. Using these examples, we propose that the detection of DNA polymorphisms using resequencing data needs to account for nonallelic homologous sequences.


Subject(s)
Arabidopsis/genetics , Genome, Plant , Meiosis/genetics , Polymorphism, Single Nucleotide , Recombination, Genetic , Alleles , Base Sequence , DNA, Plant/genetics , Humans , Molecular Sequence Data