1.
Entropy (Basel) ; 26(3), 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38539760

ABSTRACT

We commonly encounter the problem of identifying an optimally weight-adjusted version of the empirical distribution of observed data, adhering to predefined constraints on the weights. Such constraints often manifest as restrictions on the moments, tail behavior, shapes, number of modes, etc., of the resulting weight-adjusted empirical distribution. In this article, we substantially enhance the flexibility of such a methodology by introducing a nonparametrically imbued distributional constraint on the weights and developing a general framework leveraging the maximum entropy principle and tools from optimal transport. The key idea is to ensure that the maximum entropy weight-adjusted empirical distribution of the observed data is close to a pre-specified probability distribution in terms of the optimal transport metric, while allowing for subtle departures. The proposed scheme for the re-weighting of observations subject to constraints is reminiscent of the empirical likelihood and related ideas, but offers greater flexibility in applications where parametric distribution-guided constraints arise naturally. The versatility of the proposed framework is demonstrated in the context of three disparate applications where data re-weighting is warranted to satisfy side constraints on the optimization problem at the heart of the statistical task, namely, portfolio allocation, semi-parametric inference for complex surveys, and ensuring algorithmic fairness in machine learning algorithms.
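As an illustration of the simpler, moment-constrained re-weighting that the abstract mentions as the common starting point (not the optimal-transport-constrained method the article proposes), the sketch below finds maximum-entropy weights whose weighted mean matches a target. It relies on the standard result that such weights take the exponentially tilted form w_i ∝ exp(λ·x_i); the function name `max_entropy_weights`, its parameters, and the bisection solver are assumptions made for this example.

```python
import math


def max_entropy_weights(xs, target_mean, tol=1e-10, max_iter=200):
    """Maximum-entropy weights on xs whose weighted mean equals target_mean.

    Hypothetical helper for illustration only. The solution has the
    exponentially tilted form w_i proportional to exp(lam * x_i); lam is
    found by bisection because the weighted mean is increasing in lam.
    """
    if not (min(xs) < target_mean < max(xs)):
        raise ValueError("target_mean must lie strictly inside the data range")

    def tilted(lam):
        # Subtract the largest exponent for numerical stability.
        m = max(lam * x for x in xs)
        raw = [math.exp(lam * x - m) for x in xs]
        total = sum(raw)
        return [r / total for r in raw]

    def mean_gap(lam):
        w = tilted(lam)
        return sum(wi * xi for wi, xi in zip(w, xs)) - target_mean

    # Expand the bracket until it contains the root.
    lo, hi = -1.0, 1.0
    while mean_gap(lo) > 0:
        lo *= 2
    while mean_gap(hi) < 0:
        hi *= 2

    # Bisection on lam.
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if mean_gap(mid) > 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return tilted((lo + hi) / 2)
```

When the target equals the unweighted sample mean, the weights stay uniform; moving the target upward shifts weight toward larger observations.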

2.
Sci Rep ; 14(1): 2799, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38307917

ABSTRACT

Tinospora cordifolia (Willd.) Hook.f. & Thomson, also known as Giloy, is among the most important medicinal plants, with numerous therapeutic applications in human health owing to its production of a diverse array of secondary metabolites. To gain genomic insights into the medicinal properties of T. cordifolia, genome sequencing was carried out using 10× Genomics linked-read and Nanopore long-read technologies. The draft genome assembly of T. cordifolia comprised 1.01 Gbp and is the first genome sequenced from the plant family Menispermaceae. We also estimated the genome size of T. cordifolia, which was found to be 1.13 Gbp. Deep sequencing of the transcriptome from leaf tissue was also performed. The genome and transcriptome assemblies were used to construct the gene set, resulting in 17,245 coding gene sequences. Further, T. cordifolia was placed as a basal eudicot by constructing a genome-wide phylogenetic tree using multiple species. A comprehensive comparative evolutionary analysis of gene family contraction/expansion and multiple signatures of adaptive evolution was also performed. The genes involved in benzylisoquinoline alkaloid, terpenoid, lignin and flavonoid biosynthesis pathways were found to carry signatures of adaptive evolution. These evolutionary adaptations provide genomic insights into the diverse medicinal properties of this plant. The genes involved in the common symbiosis signalling pathway associated with endosymbiosis (Arbuscular Mycorrhiza) were found to be adaptively evolved. The genes involved in adventitious root formation, peroxisome biogenesis, biosynthesis of phytohormones, and tolerance against abiotic and biotic stresses were also found to be adaptively evolved in T. cordifolia.


Subject(s)
Alkaloids , Plants, Medicinal , Tinospora , Humans , Plants, Medicinal/genetics , Tinospora/genetics , Tinospora/metabolism , Phylogeny , Plant Extracts/metabolism , Alkaloids/metabolism
3.
Entropy (Basel) ; 26(1), 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38248188

ABSTRACT

The rise of machine learning-driven decision-making has sparked a growing emphasis on algorithmic fairness. Within the realm of clustering, the notion of balance is utilized as a criterion for attaining fairness, which characterizes a clustering mechanism as fair when the resulting clusters maintain a consistent proportion of observations representing individuals from distinct groups delineated by protected attributes. Building on this idea, the literature has rapidly incorporated a myriad of extensions, devising fair versions of the existing frequentist clustering algorithms, e.g., k-means, k-medoids, etc., that aim at minimizing specific loss functions. These approaches lack uncertainty quantification associated with the optimal clustering configuration and only provide clustering boundaries without quantifying the probabilities associated with each observation belonging to the different clusters. In this article, we intend to offer a novel probabilistic formulation of the fair clustering problem that facilitates valid uncertainty quantification even under mild model misspecifications, without incurring substantial computational overhead. Mixture model-based fair clustering frameworks facilitate automatic uncertainty quantification, but tend to be brittle under model misspecification and involve significant computational challenges. To circumvent such issues, we propose a generalized Bayesian fair clustering framework that inherently enjoys decision-theoretic interpretation. Moreover, we devise efficient computational algorithms that crucially leverage techniques from the existing literature on optimal transport and clustering based on loss functions. The gain from the proposed technology is showcased via numerical experiments and real data examples.
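The balance criterion described above can be made concrete with a small sketch. It uses one common formalization, assumed here rather than taken from the article: the balance of a cluster is the ratio of its smallest to its largest protected-group count, and the balance of a clustering is the minimum over its clusters. The function name `cluster_balance` and its arguments are hypothetical.

```python
from collections import Counter


def cluster_balance(labels, groups):
    """Balance of a clustering under a min/max group-count ratio.

    labels: cluster assignment for each observation.
    groups: protected-group membership for each observation.
    Returns a value in [0, 1]; 1 means every cluster holds the groups in
    equal numbers, 0 means some cluster misses a group entirely.
    This is an illustrative formalization, not the article's definition.
    """
    all_groups = set(groups)
    counts = {}
    for cluster, group in zip(labels, groups):
        counts.setdefault(cluster, Counter())[group] += 1

    balance = 1.0
    for group_counts in counts.values():
        sizes = [group_counts.get(g, 0) for g in all_groups]
        balance = min(balance, min(sizes) / max(sizes))
    return balance
```

A point-estimate score of this kind says nothing about the uncertainty of the cluster assignments, which is the gap the abstract says the probabilistic formulation is meant to address.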

4.
Front Plant Sci ; 14: 1260414, 2023.
Article in English | MEDLINE | ID: mdl-38046611

ABSTRACT

Syzygium cumini, also known as jambolan or jamun, is an evergreen tree widely known for its medicinal properties, fruits, and ornamental value. To understand the genomic and evolutionary basis of its medicinal properties, we sequenced, for the first time, the genome of S. cumini, a member of the world's largest tree genus, Syzygium, using Oxford Nanopore and 10x Genomics sequencing technologies. We also sequenced and assembled the transcriptome of S. cumini in this study. The tetraploid and highly heterozygous draft genome of S. cumini had a total size of 709.9 Mbp with 61,195 coding genes. The phylogenetic position of S. cumini was established using a comprehensive genome-wide analysis including species from 18 Eudicot plant orders. The existence of neopolyploidy in S. cumini was evident from the higher number of coding genes and expanded gene families resulting from gene duplication events compared to the other two sequenced species from this genus. Comparative evolutionary analyses showed the adaptive evolution of genes involved in the phenylpropanoid-flavonoid (PF) biosynthesis pathway and in the biosynthesis of other secondary metabolites, such as terpenoids and alkaloids, in S. cumini, along with genes involved in stress tolerance mechanisms, which was also supported by leaf transcriptome data generated in this study. The adaptive evolution of secondary metabolism pathways is associated with the wide range of pharmacological properties, specifically the anti-diabetic property, of this species conferred by the bioactive compounds that act as nutraceutical agents in modern medicine.

5.
Front Plant Sci ; 14: 1210078, 2023.
Article in English | MEDLINE | ID: mdl-37727852

ABSTRACT

Phyllanthus emblica or Indian gooseberry, commonly known as amla, is an important medicinal horticultural plant used in traditional and modern medicines. It bears stone fruits with immense antioxidant properties due to being one of the richest natural sources of vitamin C and numerous flavonoids. This study presents the first genome sequencing of this species performed using 10x Genomics and Oxford Nanopore Technology. The draft genome assembly was 519 Mbp in size and consisted of 4,384 contigs, N50 of 597 Kbp, 98.4% BUSCO score, and 37,858 coding sequences. This study also reports the genome-wide phylogeny of this species with 26 other plant species that resolved the phylogenetic position of P. emblica. The presence of three ascorbate biosynthesis pathways including L-galactose, galacturonate, and myo-inositol pathways was confirmed in this genome. A comprehensive comparative evolutionary genomic analysis including gene family expansion/contraction and identification of multiple signatures of adaptive evolution provided evolutionary insights into ascorbate and flavonoid biosynthesis pathways and stone fruit formation through lignin biosynthesis. The availability of this genome will be beneficial for its horticultural, medicinal, dietary, and cosmetic applications and will also help in comparative genomics analysis studies.
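For readers unfamiliar with the N50 statistic reported above, the following sketch computes it under the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of
    the contig at which the running total first reaches half of the
    total assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must contain at least one contig")
```

A higher N50 at a similar assembly size generally indicates a more contiguous assembly, which is why it is reported alongside the contig count and the BUSCO score.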

6.
Heliyon ; 9(8): e18571, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37576271

ABSTRACT

An intriguing example of differential adaptability is the case of two Asian peafowl species, Pavo cristatus (blue peafowl) and Pavo muticus (green peafowl), where the former has a "Least Concern" conservation status and the latter is an "Endangered" species. To understand the genetic basis of this differential adaptability, a comparative analysis of the two species is needed to gain genomic and evolutionary insights. Thus, we constructed a high-quality genome assembly of blue peafowl with an N50 value of 84.81 Mb (pseudochromosome-level assembly), and a high-confidence coding gene set to perform the genomic and evolutionary analyses of blue and green peafowls with 49 other avian species. The analyses revealed adaptive evolution of genes related to neuronal development, immunity, and skeletal muscle development in these peafowl species. Major genes related to axon guidance such as NEO1 and UNC5, semaphorin (SEMA), and ephrin receptor showed adaptive evolution in peafowl species. However, blue peafowl showed the presence of 42% more coding genes compared to the green peafowl along with a higher number of species-specific gene clusters, segmental duplicated genes and expanded gene families, and comparatively higher evolution in neuronal and developmental pathways. Blue peafowl also showed a longer branch length than green peafowl in the species phylogenetic tree. These genomic insights obtained from the high-quality genome assembly of P. cristatus constructed in this study provide new clues on the superior adaptability of the blue peafowl over the green peafowl despite the recent divergence of the two species.

7.
Genes Genomics ; 45(11): 1399-1408, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37231295

ABSTRACT

BACKGROUND: Indian cattle breeds (Bos indicus) are known for their remarkable adaptability to hot and humid climates, higher nutritional quality of milk, better disease tolerance, and greater ability to perform on poor feed compared to taurine cattle (Bos taurus). Distinct phenotypic differences are observed among the B. indicus breeds; however, whole genome sequences were unavailable for these indigenous breeds. OBJECTIVE: We aimed to perform whole genome sequencing to construct the draft genome assemblies of four B. indicus breeds: Ongole, Kasargod Dwarf, Kasargod Kapila, and Vechur (the smallest cattle in the world). METHODS: We sequenced the whole genomes using Illumina short-read technology, and constructed de novo and reference-based genome assemblies of these native B. indicus breeds for the first time. RESULTS: The draft de novo genome assemblies of the B. indicus breeds ranged from 1.98 to 3.42 Gbp. We also constructed the mitochondrial genome assemblies (~16.3 Kbp) and the previously unavailable 18S rRNA marker gene sequences of these B. indicus breeds. The genome assemblies helped to identify the bovine genes related to distinct phenotypic characteristics and other biological processes for this species compared to B. taurus, which are plausibly responsible for providing better adaptive traits. We also identified the genes that showed sequence variation in dwarf and non-dwarf breeds of B. indicus compared to B. taurus. CONCLUSIONS: The genome assemblies of these Indian cattle breeds, the 18S rRNA marker genes, and identification of the distinct genes in B. indicus breeds compared to B. taurus will help in future studies on these cattle species.

8.
iScience ; 25(10): 105100, 2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36164650

ABSTRACT

Ficus benghalensis and Ficus religiosa are large woody trees well known for their long lifespan, ecological and traditional significance, and medicinal properties. To understand the genomic and evolutionary aspects of these characteristics, the whole genomes of these Ficus species were sequenced using 10x Genomics linked reads and Oxford Nanopore long reads. The draft genomes of F. benghalensis and F. religiosa comprised 392.89 Mbp and 332.97 Mbp, respectively. We established the genome-wide phylogenetic positions of the two Ficus species with respect to 50 other Angiosperm species. Comparative evolutionary analyses with other phylogenetically closer Eudicot species revealed adaptive evolution in genes involved in key cellular mechanisms associated with prolonged survival including phytohormone signaling, senescence, disease resistance, and abiotic stress tolerance, which provide genomic insights into the mechanisms conferring longevity and suggest that longevity is a multifaceted phenomenon. This study also provides clues on the existence of the CAM pathway in these Ficus species.

9.
Commun Biol ; 4(1): 1193, 2021 Oct 15.
Article in English | MEDLINE | ID: mdl-34654884

ABSTRACT

Curcuma longa, or turmeric, is traditionally known for its immense medicinal properties and has diverse therapeutic applications. However, the absence of a reference genome sequence is a limiting factor in understanding the genomic basis of the origin of its medicinal properties. In this study, we present the draft genome sequence of C. longa, belonging to the Zingiberaceae plant family, constructed using 10x Genomics linked reads and Oxford Nanopore long reads. For comprehensive gene set prediction and for insights into its gene expression, transcriptome sequencing of leaf tissue was also performed. The draft genome assembly had a size of 1.02 Gbp with ~70% repetitive sequences, and contained 50,401 coding gene sequences. The phylogenetic position of C. longa was resolved through a comprehensive genome-wide analysis including 16 other plant species. Using 5,388 orthogroups, the comparative evolutionary analysis performed across 17 species including C. longa revealed evolution in genes associated with secondary metabolism, phytohormone signaling, and various biotic and abiotic stress tolerance responses. These mechanisms are crucial for perennial and rhizomatous plants such as C. longa for defense and environmental stress tolerance via production of secondary metabolites, which are associated with the wide range of medicinal properties in C. longa.


Subject(s)
Chromosome Mapping , Curcuma/genetics , Plants, Medicinal/genetics , Base Sequence , Curcuma/chemistry , Plant Extracts/chemistry , Repetitive Sequences, Nucleic Acid
10.
iScience ; 24(2): 102079, 2021 Feb 19.
Article in English | MEDLINE | ID: mdl-33644713

ABSTRACT

Aloe vera is a species from the Asphodelaceae family with characteristics such as drought resistance and numerous medicinal properties. However, the genetic basis of these phenotypes is still unknown, primarily because its genome sequence has been unavailable. Thus, we report the first Aloe vera genome sequence, comprising 12.93 Gbp and harboring 86,177 protein-coding genes. It is the first genome from the Asphodelaceae family and the largest angiosperm genome sequenced and assembled to date. We also report the first genome-wide phylogeny of monocots including Aloe vera to resolve its phylogenetic position. The comprehensive comparative analysis of Aloe vera with other available high-quality monocot genomes revealed adaptive evolution in several genes of drought stress response, CAM pathway, and circadian rhythm and positive selection in DNA damage response genes in Aloe vera. This study provides clues on the genetic basis of evolution of drought stress tolerance capabilities of Aloe vera.
