Results 1 - 20 of 2,462
1.
Mol Biol Rep ; 51(1): 715, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824248

ABSTRACT

BACKGROUND: Camellia tachangensis F. C. Zhang is a tea group species with a five-compartment ovary, and it represents the original germline of early differentiation of some tea group plants. METHODS AND RESULTS: In this study, we analyzed single-nucleotide polymorphisms (SNPs) at the genome level, constructed a phylogenetic tree, analyzed the genetic diversity, and further investigated the population structure of 100 C. tachangensis accessions using the genotyping-by-sequencing (GBS) method. A total of 91,959 high-quality SNPs were obtained. Population structure analysis showed that the 100 C. tachangensis accessions clustered into three groups: YQ-1 (Village Group), YQ-2 (Forest Group) and YQ-3 (Transition Group), which was consistent with the results of the phylogenetic analysis and principal component analysis (PCA). In addition, a comparative analysis of the genetic diversity among the three populations (Forest, Village, and Transition Groups) detected the highest genetic diversity in the Transition Group and the highest differentiation between the Forest and Village Groups. CONCLUSIONS: C. tachangensis plants growing in the forest had different genetic backgrounds from those growing in villages. This study provides a basis for the effective protection and utilization of C. tachangensis populations and lays a foundation for future C. tachangensis breeding.


Subject(s)
Camellia , Genetic Variation , Phylogeny , Polymorphism, Single Nucleotide , Camellia/genetics , Polymorphism, Single Nucleotide/genetics , China , Genetic Variation/genetics , Genetics, Population/methods , Genotype , Principal Component Analysis , Genome, Plant
2.
Genet Sel Evol ; 56(1): 34, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698373

ABSTRACT

Metafounders are a useful concept to characterize relationships within and across populations, and to help genetic evaluations because they help modelling the means and variances of unknown base population animals. Current definitions of metafounder relationships are sensitive to the choice of reference alleles and have not been compared to their counterparts in population genetics, namely heterozygosities, FST coefficients, and genetic distances. We redefine the relationships across populations with an arbitrary base of a maximum heterozygosity population in Hardy-Weinberg equilibrium. Then, the relationship between or within populations is a cross-product of the form $\Gamma_{b,b'} = \frac{2}{n}(2\mathbf{p}_b - \mathbf{1})(2\mathbf{p}_{b'} - \mathbf{1})'$, with $\mathbf{p}$ being vectors of allele frequencies at $n$ markers in populations $b$ and $b'$. This is simply the genomic relationship of two pseudo-individuals whose genotypes are equal to twice the allele frequencies. We also show that this coding is invariant to the choice of reference alleles. In addition, standard population genetics metrics (inbreeding coefficients of various forms; FST differentiation coefficients; segregation variance; and Nei's genetic distance) can be obtained from elements of matrix $\Gamma$.


Subject(s)
Gene Frequency , Genetics, Population , Models, Genetic , Animals , Genetics, Population/methods , Heterozygote , Alleles , Genomics/methods , Genotype , Genome
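
The cross-product described in the abstract above (2/n times the product of the centred frequency vectors 2p - 1 for populations b and b') can be illustrated with a minimal sketch. The function name `gamma_matrix` and the toy allele frequencies are hypothetical and are not taken from the article; the sketch only follows the formula as stated in the abstract.

```python
def gamma_matrix(freqs):
    """Return the matrix of relationships across populations.

    freqs: list of allele-frequency vectors, one per population,
    all scored at the same n markers (hypothetical input layout).
    """
    n = len(freqs[0])
    # Code each population as a pseudo-individual: 2p - 1 per marker.
    coded = [[2.0 * p - 1.0 for p in pop] for pop in freqs]
    # Gamma[b][b'] = (2 / n) * sum over markers of the coded products.
    return [
        [2.0 / n * sum(x * y for x, y in zip(a, b)) for b in coded]
        for a in coded
    ]


# Toy data (assumed): a maximum-heterozygosity population and a fixed one.
p_a = [0.5, 0.5, 0.5, 0.5]
p_b = [1.0, 0.0, 1.0, 0.0]
gamma = gamma_matrix([p_a, p_b])
```

In this toy case the population with all frequencies at 0.5 has a self-relationship of 0, which matches its role as the base, while the population fixed at every marker reaches 2. Swapping the reference allele at a marker (p becomes 1 - p in every population) flips the sign of both coded values, so their product is unchanged.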
3.
Genet Sel Evol ; 56(1): 38, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750427

ABSTRACT

BACKGROUND: The accuracy of genomic prediction is partly determined by the size of the reference population. In Atlantic salmon breeding programs, four parallel populations often exist, thus offering the opportunity to increase the size of the reference set by combining these populations. By allowing a reduction in the number of records per population, multi-population prediction can potentially reduce cost and welfare issues related to the recording of traits, particularly for diseases. In this study, we evaluated the accuracy of multi- and across-population prediction of breeding values for resistance to amoebic gill disease (AGD) using all single nucleotide polymorphisms (SNPs) on a 55K chip or a selected subset of SNPs based on the signs of allele substitution effect estimates across populations, using both linear and nonlinear genomic prediction (GP) models in Atlantic salmon populations. In addition, we investigated genetic distance, genetic correlation estimated based on genomic relationships, and persistency of linkage disequilibrium (LD) phase across these populations. RESULTS: The genetic distance between populations ranged from 0.03 to 0.07, while the genetic correlation ranged from 0.19 to 0.99. Nonetheless, compared to within-population prediction, there was limited or no impact of combining populations for multi-population prediction across the various models used or when using the selected subset of SNPs. The estimates of across-population prediction accuracy were low and to some extent proportional to the genetic correlation estimates. The persistency of LD phase between adjacent markers across populations using all SNP data ranged from 0.51 to 0.65, indicating that LD is poorly conserved across the studied populations. CONCLUSIONS: Our results show that a high genetic correlation and a high genetic relationship between populations do not guarantee a higher prediction accuracy from multi-population genomic prediction in Atlantic salmon.


Subject(s)
Linkage Disequilibrium , Polymorphism, Single Nucleotide , Salmo salar , Animals , Salmo salar/genetics , Genomics/methods , Fish Diseases/genetics , Genetics, Population/methods , Models, Genetic , Breeding/methods , Genome , Disease Resistance/genetics
4.
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38696269

ABSTRACT

This perspective article offers a meditation on FST and other quantities developed by Sewall Wright to describe the population structure, defined as any departure from reproduction through random union of gametes. Concepts related to the F-statistics draw from studies of the partitioning of variation, identity coefficients, and diversity measures. Relationships between the first two approaches have recently been clarified and unified. This essay addresses the third pillar of the discussion: Nei's GST and related measures. A hierarchy of probabilities of identity-by-state provides a description of the relationships among levels of a structured population with respect to genetic diversity. Explicit expressions for the identity-by-state probabilities are determined for models of structured populations undergoing regular inbreeding and recurrent mutation. Levels of genetic diversity within and between subpopulations reflect mutation as well as migration. Accordingly, indices of the population structure are inherently locus-specific, contrary to the intentions of Wright. Some implications of this locus-specificity are explored.


Subject(s)
Genetic Variation , Genetics, Population , Models, Genetic , Genetics, Population/methods , Mutation , Inbreeding
5.
Mol Ecol Resour ; 24(5): e13969, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38747336

ABSTRACT

A major aim of evolutionary biology is to understand why patterns of genomic diversity vary within taxa and space. Large-scale genomic studies of widespread species are useful for studying how environment and demography shape patterns of genomic divergence. Here, we describe one of the most geographically comprehensive surveys of genomic variation in a wild vertebrate to date; the great tit (Parus major) HapMap project. We screened ca 500,000 SNP markers across 647 individuals from 29 populations, spanning ~30 degrees of latitude and 40 degrees of longitude - almost the entire geographical range of the European subspecies. Genome-wide variation was consistent with a recent colonisation across Europe from a South-East European refugium, with bottlenecks and reduced genetic diversity in island populations. Differentiation across the genome was highly heterogeneous, with clear 'islands of differentiation', even among populations with very low levels of genome-wide differentiation. Low local recombination rates were a strong predictor of high local genomic differentiation (FST), especially in island and peripheral mainland populations, suggesting that the interplay between genetic drift and recombination causes highly heterogeneous differentiation landscapes. We also detected genomic outlier regions that were confined to one or more peripheral great tit populations, probably as a result of recent directional selection at the species' range edges. Haplotype-based measures of selection were related to recombination rate, albeit less strongly, and highlighted population-specific sweeps that likely resulted from positive selection. Our study highlights how comprehensive screens of genomic variation in wild organisms can provide unique insights into spatio-temporal evolutionary dynamics.


Subject(s)
Genetic Variation , Polymorphism, Single Nucleotide , Songbirds , Animals , Songbirds/genetics , Songbirds/classification , Genetics, Population/methods , Europe , Passeriformes/genetics , Passeriformes/classification , Haplotypes/genetics , Recombination, Genetic , Selection, Genetic
6.
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38743590

ABSTRACT

Studying range expansions is central for understanding genetic variation through space and time as well as for identifying refugia and biological invasions. Range expansions are characterized by serial founder events causing clines of decreasing genetic diversity away from the center of origin and asymmetries in the two-dimensional allele frequency spectra. These asymmetries, summarized by the directionality index (ψ), are sensitive to range expansions and persist for longer than clines in genetic diversity. In continuous and finite meta-populations, genetic drift tends to be stronger at the edges of the species distribution in equilibrium populations and populations undergoing range expansions alike. Such boundary effects are expected to affect geographic patterns in genetic diversity and ψ. Here we demonstrate that boundary effects cause high false positive rates in equilibrium meta-populations when testing for range expansions. In the simulations, the absolute value of ψ (|ψ|) in equilibrium data sets was proportional to the fixation index (FST). By fitting signatures of range expansions as a function of ɛ |ψ|/FST and geographic clines in ψ, strong evidence for range expansions could be detected in data from a recent rapid invasion of the cane toad, Rhinella marina, in Australia, but not in 28 previously published empirical data sets from Australian scincid lizards that were significant for the standard range expansion tests. Thus, while clinal variation in ψ is still the most sensitive statistic to range expansions, to detect true signatures of range expansions in natural populations, its magnitude needs to be considered in relation to the overall levels of genetic structuring in the data.


Subject(s)
Genetics, Population , Animals , Genetics, Population/methods , Models, Genetic , Genetic Variation , Introduced Species , Australia , Genetic Drift , Gene Frequency , Founder Effect
7.
PeerJ ; 12: e17248, 2024.
Article in English | MEDLINE | ID: mdl-38666077

ABSTRACT

Whereas undetected species contribute to estimation of species diversity, undetected alleles have not been used to estimate genetic diversity. Although random sampling guarantees unbiased estimation of allele frequency and genetic diversity measures, using undetected alleles may provide biased but more precise estimators useful for conservation. We newly devised kernel density estimation (KDE) for allele frequency including undetected alleles and tested it in estimation of allele frequency and nucleotide diversity using populations generated by coalescent simulation as well as real population data. Contrary to expectations, nucleotide diversity estimated by KDE had worse bias and accuracy. Allele frequency estimated by KDE was also worse except when the sample size was small. These results might be due to the finiteness of the population and/or the curse of dimensionality. In conclusion, KDE of allele frequency does not contribute to genetic diversity estimation.


Subject(s)
Alleles , Gene Frequency , Genetic Variation , Genetic Variation/genetics , Humans , Models, Genetic , Computer Simulation , Genetics, Population/methods
8.
PLoS Genet ; 20(4): e1011249, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38669290

ABSTRACT

Polygenic scores (PGS) are measures of genetic risk, derived from the results of genome-wide association studies (GWAS). Previous work has proposed the coefficient of determination (R²) as an appropriate measure by which to compare PGS performance in a validation dataset. Here we propose correlation-based methods for evaluating PGS performance by adapting previous work which produced a statistical framework and robust test statistics for the comparison of multiple correlation measures in multiple populations. This flexible framework can be extended to a wider variety of hypothesis tests than currently available methods. We assess our proposed method in simulation and demonstrate its utility with two examples, assessing previously developed PGS for low-density lipoprotein cholesterol and height in multiple populations in the All of Us cohort. Finally, we provide an R package 'coranova' with both parametric and nonparametric implementations of the described methods.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Multifactorial Inheritance/genetics , Genome-Wide Association Study/methods , Cholesterol, LDL/blood , Cholesterol, LDL/genetics , Genetic Predisposition to Disease , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Body Height/genetics , Computer Simulation , Genetics, Population/methods
9.
Genes (Basel) ; 15(4), 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38674444

ABSTRACT

The inference of biogeographical ancestry (BGA) can assist in police investigations of serious crime cases and help to identify missing people and victims of mass disasters. In this study, we evaluated the typing performance of 56 ancestry-informative SNPs in 177 samples using the ForenSeq™ DNA Signature Prep Kit on the MiSeq FGx system. Furthermore, we compared the prediction accuracy of the tools Universal Analysis Software v1.2 (UAS), FROG-kb, and GenoGeographer when inferring the ancestry of 503 Europeans, 22 non-Europeans, and 5 individuals with co-ancestry. The kit was highly sensitive, with complete aiSNP profiles in samples with as little as 250 pg of input DNA. However, in line with others, we observed low read depth and occasional drop-out in some SNPs. Therefore, we suggest not using less than the recommended 1 ng of input DNA. FROG-kb and GenoGeographer accurately predicted both Europeans (99.6% and 91.8% correct, respectively) and non-Europeans (95.4% and 90.9% correct, respectively). The UAS was highly accurate when predicting Europeans (96.0% correct) but performed worse when predicting non-Europeans (40.9% correct). None of the tools was able to correctly predict individuals with co-ancestry. Our study demonstrates that the use of multiple prediction tools will increase the prediction accuracy of BGA inference in forensic casework.


Subject(s)
DNA Fingerprinting , Polymorphism, Single Nucleotide , Humans , Polymorphism, Single Nucleotide/genetics , DNA Fingerprinting/methods , Forensic Genetics/methods , Software , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , White People/genetics , Genetics, Population/methods , DNA/genetics
10.
Proc Natl Acad Sci U S A ; 121(19): e2315780121, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38687793

ABSTRACT

Measuring inbreeding and its consequences on fitness is central for many areas in biology including human genetics and the conservation of endangered species. However, there is no consensus on the best method, neither for quantification of inbreeding itself nor for the model to estimate its effect on specific traits. We simulated traits based on simulated genomes from a large pedigree and empirical whole-genome sequences of human data from populations with various sizes and structures (from the 1000 Genomes Project). We compare the ability of various inbreeding coefficients (F) to quantify the strength of inbreeding depression: allele-sharing, two versions of the correlation of uniting gametes which differ in the weight they attribute to each locus, and two identical-by-descent segments-based estimators. We also compare two models: the standard linear model and a linear mixed model (LMM) including a genetic relatedness matrix (GRM) as random effect to account for the nonindependence of observations. We find LMMs give better results in scenarios with population or family structure. Within the LMM, we compare three different GRMs and show that in homogeneous populations, there is little difference among the different F and GRM for inbreeding depression quantification. However, as soon as a strong population or family structure is present, the strength of inbreeding depression can be most efficiently estimated only if i) the phenotypes are regressed on F based on a weighted version of the correlation of uniting gametes, giving more weight to common alleles and ii) with the GRM obtained from an allele-sharing relatedness estimator.


Subject(s)
Inbreeding Depression , Models, Genetic , Humans , Pedigree , Genetics, Population/methods , Inbreeding , Alleles
11.
Mol Ecol Resour ; 24(5): e13960, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38676702

ABSTRACT

There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage palaeogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1 to 50 K, with minor allele frequency ≥0.01) of shared SNPs are available. The performance of all four tools was comparable using ≥20 K SNPs. We found that first-degree related pairs can be accurately classified even with 1 K SNPs, with 85% F1 scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (F1 > 90%) with 5 K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69 and 79% respectively). Meanwhile, noise in population allele frequencies and inbreeding (first-cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low-coverage genomes.


Subject(s)
Benchmarking , Pedigree , Polymorphism, Single Nucleotide , Benchmarking/methods , Humans , Gene Frequency , DNA, Ancient/analysis , Computer Simulation , Genetics, Population/methods , Computational Biology/methods
12.
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38630635

ABSTRACT

Bayesian coalescent skyline plot models are widely used to infer demographic histories. The first (non-Bayesian) coalescent skyline plot model assumed a known genealogy as data, while subsequent models and implementations jointly inferred the genealogy and demographic history from sequence data, including heterochronous samples. Overall, there exist multiple different Bayesian coalescent skyline plot models which mainly differ in two key aspects: (i) how changes in population size are modeled through independent or autocorrelated prior distributions, and (ii) how many change-points in the demographic history are used, where they occur and if the number is pre-specified or inferred. The specific impact of each of these choices on the inferred demographic history is not known because of two reasons: first, not all models are implemented in the same software, and second, each model implementation makes specific choices that the biologist cannot influence. To facilitate a detailed evaluation of Bayesian coalescent skyline plot models, we implemented all currently described models in a flexible design into the software RevBayes. Furthermore, we evaluated models and choices on an empirical dataset of horses supplemented by a small simulation study. We find that estimated demographic histories can be grouped broadly into two groups depending on how change-points in the demographic history are specified (either independent of or at coalescent events). Our simulations suggest that models using change-points at coalescent events produce spurious variation near the present, while most models using independent change-points tend to over-smooth the inferred demographic history.


Subject(s)
Bayes Theorem , Genetics, Population , Models, Genetic , Animals , Genetics, Population/methods , Horses , Population Density , Computer Simulation , Software , Demography
13.
Mol Biol Rep ; 51(1): 584, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38683231

ABSTRACT

BACKGROUND: Sugar beet (Beta vulgaris L.) is an important crop cultivated globally for sugar production. The genetic diversity present in sugar beet accessions plays a crucial role in crop improvement programs. METHODS AND RESULTS: In the present study, we collected 96 sugar beet accessions from different regions and extracted DNA from their leaves. Genomic DNA was amplified using SCoT primers, and the resulting fragments were separated by gel electrophoresis. We analyzed the data using various genetic diversity indices, inferred population structure with STRUCTURE, applied the unweighted pair-group method with arithmetic mean (UPGMA), and conducted principal coordinate analysis (PCoA). The results revealed a high level of genetic diversity among the sugar beet accessions, with 265 bands produced by the 10 SCoT primers used. The percentage of polymorphic bands was 97.60%, indicating substantial genetic variation. The study uncovered significant genetic variation, leading to higher values for overall gene diversity (0.21), genetic distance (0.517), number of effective alleles (1.36), Shannon's information index (0.33), and polymorphism information content (0.239). The analysis of molecular variance suggested a considerable amount of genetic variation, with 89% existing within the population. Using STRUCTURE and UPGMA analysis, the sugar beet germplasm was divided into two major populations. STRUCTURE analysis partitioned the germplasm based on the origin and domestication history of sugar beet, resulting in accessions from neighboring countries clustering together. CONCLUSION: The use of SCoT markers revealed a noteworthy degree of genetic variation within the sugar beet germplasm in this study. These findings can be used in future breeding programs with the objective of enhancing both sugar beet yield and quality.


Subject(s)
Beta vulgaris , Genetic Variation , Beta vulgaris/genetics , Genetic Variation/genetics , Genetic Markers , Polymorphism, Genetic , Phylogeny , Genetics, Population/methods , Alleles , Plant Breeding/methods , DNA, Plant/genetics
14.
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38636507

ABSTRACT

Inferring past demographic history of natural populations from genomic data is of central concern in many studies across research fields. Previously, our group had developed dadi, a widely used demographic history inference method based on the allele frequency spectrum (AFS) and maximum composite-likelihood optimization. However, dadi's optimization procedure can be computationally expensive. Here, we present donni (demography optimization via neural network inference), a new inference method based on dadi that is more efficient while maintaining comparable inference accuracy. For each dadi-supported demographic model, donni simulates the expected AFS for a range of model parameters then trains a set of Mean Variance Estimation neural networks using the simulated AFS. Trained networks can then be used to instantaneously infer the model parameters from future genomic data summarized by an AFS. We demonstrate that for many demographic models, donni can infer some parameters, such as population size changes, very well and other parameters, such as migration rates and times of demographic events, fairly well. Importantly, donni provides both parameter and confidence interval estimates from input AFS with accuracy comparable to parameters inferred by dadi's likelihood optimization while bypassing its long and computationally intensive evaluation process. donni's performance demonstrates that supervised machine learning algorithms may be a promising avenue for developing more sustainable and computationally efficient demographic history inference methods.


Subject(s)
Gene Frequency , Models, Genetic , Supervised Machine Learning , Genetics, Population/methods , Neural Networks, Computer , Humans
15.
Hum Genet ; 143(3): 371-383, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38499885

ABSTRACT

Massively parallel sequencing (MPS) has emerged as a promising technology for targeting multiple genetic loci simultaneously in forensic genetics. Here, a novel 193-plex panel was designed to target 28 A-STRs, 41 Y-STRs, 21 X-STRs, 3 sex-identified loci, and 100 A-SNPs by employing a single-end 400 bp sequencing strategy on the MGISEQ-2000™ platform. In the present study, a series of validations and sequencing of 1642 population samples were performed to evaluate the overall performance of the MPS-based panel and its practicality in forensic application according to the SWGDAM guidelines. In general, the 193-plex markers in our panel showed good performance in terms of species specificity, stability, and repeatability. Compared to commercial kits, this panel achieved 100% concordance for standard gDNA and 99.87% concordance for 14,560 population genotypes. Moreover, this panel detected 100% of the loci from 0.5 ng of DNA template and all unique alleles at a 1:4 DNA mixture ratio (0.2 ng minor contributor), and the applicability of the proposed approach to trace and degraded DNA was further supported by case samples. In addition, several forensic parameters of STRs and SNPs were calculated in a population study. High CPE and CPD values greater than 0.9999999 were clearly demonstrated, and these results could be useful references for the application of this panel in individual identification and paternity testing. Overall, this 193-plex MPS panel has been shown to be a reliable, repeatable, robust, inexpensive, and powerful tool sufficient for forensic practice.


Subject(s)
Forensic Genetics , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Paternity , Polymorphism, Single Nucleotide , Humans , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats/genetics , Forensic Genetics/methods , Male , Female , Genotype , Alleles , Genetics, Population/methods
16.
Genes Genet Syst ; 99, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38556272

ABSTRACT

Primula secundiflora is an insect-pollinated, perennial herb belonging to the section Proliferae (Primulaceae) that exhibits considerable variation in its mating system, with predominantly outcrossing populations comprising long-styled and short-styled floral morphs and selfing populations comprising only homostyles. To facilitate future investigations of the population genetics and mating patterns of this species, we developed 25 microsatellite markers from P. secundiflora using next-generation sequencing and measured polymorphism and genetic diversity in a sample of 30 individuals from three natural populations. The markers displayed high polymorphism, with the number of observed alleles per locus ranging from three to 16 (mean = 8.36). The observed and expected heterozygosities ranged from 0.100 to 1.000 and 0.145 to 0.843, respectively. Twenty-one of the loci were also successfully amplified in P. denticulata. These microsatellite markers should provide powerful tools for investigating patterns of population genetic diversity and the evolutionary relationships between distyly and homostyly in P. secundiflora.


Subject(s)
Microsatellite Repeats , Polymorphism, Genetic , Primula , Primula/genetics , High-Throughput Nucleotide Sequencing/methods , Alleles , Genetics, Population/methods
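
The observed and expected heterozygosities reported in the abstract above can be illustrated with a minimal sketch for one microsatellite locus. The function name and the toy genotypes (allele sizes in bp) are hypothetical, and the expected heterozygosity here is the plain 1 minus the sum of squared allele frequencies, without a small-sample correction; the abstract does not state which estimator the authors used.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele, allele) tuples, one per individual.
    """
    n = len(genotypes)
    # Observed: share of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Toy genotypes (assumed), not data from the study.
genotypes = [(120, 124), (120, 120), (124, 128), (120, 124)]
ho, he = heterozygosity(genotypes)
```

With these four individuals, three are heterozygous, so the observed value is 0.75, while the allele counts (4, 3, 1 out of 8) give an expected value of 0.59375.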
17.
Leg Med (Tokyo) ; 68: 102416, 2024 May.
Article in English | MEDLINE | ID: mdl-38325234

ABSTRACT

X-chromosome short tandem repeats (X-STRs) are useful for human identification, especially in complex kinship scenarios. Since forensic statistical parameters vary among populations and X-STR population data for the diverse population of Peninsular Malaysia are unavailable, this study of Malaysian Indians (n = 201) is forensically relevant for supporting the evidential value of the 12 X-STR markers for human identification in Malaysia. The Qiagen Investigator® Argus X-12 QS kit showed that DXS10135 was the most polymorphic locus, with high genetic diversity, polymorphism information content, heterozygosity, and exclusion power. Based on allele frequencies, the strength of discrimination and mean exclusion chance (MECKrüger, MECKishida, MECDesmarais, and MECDesmaraisDuo) values for the Malaysian Indians were ≥0.999997790686228. As for haplotype frequencies, the overall discrimination power and mean exclusion probability (MECKrüger, MECKishida, MECDesmarais, and MECDesmaraisDuo) were ≥0.9999984801951. The genetic distance, neighbor-joining phylogenetic tree, and principal component analysis also supported the evidential value of the 12 X-STR markers for practical forensic casework in Malaysia.


Subject(s)
Chromosomes, Human, X , Gene Frequency , Genetic Variation , Microsatellite Repeats , Humans , Malaysia , Microsatellite Repeats/genetics , Chromosomes, Human, X/genetics , Genetics, Population/methods , Forensic Genetics/methods , India , Genetic Markers , DNA Fingerprinting/methods , Male , Haplotypes , Female , Polymorphism, Genetic
18.
J Genet ; 103, 2024.
Article in English | MEDLINE | ID: mdl-38258299

ABSTRACT

Fixation index (Fst) statistics provide critical insights into evolutionary processes affecting the structure of genetic variation within and among populations. Fst statistics have been widely applied in population and evolutionary genetics to identify genomic regions targeted by selection pressures. The FSTest 1.3 software was developed to estimate four Fst statistics (those of Hudson, Weir and Cockerham, Nei, and Wright) using high-throughput genotyping or sequencing data. Here, we introduce FSTest 1.3 and compare its performance with two widely used programs, VCFtools 0.1.16 and PLINK 2.0. Chromosome 1 of the 1000 Genomes Phase III variant data belonging to South Asian (n = 211) and African (n = 274) populations was included as an example case in this study. Different Fst estimates were calculated for each single-nucleotide polymorphism (SNP) in a pairwise comparison of South Asian against African populations, and the results of FSTest 1.3 were confirmed by VCFtools 0.1.16 and PLINK 2.0. Two different sliding window approaches, one based on a fixed number of SNPs and another based on a fixed number of base pairs (bp), were conducted using FSTest 1.3 and VCFtools 0.1.16. Our results showed that regions with low-coverage genotypic data could lead to an overestimation of Fst in sliding window analysis using a fixed number of bp. FSTest 1.3 could mitigate this challenge by estimating the average of consecutive SNPs along the chromosome. FSTest 1.3 allows direct analysis of VCF files with a small amount of code and can calculate Fst estimates on a desktop computer for more than a million SNPs in a few minutes. FSTest 1.3 is freely available at https://github.com/similab/FSTest.


Subject(s)
African People , Chromosomes, Human, Pair 1 , Genetic Variation , Genetics, Population , South Asian People , Humans , Asian People/genetics , Biological Evolution , Chromosomes, Human, Pair 1/genetics , Genomics , Genotype , Genetics, Population/methods , Genetics, Population/statistics & numerical data , South Asian People/genetics , African People/genetics , Genetic Variation/genetics
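
The idea of a per-SNP Fst followed by averaging over a fixed number of consecutive SNPs, as described in the abstract above, can be sketched as follows. This is not FSTest's code or interface: the function names are hypothetical, the estimator is the textbook Wright form (H_T - H_S) / H_T for a biallelic SNP with two equally weighted populations, and windows are non-overlapping for brevity.

```python
def wright_fst(p1, p2):
    """Wright's Fst for one biallelic SNP from two allele frequencies.

    Returns None when the SNP is monomorphic across both populations.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # total expected heterozygosity
    if h_t == 0.0:
        return None
    # Mean within-population expected heterozygosity.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def snp_windows(values, size):
    """Mean Fst over non-overlapping windows of `size` informative SNPs."""
    kept = [v for v in values if v is not None]
    return [
        sum(kept[i:i + size]) / len(kept[i:i + size])
        for i in range(0, len(kept), size)
    ]


# Toy allele frequencies (assumed) for five SNPs in two populations.
pop1 = [0.5, 1.0, 0.0, 0.2, 0.5]
pop2 = [0.5, 0.0, 0.0, 0.8, 0.5]
per_snp = [wright_fst(a, b) for a, b in zip(pop1, pop2)]
windows = snp_windows(per_snp, size=2)
```

Because each window here always holds the same number of informative SNPs, a sparsely genotyped stretch cannot dominate a window the way it can when windows are defined by a fixed number of base pairs.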
19.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38261339

ABSTRACT

Various methods have been proposed to reconstruct admixture histories by analyzing the length of ancestral chromosomal tracts, such as estimating the admixture time and number of admixture events. However, available methods do not explicitly consider the complex admixture structure, which characterizes the joining and mixing patterns of different ancestral populations during the admixture process, and instead assume a simplified one-by-one sequential admixture model. In this study, we proposed a novel approach that considers the non-sequential admixture structure to reconstruct admixture histories. Specifically, we introduced a hierarchical admixture model that incorporated four ancestral populations and developed a new method, called HierarchyMix, which uses the length of ancestral tracts and the number of ancestry switches along genomes to reconstruct the four-way admixture history. By automatically selecting the optimal admixture model using the Bayesian information criterion principles, HierarchyMix effectively estimates the corresponding admixture parameters. Simulation studies confirmed the effectiveness and robustness of HierarchyMix. We also applied HierarchyMix to Uyghurs and Kazakhs, enabling us to reconstruct the admixture histories of Central Asians. Our results highlight the importance of considering complex admixture structures and demonstrate that HierarchyMix is a useful tool for analyzing complex admixture events.


Subject(s)
Central Asian People , Genetics, Population , Humans , Bayes Theorem , Central Asian People/genetics , Computer Simulation , Chromosomes/genetics , Genetics, Population/methods
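
Model selection by the Bayesian information criterion, which the abstract above mentions for choosing among admixture models, can be sketched generically. This is not HierarchyMix's implementation: the model names, log-likelihoods, parameter counts, and number of observations are all hypothetical placeholders, and only the standard definition BIC = k ln(n) - 2 ln(L) is used.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_model(candidates, n_obs):
    """Pick the candidate with the lowest BIC.

    candidates: dict mapping a model name to (log_likelihood, n_params).
    """
    scores = {
        name: bic(log_lik, k, n_obs)
        for name, (log_lik, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical fits of two candidate models to the same 500 observations.
candidates = {"model_a": (-1050.0, 4), "model_b": (-1040.0, 6)}
best, scores = select_model(candidates, n_obs=500)
```

In this toy comparison the larger model is preferred because its gain in log-likelihood outweighs the penalty for its two extra parameters.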
20.
Leg Med (Tokyo) ; 58: 102082, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35584562

ABSTRACT

Allele frequencies for 31 autosomal short tandem repeat (STR) loci (CSF1PO, D10S1248, D12ATA63, D12S391, D13S317, D14S1434, D16S539, D18S51, D19S433, D1S1656, D1S1677, D21S11, D22S1045, D2S1338, D2S1776, D2S441, D3S1358, D3S4529, D4S2408, D5S2800, D5S818, D6S1043, D6S474, D7S820, D8S1179, FGA, Penta D, Penta E, TH01, TPOX, and vWA) were obtained using Precision ID GlobalFiler NGS STR Panel v2 from 82 unrelated individuals sampled from the Japanese population. Autosomal STR alleles designated by NGS and conventional capillary electrophoresis were found to be concordant except at D2S441 allele 9.


Subject(s)
DNA Fingerprinting , Gene Frequency , Genetics, Population , Microsatellite Repeats , DNA Fingerprinting/methods , Gene Frequency/genetics , Genetics, Population/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Japan , Microsatellite Repeats/genetics