Results 1 - 20 of 3,778
1.
BMC Genomics ; 25(1): 475, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38745120

ABSTRACT

BACKGROUND: Single nucleotide polymorphism (SNP) markers play significant roles in accelerating breeding and basic crop research. Several soybean SNP panels have been developed. However, there is still a lack of SNP panels for differentiating between wild and cultivated populations, as well as for detecting polymorphisms within both wild and cultivated populations. RESULTS: This study utilized publicly available resequencing data from over 3,000 soybean accessions to identify differentiating and highly conserved SNP and insertion/deletion (InDel) markers between wild and cultivated soybean populations. Additionally, a naturally occurring mutant gene library was constructed by analyzing large-effect SNPs and InDels in the population. CONCLUSION: The markers obtained in this study are associated with numerous genes governing agronomic traits, thus facilitating the evaluation of soybean germplasms and the efficient differentiation between wild and cultivated soybeans. The natural mutant gene library permits the quick identification of individuals with natural mutations in functional genes, providing convenience for accelerating soybean breeding using reverse genetics.


Subject(s)
Glycine max , INDEL Mutation , Polymorphism, Single Nucleotide , Glycine max/genetics , Genome, Plant , Gene Library , Plant Breeding
2.
Genes (Basel) ; 15(5)2024 May 07.
Article in English | MEDLINE | ID: mdl-38790221

ABSTRACT

Early-onset breast cancer (EoBC), defined by a diagnosis <40 years of age, is associated with poor prognosis. This study investigated the mutational landscape of non-metastatic EoBC and the prognostic relevance of mutational signatures using 100 tumour samples from Alberta, Canada. The MutationalPatterns package in R/Bioconductor was used to extract de novo single-base substitution (SBS) and insertion-deletion (indel) mutational signatures and to fit COSMIC SBS and indel signatures. We assessed associations between these signatures and clinical characteristics of disease, in addition to recurrence-free (RFS) and overall survival (OS). Five SBS and two indel signatures were extracted. The SBS13-like signature had higher relative contributions in the HER2-enriched subtype. Patients with higher than median contribution tended to have better RFS after adjustment for other prognostic factors (HR = 0.29; 95% CI: 0.08-1.06). An unsupervised clustering algorithm based on absolute contribution revealed three clusters of fitted COSMIC SBS signatures, but cluster membership was not associated with clinical variables or survival outcomes. The results of this exploratory study reveal various SBS and indel signatures may be associated with clinical features of disease and prognosis. Future studies with larger samples are required to better understand the mechanistic underpinnings of disease progression and treatment response in EoBC.


Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Breast Neoplasms/mortality , Adult , Prognosis , Age of Onset , Mutation , INDEL Mutation , Biomarkers, Tumor/genetics , Alberta/epidemiology , Middle Aged
3.
Genes (Basel) ; 15(5)2024 May 11.
Article in English | MEDLINE | ID: mdl-38790244

ABSTRACT

BACKGROUND: Leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation (LBSL) is an inherited disease caused by pathogenic biallelic variants in the gene DARS2, which encodes mitochondrial aspartyl-tRNA synthetase. This disease is characterized by slowly progressive spastic gait, cerebellar symptoms, and leukoencephalopathy with brainstem and spinal cord involvement. CASE PRESENTATION: Peripheral blood samples were collected from four patients from four unrelated families to extract genomic DNA. All patients underwent partial exon analysis of the DARS2 gene using Sanger sequencing, which detected the c.228-21_228-20delinsC variant in a heterozygous state. DNA from three patients was further analyzed using a next-generation sequencing-based custom AmpliSeq™ panel for 59 genes associated with leukodystrophies, and one of the patients underwent whole genome sequencing. We identified a novel pathogenic variant c.1675-1256_*115delinsGCAACATTTCGGCAACATTCCAACC in the DARS2 gene. Three patients (patients 1, 2, and 4) had slowly progressive cerebellar ataxia, and two patients (patients 1 and 2) had spasticity. In addition, two patients (patients 2 and 4) showed signs of axonal neuropathy, such as decreased tendon reflexes and loss of distal sensitivity. Three patients (patients 1, 2, and 3) also had learning difficulties. Notably, characteristic changes on brain MRI were persistently present in all patients, which underscores the importance of MRI as the main diagnostic tool for suspecting and subsequently confirming LBSL. CONCLUSIONS: We found a novel indel variant in the DARS2 gene in four patients with LBSL and described their clinical and genetic characteristics. These results expand the mutational spectrum of LBSL and may help improve the laboratory diagnosis of this form of leukodystrophy.


Subject(s)
Aspartate-tRNA Ligase , INDEL Mutation , Leukoencephalopathies , Humans , Aspartate-tRNA Ligase/genetics , Aspartate-tRNA Ligase/deficiency , Male , Leukoencephalopathies/genetics , Leukoencephalopathies/pathology , Female , Brain Stem/pathology , Brain Stem/diagnostic imaging , Child , Lactic Acid/blood , Russia , Adult , Spinal Cord/pathology , Spinal Cord/diagnostic imaging , Adolescent , Mitochondrial Diseases
4.
Int J Mol Sci ; 25(10)2024 May 10.
Article in English | MEDLINE | ID: mdl-38791234

ABSTRACT

Carbon ion beam (CIB) irradiation is a physical mutagen that induces mutations at high frequency and is user-friendly and environmentally friendly for plant breeding. In this study, we resequenced eight mutant lines that were screened from the progeny of CIB-irradiated dehulled rice seeds. Among these mutants, CIB induced 135,535 variations, including single base substitutions (SBSs) and small insertions and deletions (InDels). SBSs were the most abundant mutation type, accounting for 88% of all variations. Single base conversion was the main type of SBS, the average transition-to-transversion ratio was 1.29, and more than half of the InDels were short (1-2 bp). A total of 69.2% of the SBSs and InDels induced by CIB occurred in intergenic regions of the genome. Surprisingly, the average mutation frequency in our study was 9.8 × 10⁻⁵/bp, much higher than in previous studies, which may result from the relatively high irradiation dosage and the dehulling of seeds before irradiation. By analyzing the mutations in every 1 Mb of the genome of each mutant line, we found some unusual high-frequency (HF) mutation regions, where SBSs and InDels colocalized. This study revealed the genome-level mutation profile of dehulled rice seeds after CIB irradiation, which will enrich our understanding of the mutation mechanism of CIB radiation and improve mutagenesis efficiency.
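
The two summary statistics quoted in this abstract, the transition-to-transversion ratio and the per-base mutation frequency, can be illustrated with a minimal Python sketch. The substitution list and the callable-genome size below are hypothetical; the abstract does not give the underlying counts or the pipeline the authors used.

```python
# A transition stays within a base class (purine<->purine or
# pyrimidine<->pyrimidine); a transversion crosses between classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """Return True if ref -> alt is a transition (A<->G or C<->T)."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(substitutions) -> float:
    """Transition/transversion ratio for an iterable of (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in substitutions:
        if ref.upper() == alt.upper():
            continue  # not a substitution; ignore
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


def mutation_frequency(n_mutations: int, callable_bases: int) -> float:
    """Mutations per base pair over the portion of the genome that was assessed."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / callable_bases


# Hypothetical data: three transitions and two transversions.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(example))
print(mutation_frequency(98, 1_000_000))
```

A ratio above 1, such as the 1.29 reported here, means transitions outnumber transversions.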


Subject(s)
Genome, Plant , Mutation , Oryza , Seeds , Oryza/genetics , Oryza/radiation effects , Seeds/genetics , Seeds/radiation effects , Carbon , INDEL Mutation , Heavy Ions
5.
Funct Integr Genomics ; 24(3): 104, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764005

ABSTRACT

Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines AF of distinct indels within a given region and provides "regional AF" (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK BioBank (UKBB; n=469,835). By comparing rAF against standard AF (sAF), we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as "rAF-hi" indels. Notably, a high percentage of rare indels were "rAF-hi", with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9%, depending on the CRAFTS-indels parameters). Analysis of the overlap of regions based on their rAF with low complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels in gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB and used by ClinVar to revise indel classifications.
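
The idea of a "regional AF" can be sketched in a few lines of Python. This is an illustration only, not the CRAFTS-indels implementation: the abstract does not say how the region is defined or how frequencies are combined, so the fixed window, the simple summation, and all names below (`Indel`, `regional_af`, `is_raf_hi`, `window`) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int            # 1-based start position
    allele_count: int   # alleles carrying this indel
    allele_number: int  # total alleles genotyped at this site


def standard_af(v: Indel) -> float:
    """Standard per-variant allele frequency (sAF)."""
    return v.allele_count / v.allele_number


def regional_af(target: Indel, indels, window: int = 10) -> float:
    """ASSUMPTION: sum the AFs of all indels within +/- window bp of the target.

    The target itself is included if it appears in `indels`.
    """
    total = 0.0
    for v in indels:
        same_region = v.chrom == target.chrom and abs(v.pos - target.pos) <= window
        if same_region and v.allele_number > 0:
            total += standard_af(v)
    return min(total, 1.0)


def is_raf_hi(target: Indel, indels, window: int = 10, threshold: float = 1e-4) -> bool:
    """Flag rare indels (sAF <= threshold) whose regional AF exceeds the threshold."""
    return standard_af(target) <= threshold and regional_af(target, indels, window) > threshold


# Hypothetical cohort: two rare indels 3 bp apart and one more common, distant indel.
cohort = [
    Indel("1", 100, 5, 100_000),
    Indel("1", 103, 8, 100_000),
    Indel("1", 500, 50, 100_000),
]
print(is_raf_hi(cohort[0], cohort))
print(is_raf_hi(cohort[2], cohort))
```

In this toy cohort the first indel is rare on its own but sits next to a second rare indel, so its regional frequency crosses the 10⁻⁴ threshold, which is the pattern the abstract labels "rAF-hi".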


Subject(s)
Gene Frequency , INDEL Mutation , Humans , Algorithms
6.
Theor Appl Genet ; 137(6): 136, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764078

ABSTRACT

KEY MESSAGE: Different kinship and resistance to cotton leaf curl disease (CLCuD) and heat were found between upland cotton cultivars from China and Pakistan. A total of 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD, and heat resistance were identified. Elite alleles found in Pakistani accessions aided local adaptation to the climatic conditions of the two countries. Adaptation of upland cotton (Gossypium hirsutum) beyond its center of origin is expected to be driven by tailoring of the genome and genes to enhance yield and quality in new ecological niches. Here, resequencing of 456 upland cotton accessions revealed two distinct kinships according to the associated country. Fiber quality and lint percentage were consistent across kinships, but resistance to cotton leaf curl disease (CLCuD) and heat was distinctly exhibited by accessions from Pakistan, illustrating highly local adaptation. A total of 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD and heat resistance were identified; among them, only two overlapped between Pakistani and Chinese accessions, underscoring the divergent domestication and improvement targets in each country. Loci associated with resistance alleles to leaf curl disease and high temperature were largely found in Pakistani accessions to counter these stresses prevalent in Pakistan. These results revealed that breeding activities led to the accumulation of unique alleles and helped upland cotton become adapted to the respective climatic conditions, which will contribute to elucidating the genetic mechanisms that underlie resilience traits and help develop climate-resilient cotton cultivars for use worldwide.


Subject(s)
Gossypium , Polymorphism, Single Nucleotide , Gossypium/genetics , Pakistan , China , Disease Resistance/genetics , Plant Diseases/genetics , INDEL Mutation , Adaptation, Physiological/genetics , Genome, Plant , Alleles , Plant Breeding , Cotton Fiber , Phenotype
7.
Orphanet J Rare Dis ; 19(1): 209, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773661

ABSTRACT

BACKGROUND: Marfan syndrome (MFS) is an autosomal dominant connective tissue disease with wide clinical heterogeneity, and mainly caused by pathogenic variants in fibrillin-1 (FBN1). METHODS: A Chinese 4-generation MFS pedigree with 16 family members was recruited and exome sequencing (ES) was performed in the proband. Transcript analysis (patient RNA and minigene assays) and in silico structural analysis were used to determine the pathogenicity of the variant. In addition, germline mosaicism in family member (Ι:1) was assessed using quantitative fluorescent polymerase chain reaction (QF-PCR) and short tandem repeat PCR (STR) analyses. RESULTS: Two cis-compound benign intronic variants of FBN1 (c.3464-4 A > G and c.3464-5G > A) were identified in the proband by ES. As a compound variant, c.3464-5_3464-4delGAinsAG was found to be pathogenic and co-segregated with MFS. RNA studies indicated that aberrant transcripts were found only in patients and mutant-type clones. The variant c.3464-5_3464-4delGAinsAG caused erroneous integration of a 3 bp sequence into intron 28 and resulted in the insertion of one amino acid in the protein sequence (p.Ile1154_Asp1155insAla). Structural analyses suggested that p.Ile1154_Asp1155insAla affected the protein's secondary structure by interfering with one disulfide bond between Cys1140 and Cys1153 and causing the extension of an anti-parallel ß sheet in the calcium-binding epidermal growth factor-like (cbEGF)13 domain. In addition, the asymptomatic family member Ι:1 was deduced to be a gonadal mosaic as assessed by inconsistent results of sequencing and STR analysis. CONCLUSIONS: To our knowledge, FBN1 c.3464-5_3464-4delGAinsAG is the first identified pathogenic intronic indel variant affecting non-canonical splice sites in this gene. Our study reinforces the importance of assessing the pathogenic role of intronic variants at the mRNA level, with structural analysis, and the occurrence of mosaicism.


Subject(s)
Fibrillin-1 , Introns , Marfan Syndrome , Mosaicism , Pedigree , Humans , Fibrillin-1/genetics , Marfan Syndrome/genetics , Marfan Syndrome/pathology , Female , Male , Adult , Introns/genetics , INDEL Mutation/genetics , Middle Aged , Adipokines
8.
PLoS One ; 19(5): e0302870, 2024.
Article in English | MEDLINE | ID: mdl-38776345

ABSTRACT

The systematic identification of insertion/deletion (InDel) length polymorphisms from the entire lentil genome can be used to map the quantitative trait loci (QTL) and also for the marker-assisted selection (MAS) for various linked traits. The InDels were identified by comparing the whole-genome resequencing (WGRS) data of two extreme bulks (early- and late-flowering bulk) and a parental genotype (Globe Mutant) of lentil. The bulks were made by pooling 20 extreme recombinant inbred lines (RILs) each, derived by crossing Globe Mutant (late flowering parent) with L4775 (early flowering parent). Finally, 734,716 novel InDels were identified, which is nearly one InDel per 5,096 bp of lentil genome. Furthermore, 74.94% of InDels were within the intergenic region and 99.45% displayed modifier effects. Of these, 15,732 had insertions or deletions of 20 bp or more, making them amenable to the development of PCR-based markers. An InDel marker I-SP-356.6 (chr. 3; position 356,687,623; positioned 174.5 Kb from the LcFRI gene) was identified as having a phenotypic variance explained (PVE) value of 47.7% for earliness when validated in a RIL population. Thus, I-SP-356.6 marker can be deployed in MAS to facilitate the transfer of the earliness trait to other elite late-maturing cultivars. Two InDel markers viz., I-SP-356.6 and I-SP-383.9 (chr. 3; linked to LcELF3a gene) when tested in 9 lentil genotypes differing for maturity duration, clearly distinguished three early (L4775, ILL7663, Precoz) and four late genotypes (Globe Mutant, MFX, L4602, L830). However, these InDels could not be validated in two genotypes (L4717, L4727), suggesting either absence of polymorphism and/or presence of other loci causing earliness. The identified InDel markers can act as valuable tools for MAS for the development of early maturing lentil varieties.


Subject(s)
Genome, Plant , Genotype , INDEL Mutation , Lens Plant , Quantitative Trait Loci , Lens Plant/genetics , Lens Plant/growth & development , Genetic Markers , Polymerase Chain Reaction/methods , Chromosome Mapping/methods
9.
F1000Res ; 13: 146, 2024.
Article in English | MEDLINE | ID: mdl-38779312

ABSTRACT

Background: Previous studies have linked genetics to knee osteoarthritis (OA). Angiotensin-converting enzyme (ACE) gene I/D polymorphism may cause OA. However, evidence remains inconsistent. This study examines knee OA risk and ACE gene I/D polymorphism. Methods: We searched Europe PMC, Medline, Scopus, and the Cochrane Library using keywords. Risk of bias was assessed across three domains using the Newcastle-Ottawa Scale (NOS). Inclusion criteria were: (1) a study population divided into knee OA patients and healthy controls; (2) analysis of the ACE gene I/D polymorphism; (3) a case-control or cross-sectional design. Studies of non-knee OA, with incomplete data, or without full text were excluded. The odds ratio (OR) and 95% confidence intervals (95% CI) were calculated using random-effect models. Results: Six case-control studies comprising 1,226 patients with knee OA and 1,145 healthy controls were included. Our pooled analysis revealed that a significant association between ACE gene I/D polymorphism and risk of knee OA was only seen in the dominant (DD + ID vs. II) [OR 1.69 (95% CI 1.14 - 2.50), p = 0.009, I² = 72%], and ID vs. II [OR 1.37 (95% CI 1.01 - 1.86), p = 0.04, I² = 43%] genotype models. Other genotype models, including recessive (DD vs. ID + II), alleles (D vs. I), DD vs. ID, and DD vs. II models did not show a significant association with knee OA risk. Further regression analysis revealed that ethnicity and sex may influence those relationships in several genotype models. Conclusions: The dominant and ID vs. II models of the ACE gene I/D polymorphism were significantly associated with increased knee OA risk. More research with larger samples and different ethnic groups is needed to confirm our findings. After ethnicity subgroup analysis, some genetic models in our study showed significant heterogeneity, and most studies are from Asian countries with Asian populations, with little evidence on Arabs.
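
For readers unfamiliar with the genotype models named above, the following Python sketch computes a single-study odds ratio and 95% confidence interval for the dominant model (DD + ID vs. II) with the standard log (Woolf) method. The genotype counts are hypothetical, and the sketch covers one study only; it does not reproduce the random-effects pooling used in the meta-analysis.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald CI on the log scale for a 2x2 table.

    a = cases with the risk genotype,    b = cases without it,
    c = controls with the risk genotype, d = controls without it.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the log method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts for one study.
cases = {"DD": 25, "ID": 35, "II": 40}
controls = {"DD": 15, "ID": 30, "II": 55}

# Dominant model: carriers of at least one D allele (DD + ID) versus II.
or_dom, lo_dom, hi_dom = odds_ratio_ci(
    cases["DD"] + cases["ID"], cases["II"],
    controls["DD"] + controls["ID"], controls["II"],
)
print(f"dominant model OR = {or_dom:.2f} (95% CI {lo_dom:.2f}-{hi_dom:.2f})")
```

An interval that excludes 1, as in the pooled dominant-model estimate reported above, corresponds to a statistically significant association at the 5% level.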


Subject(s)
Genetic Predisposition to Disease , Osteoarthritis, Knee , Peptidyl-Dipeptidase A , Polymorphism, Genetic , Humans , Case-Control Studies , Genetic Association Studies , INDEL Mutation , Osteoarthritis, Knee/genetics , Peptidyl-Dipeptidase A/genetics , Risk Factors
10.
BMC Genomics ; 25(1): 515, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796435

ABSTRACT

BACKGROUND: The short-read whole-genome sequencing (WGS) approach has been widely applied to investigate the genomic variation in the natural populations of many plant species. With the rapid advancements in long-read sequencing and genome assembly technologies, high-quality genome sequences are available for a group of varieties for many plant species. These genome sequences are expected to help researchers comprehensively investigate any type of genomic variants that are missed by the WGS technology. However, multiple genome alignment (MGA) tools designed by the human genome research community might be unsuitable for plant genomes. RESULTS: To fill this gap, we developed the AnchorWave-Cactus Multiple Genome Alignment (ACMGA) pipeline, which improved the alignment of repeat elements and could identify long (> 50 bp) deletions or insertions (INDELs). We conducted MGA using ACMGA and Cactus for 8 Arabidopsis (Arabidopsis thaliana) and 26 Maize (Zea mays) de novo assembled genome sequences and compared them with the previously published short-read variant calling results. MGA identified more single nucleotide variants (SNVs) and long INDELs than did previously published WGS variant callings. Additionally, ACMGA detected significantly more SNVs and long INDELs in repetitive regions and the whole genome than did Cactus. Compared with the results of Cactus, the results of ACMGA were more similar to the previously published variants called using short-read. These two MGA pipelines identified numerous multi-allelic variants that were missed by the WGS variant calling pipeline. CONCLUSIONS: Aligning de novo assembled genome sequences could identify more SNVs and INDELs than mapping short-read. ACMGA combines the advantages of AnchorWave and Cactus and offers a practical solution for plant MGA by integrating global alignment, a 2-piece-affine-gap cost strategy, and the progressive MGA algorithm.


Subject(s)
Arabidopsis , Genome, Plant , Zea mays , Arabidopsis/genetics , Zea mays/genetics , Sequence Alignment , INDEL Mutation , Genomics/methods , Polymorphism, Single Nucleotide , Whole Genome Sequencing/methods , Software
11.
Genome Biol Evol ; 16(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38735759

ABSTRACT

A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure-alpha helices, beta strands, protein bends, and protein turns-predicted by deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.


Subject(s)
INDEL Mutation , Protein Structure, Secondary , Humans , Animals , Mice , Rats , Evolution, Molecular , Proteins/genetics , Proteins/chemistry , Dogs , Selection, Genetic , Genome
12.
Genome Biol ; 25(1): 101, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.


Subject(s)
Genome , Genomics , Genomics/methods , Computational Biology , INDEL Mutation , Bias , Sequence Analysis, DNA/methods , Software , High-Throughput Nucleotide Sequencing/methods
13.
J Agric Food Chem ; 72(17): 10138-10148, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38637271

ABSTRACT

Passion fruit (Passiflora spp.) is an important fruit tree in the family Passifloraceae. The color of the fruit skin, a significant agricultural trait, is determined by the content of anthocyanin in passion fruit. However, the regulatory mechanisms behind the accumulation of anthocyanin in different passion fruit skin colors remain unclear. In this study, we identified and characterized a R2R3-MYB transcription factor, PeMYB114, which functions as a transcriptional activator in anthocyanin biosynthesis. Yeast one-hybrid system and dual-luciferase analysis showed that PeMYB114 could directly activate the expression of anthocyanin structural genes (PeCHS and PeDFR). Furthermore, a natural variation in the promoter region of PeMYB114 alters its expression. PeMYB114^purple accessions with the 224-bp insertion have a higher anthocyanin level than PeMYB114^yellow accessions with the 224-bp deletion. The findings enhance our understanding of anthocyanin accumulation in fruits and provide genetic resources for genome design for improving passion fruit quality.


Subject(s)
Anthocyanins , Fruit , Gene Expression Regulation, Plant , Passiflora , Plant Proteins , Promoter Regions, Genetic , Transcription Factors , Anthocyanins/metabolism , Anthocyanins/genetics , Passiflora/genetics , Passiflora/metabolism , Passiflora/chemistry , Fruit/metabolism , Fruit/genetics , Fruit/chemistry , Plant Proteins/genetics , Plant Proteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , INDEL Mutation
14.
BMC Biol ; 22(1): 90, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38644496

ABSTRACT

BACKGROUND: Accurate identification of genetic variants, such as point mutations and insertions/deletions (indels), is crucial for various genetic studies into epidemic tracking, population genetics, and disease diagnosis. Genetic studies into microbiomes often require processing numerous sequencing datasets, necessitating variant identifiers with high speed, accuracy, and robustness. RESULTS: We present QuickVariants, a bioinformatics tool that effectively summarizes variant information from read alignments and identifies variants. When tested on diverse bacterial sequencing data, QuickVariants demonstrates a ninefold higher median speed than bcftools, a widely used variant identifier, with higher accuracy in identifying both point mutations and indels. This accuracy extends to variant identification in virus samples, including SARS-CoV-2, particularly with significantly fewer false negative indels than bcftools. The high accuracy of QuickVariants is further demonstrated by its detection of a greater number of Omicron-specific indels (5 versus 0) and point mutations (61 versus 48-54) than bcftools in sewage metagenomes predominated by Omicron variants. Much of the reduced accuracy of bcftools was attributable to its misinterpretation of indels, often producing false negative indels and false positive point mutations at the same locations. CONCLUSIONS: We introduce QuickVariants, a fast, accurate, and robust bioinformatics tool designed for identifying genetic variants for microbial studies. QuickVariants is available at https://github.com/caozhichongchong/QuickVariants .


Subject(s)
INDEL Mutation , SARS-CoV-2 , SARS-CoV-2/genetics , Computational Biology/methods , Humans , Software , COVID-19/virology , High-Throughput Nucleotide Sequencing/methods , Point Mutation , Genetic Variation , Sequence Analysis, DNA/methods
15.
Sci Rep ; 14(1): 8165, 2024 04 08.
Article in English | MEDLINE | ID: mdl-38589653

ABSTRACT

Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to the top performers, which produced a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can further be improved by restricting the calls within high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the applications.
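
The three ranking metrics named in this abstract follow directly from counts of true positive, false positive, and false negative calls against a truth set. A minimal Python sketch with hypothetical counts (the challenge's actual scoring code is not described in the abstract):

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and F1-score from call counts against a truth set.

    tp = indels called and present in the truth set
    fp = indels called but absent from the truth set
    fn = truth-set indels that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical pipeline result: 90 correct calls, 10 spurious calls, 30 missed indels.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a pipeline with low recall (for example, one affected by capture issues) is penalized even when its precision is high.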


Subject(s)
High-Throughput Nucleotide Sequencing , INDEL Mutation , Polymorphism, Single Nucleotide
16.
BMC Genomics ; 25(1): 329, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566035

ABSTRACT

BACKGROUND: Previously, a novel multiplex system of 64 loci was constructed based on a capillary electrophoresis platform, including 59 autosomal insertion/deletions (A-InDels), two Y-chromosome InDels, two mini short tandem repeats (miniSTRs), and an Amelogenin gene. The aim of this study is to evaluate the efficiencies of this multiplex system for individual identification, paternity testing and biogeographic ancestry inference in Chinese Hezhou Han (CHH) and Hubei Tujia (CTH) groups, providing valuable insights for forensic anthropology and population genetics research. RESULTS: The cumulative values of power of discrimination (CDP) and probability of exclusion (CPE) for the 59 A-InDels and two miniSTRs were 0.99999999999999999999999999754, 0.99999905; and 0.99999999999999999999999999998, 0.99999898 in CTH and CHH groups, respectively. When the likelihood ratio thresholds were set to 1 or 10, more than 95% of the full sibling pairs could be identified from unrelated individual pairs, and the false positive rates were less than 1.2% in both CTH and CHH groups. Biogeographic ancestry inference models based on 35 populations were constructed with three algorithms: random forest, adaptive boosting and extreme gradient boosting, and then 10-fold cross-validation analyses were applied to test these three models with the average accuracies of 86.59%, 84.22% and 87.80%, respectively. In addition, we also investigated the genetic relationships between the two studied groups with 33 reference populations using population statistical methods of FST, DA, phylogenetic tree, PCA, STRUCTURE and TreeMix analyses. The present results showed that compared to other continental populations, the CTH and CHH groups had closer genetic affinities to East Asian populations. CONCLUSIONS: This novel multiplex system has high CDP and CPE in CTH and CHH groups, which can be used as a powerful tool for individual identification and paternity testing. According to various genetic analysis methods, the genetic structures of CTH and CHH groups are relatively similar to the reference East Asian populations.
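
Cumulative forensic parameters such as CDP and CPE are conventionally obtained by combining per-locus values under the assumption that loci are independent: the cumulative value is 1 minus the product of (1 - p_i) over all loci. A minimal Python sketch with hypothetical per-locus values (the abstract reports only the cumulative figures):

```python
def combined_complement(per_locus_values) -> float:
    """Product of (1 - p_i) across loci, assuming the loci are independent.

    For power of discrimination this is the combined probability that two
    unrelated individuals match at every locus; the cumulative power is
    1 minus this value.
    """
    remaining = 1.0
    for p in per_locus_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("per-locus values must lie in [0, 1]")
        remaining *= 1.0 - p
    return remaining


def cumulative_power(per_locus_values) -> float:
    """Cumulative power: 1 - prod(1 - p_i)."""
    return 1.0 - combined_complement(per_locus_values)


# Hypothetical per-locus power of discrimination for three loci.
toy_pd = [0.5, 0.5, 0.5]
print(combined_complement(toy_pd))
print(cumulative_power(toy_pd))
```

For panels with dozens of loci it is safer to report the complement: a cumulative value with more than about 16 consecutive nines, like the CDP figures above, cannot be distinguished from 1.0 in double-precision floating point.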


Subject(s)
Genetics, Population , Siblings , Humans , Phylogeny , China , INDEL Mutation , Microsatellite Repeats , Forensic Genetics/methods , Gene Frequency
17.
BMC Genomics ; 25(1): 428, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38689225

ABSTRACT

BACKGROUND: Although many studies have been done to reveal artificial selection signatures in commercial and indigenous chickens, a limited number of genes have been linked to specific traits. To identify more trait-related artificial selection signatures and genes, we re-sequenced a total of 85 individuals of five indigenous chicken breeds with distinct traits from Yunnan Province, China. RESULTS: We found 30 million non-redundant single nucleotide variants and small indels (< 50 bp) in the indigenous chickens, of which 10 million were not seen in 60 broilers, 56 layers and 35 red jungle fowls (RJFs) that we compared with. The variants in each breed are enriched in non-coding regions, while those in coding regions are largely tolerant, suggesting that most variants might affect cis-regulatory sequences. Based on 27 million bi-allelic single nucleotide polymorphisms identified in the chickens, we found numerous selective sweeps and affected genes in each indigenous chicken breed and substantially larger numbers of selective sweeps and affected genes in the broilers and layers than previously reported using a rigorous statistical model. Consistent with the locations of the variants, the vast majority (~ 98.3%) of the identified selective sweeps overlap known quantitative trait loci (QTLs). Meanwhile, 74.2% of known QTLs overlap our identified selective sweeps. We confirmed most of the previously identified trait-related genes and identified many novel ones, some of which might be related to body size and high egg production traits. Using RT-qPCR, we validated differential expression of eight genes (GHR, GHRHR, IGF2BP1, OVALX, ELF2, MGARP, NOCT, SLC25A15) that might be related to body size and high egg production traits in relevant tissues of relevant breeds. CONCLUSION: We identify 30 million single nucleotide variants and small indels in the five indigenous chicken breeds, 10 million of which are novel. We predict substantially more selective sweeps and affected genes than previously reported in both indigenous and commercial breeds. These variants and affected genes are good candidates for further experimental investigations of genotype-phenotype relationships and practical applications in chicken breeding programs.


Subject(s)
Chickens , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Selection, Genetic , Animals , Chickens/genetics , Genome , INDEL Mutation , Breeding , Phenotype , Genomics/methods
18.
BMC Genomics ; 25(1): 391, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649797

ABSTRACT

Developmental delay (DD) or intellectual disability (ID) is a very large group of early-onset disorders that affects 1-2% of children worldwide and has diverse genetic causes that should be identified. Genetic studies can elucidate the pathogenesis underlying DD/ID. In this study, whole-exome sequencing (WES) was performed on 225 Chinese children with DD/ID (208 cases were sequenced as proband-parent trios), who were classified into seven phenotype subgroups. The phenotype and genomic data of patients with DD/ID were further retrospectively analyzed. A total of 96/225 (42.67%; 95% confidence interval [CI] 36.15-49.18%) patients were found to have causative single nucleotide variants (SNVs) and small insertions/deletions (Indels) associated with DD/ID based on WES data. The diagnostic yields among the seven subgroups ranged from 31.25 to 71.43%. Three specific clinical features, hearing loss, visual loss, and facial dysmorphism, can significantly increase the diagnostic yield of WES in patients with DD/ID (P = 0.005, P = 0.005, and P = 0.039, respectively). Of note, hearing loss (odds ratio [OR] = 1.86; 95% CI = 1.00-3.46, P = 0.046) or abnormal brainstem auditory evoked potential (BAEP) (OR = 1.91, 95% CI = 1.02-3.50, P = 0.042) was independently associated with causative genetic variants in DD/ID children. Our findings enrich the variation spectrum of SNVs/Indels associated with DD/ID, highlight the value of genetic testing for DD/ID children, stress the importance of BAEP screening in DD/ID children, and help to facilitate early diagnosis, clinical management, and reproductive decisions and to improve therapeutic response to medical treatment.


Subject(s)
Developmental Disabilities , Exome Sequencing , Intellectual Disability , Child , Child, Preschool , Female , Humans , Infant , Male , Developmental Disabilities/genetics , Developmental Disabilities/diagnosis , East Asian People/genetics , INDEL Mutation , Intellectual Disability/genetics , Phenotype , Polymorphism, Single Nucleotide
19.
Physiol Genomics ; 56(6): 436-444, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38586874

ABSTRACT

This study aimed to investigate the relationship between pre- and postexercise cardiac biomarker release according to athletic status (trained vs. untrained) and to establish whether the I/D polymorphism in the angiotensin-converting enzyme (ACE) gene had an influence on cardiac biomarker release, with specific regard to the influence of training state. We determined cardiac troponin I (cTnI) and N-terminal pro-brain natriuretic peptide (NT-proBNP) in 29 trained and 27 untrained male soccer players before and after moderate-intensity continuous exercise (MICE) and high-intensity interval exercise (HIIE) running tests. Trained soccer players had higher pre (trained: 0.014 ± 0.007 ng/mL; untrained: 0.010 ± 0.005 ng/mL) and post HIIE (trained: 0.031 ± 0.008 ng/mL; untrained: 0.0179 ± 0.007 ng/mL) and MICE (trained: 0.030 ± 0.007 ng/mL; untrained: 0.018 ± 0.007 ng/mL) cTnI values than untrained subjects, but the change with exercise (ΔcTnI) was similar between groups. There was no significant difference in baseline and postexercise NT-proBNP between groups. NT-proBNP levels were elevated after both HIIE and MICE. Considering the three ACE genotypes, the mean pre-exercise cTnI values of the trained group (DD: 0.015 ± 0.008 ng/mL, ID: 0.015 ± 0.007 ng/mL, and II: 0.014 ± 0.008 ng/mL) and their untrained counterparts (DD: 0.010 ± 0.004 ng/mL, ID: 0.011 ± 0.004 ng/mL, and II: 0.010 ± 0.006 ng/mL) did not show any significant difference. To sum up, a noticeable difference in baseline cTnI was observed, which was related to athletic status but not ACE genotypes. Neither athletic status nor ACE genotypes seemed to affect the changes in cardiac biomarkers in response to HIIE and MICE, indicating that the ACE gene does not play a significant role in the release of exercise-induced cardiac biomarkers indicative of cardiac damage in Iranian soccer players. NEW & NOTEWORTHY: Our study investigated the impact of athletic status and angiotensin-converting enzyme (ACE) gene I/D polymorphism on cardiac biomarkers in soccer players. Trained players showed higher baseline cardiac troponin I (cTnI) levels, whereas postexercise ΔcTnI remained consistent across groups. N-terminal pro-brain natriuretic peptide increased after exercise in both groups, staying within normal limits. ACE genotypes did not significantly affect pre-exercise cTnI. Overall, athletic status influences baseline cTnI, but neither it nor ACE genotypes significantly impact exercise-induced cardiac biomarker responses in this population.


Subject(s)
Biomarkers , Exercise , Natriuretic Peptide, Brain , Peptide Fragments , Peptidyl-Dipeptidase A , Polymorphism, Genetic , Troponin I , Male , Humans , Peptidyl-Dipeptidase A/genetics , Biomarkers/blood , Natriuretic Peptide, Brain/blood , Natriuretic Peptide, Brain/genetics , Troponin I/blood , Troponin I/genetics , Peptide Fragments/blood , Exercise/physiology , Young Adult , Adult , High-Intensity Interval Training/methods , Soccer/physiology , INDEL Mutation/genetics , Heart/physiology
20.
Mol Ecol ; 33(11): e17364, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38651830

ABSTRACT

Despite receiving significant recent attention, the relevance of structural variation (SV) in driving phenotypic diversity remains understudied, although recent advances in long-read sequencing, bioinformatics and pangenomic approaches have enhanced SV detection. We review the role of SVs in shaping phenotypes in avian model systems, and identify some general patterns in SV type, length and their associated traits. We found that most of the avian SVs identified so far are short indels in chickens, which are frequently associated with changes in body weight and plumage colouration. Overall, we found that relatively short SVs are more frequently detected, likely due to a combination of their prevalence compared to large SVs and a detection bias stemming primarily from the widespread use of short-read sequencing and associated analytical methods. SVs most commonly involve non-coding regions, especially introns, and when patterns of inheritance were reported, SVs were associated primarily with dominant discrete traits. We summarise several examples of phenotypic convergence across different species, mediated by different SVs in the same or different genes, and of different types of changes in the same gene that can lead to various phenotypes. Complex rearrangements and supergenes, which can simultaneously affect and link several genes, tend to have pleiotropic phenotypic effects. Additionally, SVs commonly co-occur with single-nucleotide polymorphisms, highlighting the need to consider all types of genetic changes to understand the basis of phenotypic traits. We end by summarising expectations for when long-read technologies become commonly implemented in non-model birds, likely leading to an increase in SV discovery and characterisation. The growing interest in this subject suggests an increase in our understanding of the phenotypic effects of SVs in upcoming years.


Subject(s)
Chickens , Phenotype , Animals , Chickens/genetics , Birds/genetics , Genomic Structural Variation , INDEL Mutation