Results 1 - 20 of 885
1.
Orphanet J Rare Dis ; 19(1): 209, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773661

ABSTRACT

BACKGROUND: Marfan syndrome (MFS) is an autosomal dominant connective tissue disease with wide clinical heterogeneity, and mainly caused by pathogenic variants in fibrillin-1 (FBN1). METHODS: A Chinese 4-generation MFS pedigree with 16 family members was recruited and exome sequencing (ES) was performed in the proband. Transcript analysis (patient RNA and minigene assays) and in silico structural analysis were used to determine the pathogenicity of the variant. In addition, germline mosaicism in family member (Ι:1) was assessed using quantitative fluorescent polymerase chain reaction (QF-PCR) and short tandem repeat PCR (STR) analyses. RESULTS: Two cis-compound benign intronic variants of FBN1 (c.3464-4A>G and c.3464-5G>A) were identified in the proband by ES. As a compound variant, c.3464-5_3464-4delGAinsAG was found to be pathogenic and co-segregated with MFS. RNA studies indicated that aberrant transcripts were found only in patients and mutant-type clones. The variant c.3464-5_3464-4delGAinsAG caused erroneous integration of a 3 bp sequence into intron 28 and resulted in the insertion of one amino acid in the protein sequence (p.Ile1154_Asp1155insAla). Structural analyses suggested that p.Ile1154_Asp1155insAla affected the protein's secondary structure by interfering with one disulfide bond between Cys1140 and Cys1153 and causing the extension of an anti-parallel β sheet in the calcium-binding epidermal growth factor-like (cbEGF)13 domain. In addition, the asymptomatic family member Ι:1 was deduced to be a gonadal mosaic as assessed by inconsistent results of sequencing and STR analysis. CONCLUSIONS: To our knowledge, FBN1 c.3464-5_3464-4delGAinsAG is the first identified pathogenic intronic indel variant affecting non-canonical splice sites in this gene. Our study reinforces the importance of assessing the pathogenic role of intronic variants at the mRNA level, with structural analysis, and the occurrence of mosaicism.


Subject(s)
Fibrillin-1 , Introns , Marfan Syndrome , Mosaicism , Pedigree , Humans , Fibrillin-1/genetics , Marfan Syndrome/genetics , Marfan Syndrome/pathology , Female , Male , Adult , Introns/genetics , INDEL Mutation/genetics , Middle Aged , Adipokines
2.
Physiol Genomics ; 56(6): 436-444, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38586874

ABSTRACT

This study aimed to investigate the relationship between pre- and postexercise cardiac biomarker release according to athletic status (trained vs. untrained) and to establish whether the I/D polymorphism in the angiotensin-converting enzyme (ACE) gene had an influence on cardiac biomarkers release with specific regard on the influence of the training state. We determined cardiac troponin I (cTnI) and N-terminal pro-brain natriuretic peptide (NT-proBNP) in 29 trained and 27 untrained male soccer players before and after moderate-intensity continuous exercise (MICE) and high-intensity interval exercise (HIIE) running tests. Trained soccer players had higher pre (trained: 0.014 ± 0.007 ng/mL; untrained: 0.010 ± 0.005 ng/mL) and post HIIE (trained: 0.031 ± 0.008 ng/mL; untrained: 0.0179 ± 0.007) and MICE (trained: 0.030 ± 0.007 ng/mL; untrained: 0.018 ± 0.007) cTnI values than untrained subjects, but the change with exercise (ΔcTnI) was similar between groups. There was no significant difference in baseline and postexercise NT-proBNP between groups. NT-proBNP levels were elevated after both HIIE and MICE. Considering three ACE genotypes, the mean pre exercise cTnI values of the trained group (DD: 0.015 ± 0.008 ng/mL, ID: 0.015 ± 0.007 ng/mL, and II: 0.014 ± 0.008 ng/mL) and their untrained counterparts (DD: 0.010 ± 0.004 ng/mL, ID: 0.011 ± 0.004 ng/mL, and II: 0.010 ± 0.006 ng/mL) did not show any significant difference. To sum up, noticeable difference in baseline cTnI was observed, which was related to athletic status but not ACE genotypes. 
Neither athletic status nor ACE genotypes seemed to affect the changes in cardiac biomarkers in response to HIIE and MICE, indicating that the ACE gene does not play a significant role in the release of exercise-induced cardiac biomarkers indicative of cardiac damage in Iranian soccer players. NEW & NOTEWORTHY: Our study investigated the impact of athletic status and angiotensin-converting enzyme (ACE) gene I/D polymorphism on cardiac biomarkers in soccer players. Trained players showed higher baseline cardiac troponin I (cTnI) levels, whereas postexercise ΔcTnI remained consistent across groups. N-terminal pro-brain natriuretic peptide increased after exercise in both groups, staying within normal limits. ACE genotypes did not significantly affect pre-exercise cTnI. Overall, athletic status influences baseline cTnI, but neither it nor ACE genotypes significantly impact exercise-induced cardiac biomarker responses in this population.


Subject(s)
Biomarkers , Exercise , Natriuretic Peptide, Brain , Peptide Fragments , Peptidyl-Dipeptidase A , Polymorphism, Genetic , Troponin I , Male , Humans , Peptidyl-Dipeptidase A/genetics , Biomarkers/blood , Natriuretic Peptide, Brain/blood , Natriuretic Peptide, Brain/genetics , Troponin I/blood , Troponin I/genetics , Peptide Fragments/blood , Exercise/physiology , Young Adult , Adult , High-Intensity Interval Training/methods , Soccer/physiology , INDEL Mutation/genetics , Heart/physiology
3.
Anim Biotechnol ; 35(1): 2337751, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38597900

ABSTRACT

The economic efficiency of sheep breeding, aiming to enhance productivity, is a focal point for improvement of sheep breeding. Recent studies highlight the involvement of the Early Region 2 Binding Factor transcription factor 8 (E2F8) gene in female reproduction. Our group's recent genome-wide association study (GWAS) emphasizes the potential impact of the E2F8 gene on prolificacy traits in Australian White sheep (AUW). Herein, the purpose of this study was to assess the correlation of the E2F8 gene with litter size in the AUW sheep breed. This work encompassed 659 AUW sheep, subject to genotyping through PCR-based genotyping technology. Furthermore, the results of PCR-based genotyping showed significant associations between the P1-del-32bp InDel and the fourth- and fifth-parity litter size in AUW sheep; the litter size of those with genotype ID was superior to that of those with DD and II genotypes. Thus, these results indicate that the P1-del-32bp InDel within the E2F8 gene can be useful in marker-assisted selection (MAS) in sheep.


Subject(s)
Genome-Wide Association Study , INDEL Mutation , Female , Animals , Sheep/genetics , Pregnancy , Australia , Litter Size/genetics , Genotype , INDEL Mutation/genetics
4.
CRISPR J ; 7(1): 29-40, 2024 02.
Article in English | MEDLINE | ID: mdl-38353621

ABSTRACT

The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been widely used to create animal models for biomedical and agricultural use owing to its low cost and easy handling. However, the occurrence of erroneous cleavage (off-targeting) may raise certain concerns for the practical application of the CRISPR-Cas9 system. In this study, we created a melanocortin 1 receptor (MC1R)-edited pig model through somatic cell nuclear transfer (SCNT) by using porcine kidney cells modified by the CRISPR-Cas9 system. We then carried out whole-genome sequencing of two MC1R-edited pigs and two cloned wild-type siblings, together with the donor cells, to assess the genome-wide presence of single-nucleotide variants and small insertions and deletions (indels) and found only one candidate off-target indel in both MC1R-edited pigs. In summary, our study indicates that the minimal off-targeting effect induced by CRISPR-Cas9 may not be a major concern in gene-edited pigs created by SCNT.


Subject(s)
CRISPR-Cas Systems , Receptor, Melanocortin, Type 1 , Animals , Swine/genetics , Receptor, Melanocortin, Type 1/genetics , CRISPR-Cas Systems/genetics , Gene Editing , Mutation , INDEL Mutation/genetics
5.
Cells ; 13(3)2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38334653

ABSTRACT

Successful genome editing depends on the cleavage efficiency of programmable nucleases (PNs) such as the CRISPR-Cas system. Various methods have been developed to assess the efficiency of PNs, most of which estimate the occurrence of indels caused by PN-induced double-strand breaks. In these methods, PN genomic target sites are amplified through PCR, and the resulting PCR products are subsequently analyzed using Sanger sequencing, high-throughput sequencing, or mismatch detection assays. Among these methods, Sanger sequencing of PCR products followed by indel analysis using online web tools has gained popularity due to its user-friendly nature. This approach estimates indel frequencies by computationally analyzing sequencing trace data. However, the accuracy of these computational tools remains uncertain. In this study, we compared the performance of four web tools, TIDE, ICE, DECODR, and SeqScreener, using artificial sequencing templates with predetermined indels. Our results demonstrated that these tools were able to estimate indel frequency with acceptable accuracy when the indels were simple and contained only a few base changes. However, the estimated values became more variable among the tools when the sequencing templates contained more complex indels or knock-in sequences. Moreover, although these tools effectively estimated the net indel sizes, their capability to deconvolute indel sequences exhibited variability with certain limitations. These findings underscore the importance of judiciously selecting and using an appropriate tool with caution, depending on the type of genome editing being performed.


Subject(s)
CRISPR-Cas Systems , Gene Editing , Gene Editing/methods , CRISPR-Cas Systems/genetics , INDEL Mutation/genetics , Genome/genetics , Genomics
6.
Nature ; 627(8004): 586-593, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38355797

ABSTRACT

Over half of hepatocellular carcinoma (HCC) cases diagnosed worldwide are in China [1-3]. However, whole-genome analysis of hepatitis B virus (HBV)-associated HCC in Chinese individuals is limited [4-8], with current analyses of HCC mainly from non-HBV-enriched populations [9,10]. Here we initiated the Chinese Liver Cancer Atlas (CLCA) project and performed deep whole-genome sequencing (average depth, 120×) of 494 HCC tumours. We identified 6 coding and 28 non-coding previously undescribed driver candidates. Five previously undescribed mutational signatures were found, including aristolochic-acid-associated indel and doublet base signatures, and a single-base-substitution signature that we termed SBS_H8. Pentanucleotide context analysis and experimental validation confirmed that SBS_H8 was distinct to the aristolochic-acid-associated SBS22. Notably, HBV integrations could take the form of extrachromosomal circular DNA, resulting in elevated copy numbers and gene expression. Our high-depth data also enabled us to characterize subclonal clustered alterations, including chromothripsis, chromoplexy and kataegis, suggesting that these catastrophic events could also occur in late stages of hepatocarcinogenesis. Pathway analysis of all classes of alterations further linked non-coding mutations to dysregulation of liver metabolism. Finally, we performed in vitro and in vivo assays to show that fibrinogen alpha chain (FGA), determined as both a candidate coding and non-coding driver, regulates HCC progression and metastasis. Our CLCA study depicts a detailed genomic landscape and evolutionary history of HCC in Chinese individuals, providing important clinical implications.


Subject(s)
Carcinoma, Hepatocellular , Genome, Human , High-Throughput Nucleotide Sequencing , Liver Neoplasms , Mutation , Whole Genome Sequencing , Humans , Aristolochic Acids/metabolism , Carcinogenesis , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/virology , China , Chromothripsis , Disease Progression , DNA, Circular/genetics , East Asian People/genetics , Evolution, Molecular , Genome, Human/genetics , Hepatitis B virus/genetics , INDEL Mutation/genetics , Liver/metabolism , Liver Neoplasms/genetics , Liver Neoplasms/virology , Mutation/genetics , Neoplasm Metastasis/genetics , Open Reading Frames/genetics , Reproducibility of Results
7.
Nature ; 624(7992): 602-610, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38093003

ABSTRACT

Indigenous Australians harbour rich and unique genomic diversity. However, Aboriginal and Torres Strait Islander ancestries are historically under-represented in genomics research and almost completely missing from reference datasets [1-3]. Addressing this representation gap is critical, both to advance our understanding of global human genomic diversity and as a prerequisite for ensuring equitable outcomes in genomic medicine. Here we apply population-scale whole-genome long-read sequencing [4] to profile genomic structural variation across four remote Indigenous communities. We uncover an abundance of large insertion-deletion variants (20-49 bp; n = 136,797), structural variants (50 bp-50 kb; n = 159,912) and regions of variable copy number (>50 kb; n = 156). The majority of variants are composed of tandem repeat or interspersed mobile element sequences (up to 90%) and have not been previously annotated (up to 62%). A large fraction of structural variants appear to be exclusive to Indigenous Australians (12% lower-bound estimate) and most of these are found in only a single community, underscoring the need for broad and deep sampling to achieve a comprehensive catalogue of genomic structural variation across the Australian continent. Finally, we explore short tandem repeats throughout the genome to characterize allelic diversity at 50 known disease loci [5], uncover hundreds of novel repeat expansion sites within protein-coding genes, and identify unique patterns of diversity and constraint among short tandem repeat sequences. Our study sheds new light on the dimensions and dynamics of genomic structural variation within and beyond Australia.


Subject(s)
Australian Aboriginal and Torres Strait Islander Peoples , Genome, Human , Genomic Structural Variation , Humans , Alleles , Australia/ethnology , Australian Aboriginal and Torres Strait Islander Peoples/genetics , Datasets as Topic , DNA Copy Number Variations/genetics , Genetic Loci/genetics , Genetics, Medical , Genomic Structural Variation/genetics , Genomics , INDEL Mutation/genetics , Interspersed Repetitive Sequences/genetics , Microsatellite Repeats/genetics , Genome, Human/genetics
8.
Article in English | MEDLINE | ID: mdl-38058963

ABSTRACT

Introduction: Research shows the correlation between angiotensin-converting enzyme (ACE) deletion and insertion (D/I) polymorphism and COVID-19 risk; yet, conclusive evidence is still lacking. Thus, a meta-analysis of relevant articles was performed to more accurately estimate the relationship of ACE I/D polymorphism with the risk of COVID-19. Material and Methods: Relevant literature from the PubMed database was systematically reviewed, and odds ratios (ORs) and associated 95% confidence intervals (CIs) were measured. Additionally, the metapackage from Stata version 15.0 was used for statistical analysis. Results: The meta-analysis eventually contained 8 studies, including 1362 COVID-19 cases and 4312 controls. Based on the data, the ACE I/D polymorphism did not show an association with COVID-19 risk (D vs. I: OR = 1.25, 95% CI = 0.96-1.64; DD vs. II: OR = 1.89, 95% CI = 0.95-3.74; DI vs. II: OR = 1.75, 95% CI = 0.92-3.31; dominant model: OR = 1.88, 95% CI = 0.99-3.53; and recessive model: OR = 1.24, 95% CI = 0.81-1.90). Further, subgroup analyses stratified based on case proved that the ACE D allele demonstrated an association with increasing risk of COVID-19 severity (D vs. I: OR = 1.64, 95% CI = 1.01-2.66; DD vs. II: OR = 4.62, 95% CI = 2.57-8.30; DI vs. II: OR = 3.07, 95% CI = 1.75-5.38; dominant model: OR = 3.74, 95% CI = 2.15-6.50; and recessive model: OR = 1.28, 95% CI = 0.46-3.51). Conclusions: The ACE D allele was clearly related to an enhanced risk of COVID-19 severity. Hence, it is imperative to take into account the influence of genetic factors during the development of future vaccines.


Subject(s)
COVID-19 , Genetic Predisposition to Disease , Polymorphism, Genetic , Humans , COVID-19/genetics , INDEL Mutation/genetics , Peptidyl-Dipeptidase A/genetics , Risk Factors
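The odds ratios and 95% confidence intervals quoted in the abstract above can be read with a small sketch. It computes a single-study odds ratio with a Woolf (log-scale) interval from a 2x2 table and checks whether an interval excludes 1. This is not the pooled meta-analysis the authors ran in Stata; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf 95% CI for one 2x2 table (hypothetical helper).

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def excludes_one(lower, upper):
    """True when the interval does not contain OR = 1 (no-effect value)."""
    return lower > 1.0 or upper < 1.0

# Intervals reported in the abstract (D vs. I):
print(excludes_one(0.96, 1.64))  # overall COVID-19 risk
print(excludes_one(1.01, 2.66))  # severity subgroup
```

An interval that contains 1 is conventionally read as no significant association, which matches how the abstract interprets the overall-risk and severity results.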
9.
PLoS Comput Biol ; 19(8): e1010727, 2023 08.
Article in English | MEDLINE | ID: mdl-37566612

ABSTRACT

The sequence contexts of genomic variants play important roles in understanding biological significances of variants and potential sequencing related variant calling issues. However, methods for assessing the diverse sequence contexts of genomic variants such as tandem repeats and unambiguous annotations have been limited. Herein, we describe the Variant Sequence Context Annotation Tool (VarSCAT) for annotating the sequence contexts of genomic variants, including breakpoint ambiguities, flanking bases of variants, wildtype/mutated DNA sequences, variant nomenclatures, distances between adjacent variants, tandem repeat regions, and custom annotation with user customizable options. Our analyses demonstrate that VarSCAT is more versatile and customizable than the currently available methods or strategies for annotating variants in short tandem repeat (STR) regions or insertions and deletions (indels) with breakpoint ambiguity. Variant sequence context annotations of high-confidence human variant sets with VarSCAT revealed that more than 75% of all human individual germline and clinically relevant indels have breakpoint ambiguities. Moreover, we illustrate that more than 80% of human individual germline small variants in STR regions are indels and that the sizes of these indels correlated with STR motif sizes. VarSCAT is available from https://github.com/elolab/VarSCAT.


Subject(s)
Genomics , INDEL Mutation , Humans , INDEL Mutation/genetics , Genomics/methods , Software , High-Throughput Nucleotide Sequencing
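Breakpoint ambiguity, as discussed in the abstract above, arises when the same deletion can be placed at several positions in a repeat and still yield an identical sequence. The sketch below is a minimal illustration of that idea for deletions only; it is not VarSCAT's implementation, and the function name and the toy reference sequence are assumptions.

```python
def deletion_ambiguity(ref, start, length):
    """Return (leftmost, rightmost) 0-based start positions at which
    deleting `length` bases from `ref` gives the same resulting sequence
    as deleting ref[start:start + length]."""
    left = start
    # Shifting left by one is equivalent when the base entering the
    # deleted window equals the base leaving it.
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    right = start
    # Same test in the other direction.
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return left, right

ref = "GACACACT"  # toy sequence with an AC repeat
left, right = deletion_ambiguity(ref, 3, 2)
print(left, right)  # the deletion is ambiguous whenever left != right
```

A variant whose leftmost and rightmost positions differ can be reported at different coordinates by different tools, which is why left- or right-alignment conventions matter when comparing call sets.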
11.
J Cell Sci ; 136(6)2023 03 15.
Article in English | MEDLINE | ID: mdl-36762651

ABSTRACT

The advance of CRISPR/Cas9 technology has made it easy to generate gene knockout cell lines by introducing insertion-deletion mutations (indels) at the target site via the error-prone non-homologous end joining repair system. Frameshift-promoting indels can disrupt gene functions by generation of a premature stop codon. However, there is growing evidence that targeted genes are not always knocked out by the indel-based gene disruption. Here, we established a pipeline of CRISPR-del, which induces a large chromosomal deletion by cutting two different target sites, to perform 'complete' gene knockout efficiently in human diploid cells. Quantitative analyses show that the frequency of gene deletion with this approach is much higher than that of conventional CRISPR-del methods. The lengths of the deleted genomic regions demonstrated in this study are longer than those of 95% of the human protein-coding genes. Furthermore, the pipeline enabled the generation of a model cell line having a bi-allelic cancer-associated chromosomal deletion. Overall, these data lead us to propose that the CRISPR-del pipeline is an efficient and practical approach for producing 'complete' gene knockout cell lines in human diploid cells.


Subject(s)
CRISPR-Cas Systems , Diploidy , Humans , Gene Knockout Techniques , CRISPR-Cas Systems/genetics , INDEL Mutation/genetics , Cell Line , Gene Editing/methods
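The distinction the abstract above draws between frameshift-promoting indels and other indels comes down to whether the net length change is a multiple of three. A minimal sketch, assuming alleles are given as plain reference and alternate strings within coding sequence (the function name is hypothetical):

```python
def is_frameshift(ref_allele, alt_allele):
    """True when the net length change is not a multiple of 3,
    so the downstream reading frame is shifted."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0

print(is_frameshift("A", "AG"))    # 1-bp insertion
print(is_frameshift("ACGT", "A"))  # 3-bp deletion
```

In-frame indels leave the downstream codons intact, which is one reason an indel at the target site does not always abolish gene function.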
12.
Anim Biotechnol ; 34(7): 2175-2182, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35622416

ABSTRACT

RAR related orphan receptor A (RORA), which encodes the retinoid-acid-related orphan receptor alpha (RORα), is a clock gene found in skeletal muscle. Several studies have shown that RORα plays an important role in bone formation, suggesting that RORA gene may take part in the regulation of growth and development. The purpose of this research is to study the insertion/deletion (indel) variations of the RORA gene and investigate the relationship with the growth traits of Shaanbei white cashmere (SBWC) goats. Herein, the current study identified that the P4-11-bp and P11-28-bp deletion sites are polymorphic among 12 pairs of primers within the RORA gene in the SBWC goats (n = 641). Moreover, the P11-28-bp deletion locus was significantly related to the body height (p = 0.046), height at hip cross (p = 0.012), and body length (p = 0.003). Both of P4-11-bp and P11-28-bp indels showed the moderate genetic diversity (0.25

Subject(s)
Goats , INDEL Mutation , Pregnancy , Female , Animals , Litter Size/genetics , Goats/physiology , INDEL Mutation/genetics , Phenotype
13.
IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 1628-1640, 2023.
Article in English | MEDLINE | ID: mdl-36260590

ABSTRACT

Recent works on genome rearrangements have shown that incorporating intergenic region information along with gene order in models provides better estimations for the rearrangement distance than using gene order alone. The reversal distance is one of the main problems in genome rearrangements. It has a polynomial time algorithm when only gene order is used to model genomes, assuming that repeated genes do not exist and that gene orientation is known, even when the genomes have distinct gene sets. The reversal distance is NP-hard and has a 2-approximation algorithm when incorporating intergenic regions. However, the problem has only been studied assuming genomes with the same set of genes. In this work, we consider the variation that incorporates intergenic regions and that allows genomes to have distinct sets of genes, a scenario that leads us to include indels operations (insertions and deletions). We present a 2.5-approximation algorithm using the labeled intergenic breakpoint graph, which is based on the well-known breakpoint graph structure. We also present an experimental analysis of the proposed algorithm using simulated data, which showed that the practical approximation factor is considerably less than 2.5. Furthermore, we used the algorithm in real genomes to construct a phylogenetic tree.


Subject(s)
Genome , Models, Genetic , Phylogeny , INDEL Mutation/genetics , Gene Rearrangement , Algorithms
14.
Syst Biol ; 72(2): 307-318, 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-35866991

ABSTRACT

Modern phylogenetic methods allow inference of ancestral molecular sequences given an alignment and phylogeny relating present-day sequences. This provides insight into the evolutionary history of molecules, helping to understand gene function and to study biological processes such as adaptation and convergent evolution across a variety of applications. Here, we propose a dynamic programming algorithm for fast joint likelihood-based reconstruction of ancestral sequences under the Poisson Indel Process (PIP). Unlike previous approaches, our method, named ARPIP, enables the reconstruction with insertions and deletions based on an explicit indel model. Consequently, inferred indel events have an explicit biological interpretation. Likelihood computation is achieved in linear time with respect to the number of sequences. Our method consists of two steps, namely finding the most probable indel points and reconstructing ancestral sequences. First, we find the most likely indel points and prune the phylogeny to reflect the insertion and deletion events per site. Second, we infer the ancestral states on the pruned subtree in a manner similar to FastML. We applied ARPIP (Ancestral Reconstruction under PIP) on simulated data sets and on real data from the Betacoronavirus genus. ARPIP reconstructs both the indel events and substitutions with a high degree of accuracy. Our method fares well when compared to established state-of-the-art methods such as FastML and PAML. Moreover, the method can be extended to explore both optimal and suboptimal reconstructions, include rate heterogeneity through time and more. We believe it will expand the range of novel applications of ancestral sequence reconstruction. [Ancestral sequences; dynamic programming; evolutionary stochastic process; indel; joint ancestral sequence reconstruction; maximum likelihood; Poisson Indel Process; phylogeny; SARS-CoV.].


Subject(s)
Algorithms , INDEL Mutation , Phylogeny , Likelihood Functions , Sequence Alignment , INDEL Mutation/genetics , Evolution, Molecular
15.
Anim Biotechnol ; 34(7): 2674-2683, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35980330

ABSTRACT

Calsyntenin-2 (CLSTN2) is involved in cell proliferation, differentiation, cell death, tumorigenesis, and follicular expression. Although CLSTN2 has been identified as a potential candidate gene for sheep prolificacy, no studies have been done on its effect on goat prolificacy. The purpose of this study was to identify mRNA expression and genetic variation within goat CLSTN2, and its association with prolificacy. Herein, we uncovered significant differences in mRNA levels of the CLSTN2 gene in different tissues in female goats (p < 0.01), including ovary tissue. Nine putative indels were designed to investigate their correlation to litter size, but only one 16-bp deletion was discovered in female Shaanbei white cashmere goats (n = 902). We discovered that a 16-bp deletion within the CLSTN2 gene was significantly correlated with first-born litter size (p = 0.0001). As shown by the chi-squared test, the genotypic II of single-lambs and multi-lambs was dramatically higher than with genotype ID (p = 0.005). Our findings suggest that indel within the CLSTN2 gene is a candidate gene affecting prolificacy in goats and may be used for Marker Assisted Selection (MAS) in goats.


Subject(s)
Goats , INDEL Mutation , Pregnancy , Animals , Female , Sheep/genetics , Litter Size/genetics , Goats/genetics , Genotype , INDEL Mutation/genetics , RNA, Messenger
16.
PLoS Comput Biol ; 18(10): e1010633, 2022 10.
Article in English | MEDLINE | ID: mdl-36279274

ABSTRACT

Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.


Subject(s)
Evolution, Molecular , INDEL Mutation , INDEL Mutation/genetics , Proteins/genetics , Biological Evolution , Phylogeny
17.
Plant Cell Rep ; 41(12): 2279-2292, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36209436

ABSTRACT

KEY MESSAGE: Genome resequencing uncovers genome-wide DNA polymorphisms that are useful for the development of high-density InDel markers between two barley cultivars. Discovering genomic variations and developing genetic markers are crucial for genetics studies and molecular breeding in cereal crops. Although InDels (insertions and deletions) have become popular because of their abundance and ease of detection, discovery of genome-wide DNA polymorphisms and development of InDel markers in barley have lagged behind other cereal crops such as rice, maize and wheat. In this study, we re-sequenced two barley cultivars, Golden Promise (GP, a classic British spring barley variety) and Hua30 (a Chinese spring barley variety), and mapped clean reads to the reference Morex genome, and identified in total 13,933,145 single nucleotide polymorphisms (SNPs) and 1,240,456 InDels for GP with Morex, 11,297,100 SNPs and 781,687 InDels for Hua30 with Morex, and 13,742,399 SNPs and 1,191,597 InDels for GP with Hua30. We further characterized distinct types, chromosomal distribution patterns, genome location, functional effect, and other features of these DNA polymorphisms. Additionally, we revealed the functional relevance of these identified SNPs/InDels regarding different flowering times between Hua30 and GP within 17 flowering time genes. Furthermore, we developed a series of InDel markers and validated them experimentally in 43 barley core accessions, respectively. Finally, we rebuilt population structure and phylogenetic tree of these 43 barley core accessions. Collectively, all of these genetic resources will facilitate not only the basic research but also applied research in barley.


Subject(s)
Hordeum , Hordeum/genetics , Genome, Plant/genetics , Phylogeny , INDEL Mutation/genetics , Polymorphism, Single Nucleotide/genetics , DNA
18.
Nat Genet ; 54(10): 1564-1571, 2022 10.
Article in English | MEDLINE | ID: mdl-36163278

ABSTRACT

Accurate somatic mutation detection from single-cell DNA sequencing is challenging due to amplification-related artifacts. To reduce this artifact burden, an improved amplification technique, primary template-directed amplification (PTA), was recently introduced. We analyzed whole-genome sequencing data from 52 PTA-amplified single neurons using SCAN2, a new genotyper we developed to leverage mutation signatures and allele balance in identifying somatic single-nucleotide variants (SNVs) and small insertions and deletions (indels) in PTA data. Our analysis confirms an increase in nonclonal somatic mutation in single neurons with age, but revises the estimated rate of this accumulation to 16 SNVs per year. We also identify artifacts in other amplification methods. Most importantly, we show that somatic indels increase by at least three per year per neuron and are enriched in functional regions of the genome such as enhancers and promoters. Our data suggest that indels in gene-regulatory elements have a considerable effect on genome integrity in human neurons.


Subject(s)
High-Throughput Nucleotide Sequencing , Point Mutation , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation/genetics , Neurons , Nucleotides , Polymorphism, Single Nucleotide/genetics , Single-Cell Analysis
19.
PLoS Comput Biol ; 18(8): e1010303, 2022 08.
Article in English | MEDLINE | ID: mdl-35939516

ABSTRACT

Most methods for phylogenetic tree reconstruction are based on sequence alignments; they infer phylogenies from substitutions that may have occurred at the aligned sequence positions. Gaps in alignments are usually not employed as phylogenetic signal. In this paper, we explore an alignment-free approach that uses insertions and deletions (indels) as an additional source of information for phylogeny inference. For a set of four or more input sequences, we generate so-called quartet blocks of four putative homologous segments each. For pairs of such quartet blocks involving the same four sequences, we compare the distances between the two blocks in these sequences, to obtain hints about indels that may have happened between the blocks since the respective four sequences have evolved from their last common ancestor. A prototype implementation that we call Gap-SpaM is presented to infer phylogenetic trees from these data, using a quartet-tree approach or, alternatively, under the maximum-parsimony paradigm. This approach should not be regarded as an alternative to established methods, but rather as a complementary source of phylogenetic information. Interestingly, however, our software is able to produce phylogenetic trees from putative indels alone that are comparable to trees obtained with existing alignment-free methods.


Subject(s)
INDEL Mutation , Software , Algorithms , INDEL Mutation/genetics , Phylogeny , Sequence Alignment
20.
Electrophoresis ; 43(18-19): 1871-1881, 2022 10.
Article in English | MEDLINE | ID: mdl-35859229

ABSTRACT

Marker sets based on insertion/deletion polymorphisms (InDels) combine the characteristics of both short tandem repeats (STRs) and single nucleotide polymorphisms and have served as effective complementary or stand-alone systems for human identification in forensics. We developed a novel multiplex amplification detection system, designated the AGCU InDel 60 kit, containing 57 autosomal InDels, 2 Y-chromosomal InDels, and the amelogenin locus and validated the kit in a series of studies, which included tests of the PCR conditions; tests for sensitivity, species specificity, reproducibility, stability, and mock case samples; degradation studies; and a population study. The results indicated that the AGCU InDel 60 kit was accurate, specific, reproducible, stable, and robust. Complete DNA profiles were obtained even with 125 pg of human DNA. In tests of artificially degraded samples, we found that the number of alleles detected by the validated kit was considerably greater than that detected by the STR-based AGCU 21+1 kit, even as the degree of degradation increased. Additionally, 564 unrelated individuals from three Han groups were investigated using this novel system, and the values of combined power of discrimination and combined power of exclusion were not less than 1 - 4.9026 × 10⁻²⁴ and 1 - 3.1123 × 10⁻⁵, respectively. Thus, the results indicated that the novel kit was more powerful than the previous version of the InDel kit (the AGCU InDel 50 kit). Our results suggest that the AGCU InDel 60 kit can serve as an efficient tool for human forensics and a supplementary kit for population genetics research.


Subject(s)
DNA Fingerprinting , INDEL Mutation , Amelogenin/genetics , DNA , DNA Fingerprinting/methods , Forensic Genetics , Gene Frequency , Genetics, Population , Humans , INDEL Mutation/genetics , Microsatellite Repeats/genetics , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results
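Combined power of discrimination (and, analogously, combined power of exclusion) for independent loci is conventionally obtained as 1 minus the product of the per-locus complements, which is why the abstract above reports values in the form "1 - x". The sketch below returns that complement x directly, since 1 - x would round to 1.0 in double precision for values this small. The per-locus value used here is hypothetical and is not taken from the study.

```python
from math import prod

def combined_complement(per_locus_power):
    """Product of (1 - p_i) over independent loci.

    The combined power is 1 minus the returned value; returning the
    complement avoids losing it to floating-point rounding.
    """
    return prod(1.0 - p for p in per_locus_power)

# Hypothetical example: 57 autosomal loci, each with power 0.6
x = combined_complement([0.6] * 57)
print(f"combined power = 1 - {x:.4e}")
```

Treating the loci as independent is itself an assumption; linked markers would need a different treatment.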