1.
Genome Biol ; 24(1): 270, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38012772

ABSTRACT

BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.
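
The "built-in truth" of a parents-plus-monozygotic-twins design can be illustrated with a simple consistency check. The following is a minimal sketch, not the authors' pipeline: it assumes diploid autosomal genotypes given as pairs of allele codes, ignores de novo mutations and phasing, and uses hypothetical function names.

```python
from itertools import product


def is_quartet_consistent(father, mother, twin1, twin2):
    """Return True if one site's genotypes fit the family design.

    Each genotype is a pair of allele codes, e.g. (0, 1) for a
    heterozygous call. Assumption: diploid autosomal site.
    """
    # Monozygotic twins are expected to carry identical genotypes.
    if sorted(twin1) != sorted(twin2):
        return False
    # The shared genotype must combine one paternal and one maternal allele.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(twin1)) in possible


def consistency_rate(calls):
    """Fraction of sites whose four genotypes are mutually consistent."""
    if not calls:
        raise ValueError("calls must be non-empty")
    return sum(is_quartet_consistent(*call) for call in calls) / len(calls)
```

Sites where the twins disagree, or where the twins' genotype cannot be inherited from the parents, are candidates for artifact calls; the share of consistent sites gives a rough proxy for precision in regions that lack a benchmark.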


Subject(s)
Benchmarking , Genome, Human , Humans , Reproducibility of Results , Polymorphism, Single Nucleotide , Germ Cells , High-Throughput Nucleotide Sequencing/methods
2.
BMC Genomics ; 23(1): 324, 2022 Apr 23.
Article in English | MEDLINE | ID: mdl-35461238

ABSTRACT

BACKGROUND: Structural variants (SVs) play a crucial role in gene regulation, trait association, and disease in humans. SV genotyping has been extensively applied in genomics research and clinical diagnosis. Although a growing number of SV genotyping methods for long reads have been developed, a comprehensive performance assessment of these methods has yet to be done. RESULTS: Based on one simulated and three real SV datasets, we performed an in-depth evaluation of five SV genotyping methods, including cuteSV, LRcaller, Sniffles, SVJedi, and VaPoR. The results show that for insertions and deletions, cuteSV and LRcaller have similar F1 scores (cuteSV, insertions: 0.69-0.90, deletions: 0.77-0.90 and LRcaller, insertions: 0.67-0.87, deletions: 0.74-0.91) and are superior to other methods. For duplications, inversions, and translocations, LRcaller yields the most accurate genotyping results (0.84, 0.68, and 0.47, respectively). When genotyping SVs located in tandem repeat regions or with imprecise breakpoints, cuteSV (insertions and deletions) and LRcaller (duplications, inversions, and translocations) are better than other methods. In addition, we observed a decrease in F1 scores when the SV size increased. Finally, our analyses suggest that the F1 scores of these methods reach the point of diminishing returns at 20× depth of coverage. CONCLUSIONS: We present an in-depth benchmark study of long-read SV genotyping methods. Our results highlight the advantages and disadvantages of each genotyping method, which provide practical guidance for optimal application selection and prospective directions for tool improvement.
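
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the standard formula only; how true positives, false positives, and false negatives were counted in the study is not stated in the abstract, so the inputs here are assumptions.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall.

    tp: genotypes called correctly
    fp: genotypes called but wrong or absent from the truth set
    fn: truth-set genotypes that were missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method cannot reach a high F1 by trading away recall for precision or the reverse.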


Subject(s)
Genomics , High-Throughput Nucleotide Sequencing , Genomic Structural Variation , Genomics/methods , Genotype , Genotyping Techniques , Humans , Prospective Studies , Sequence Analysis, DNA/methods
3.
Gigascience ; 8(7), 2019 07 01.
Article in English | MEDLINE | ID: mdl-31289832

ABSTRACT

BACKGROUND: The blood clam, Scapharca (Anadara) broughtonii, is an economically and ecologically important marine bivalve of the family Arcidae. Efforts to study its population genetics, breeding, cultivation, and stock enrichment have been somewhat hindered by the lack of a reference genome. Herein, we report the complete genome sequence of S. broughtonii, the first reference genome for the family Arcidae. FINDINGS: A total of 75.79 Gb of clean data were generated with the Pacific Biosciences and Oxford Nanopore platforms, which represented approximately 86× coverage of the S. broughtonii genome. De novo assembly of these long reads resulted in an 884.5-Mb genome, with a contig N50 of 1.80 Mb and scaffold N50 of 45.00 Mb. Genome Hi-C scaffolding resulted in 19 chromosomes containing 99.35% of bases in the assembled genome. Genome annotation revealed that nearly half of the genome (46.1%) is composed of repeated sequences, while 24,045 protein-coding genes were predicted and 84.7% of them were annotated. CONCLUSIONS: We report here a chromosomal-level assembly of the S. broughtonii genome based on long-read sequencing and Hi-C scaffolding. The genomic data can serve as a reference for the family Arcidae and will provide a valuable resource for the scientific community and aquaculture sector.
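
The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation, with a hypothetical function name and toy input, could look like this:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")
```

For lengths 2 through 10 (54 bases in total), the three longest sequences (10, 9, 8) already cover 27 bases, so the N50 is 8.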


Subject(s)
Bivalvia/genetics , Chromosomes/genetics , Genome , Animals , Contig Mapping , Molecular Sequence Annotation , Whole Genome Sequencing
4.
J Exp Bot ; 69(18): 4403-4417, 2018 08 14.
Article in English | MEDLINE | ID: mdl-29860476

ABSTRACT

Arabidopsis Senescence-Associated Subtilisin Protease (SASP) has previously been reported to participate in leaf senescence and in the development of inflorescences and siliques. Here, we describe a new role of SASP in the regulation of abscisic acid (ABA) signaling. SASP encodes a subtilase and its expression was considerably induced by darkness, ABA, and ethylene treatments. sasp knockout mutants displayed obvious developmental phenotypes such as early flowering and smaller leaves. In particular, the sasp mutants exhibited enhanced ABA sensitivity during seed germination and seedling growth, heightened ABA-mediated leaf senescence, and increased production of reactive oxygen species (ROS). Importantly, the sasp mutants also showed remarkably increased tolerance to drought, with expression of six ABA signaling-related genes being either up- or down-regulated following ABA treatment. Interaction assays demonstrated that SASP physically interacts with OPEN STOMATA 1 (OST1) at the cell periphery. Co-expression of SASP and OST1 led to degradation of OST1, whereas this degradation was impaired in sasp-1 protoplasts. ROS attenuation assays demonstrated that in sasp-1 mutant guard cells the attenuation rate markedly decreased. Taken together, the results demonstrate that SASP plays an important role in regulating ABA signaling and drought tolerance through interaction with OST1.


Subject(s)
Abscisic Acid/metabolism , Arabidopsis Proteins/genetics , Arabidopsis/physiology , Droughts , Protein Kinases/genetics , Signal Transduction/genetics , Subtilisins/genetics , Arabidopsis/genetics , Arabidopsis Proteins/metabolism , Protein Kinases/metabolism , Subtilisins/metabolism
5.
Plant Mol Biol ; 96(6): 563-575, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29525832

ABSTRACT

KEY MESSAGE: The relationships between transcription and methylation were revealed in Arabidopsis thaliana NB-LRR-encoding genes in wild type (Col-0) and different mutants. Plant nucleotide-binding, leucine-rich repeat (NB-LRR) proteins constitute a large family that plays predominant roles in disease resistance. However, the regulation of NB-LRR-encoding genes at the transcriptional level is still poorly understood. Recently, DNA cytosine methylation in eukaryotes has been described as serving an important function in regulating gene expression. Here, we analysed the DNA methylation patterns of NB-LRR-encoding genes in Arabidopsis thaliana in samples from a wild type (Col-0) and ago4, met1, cmt3, drm1/2, and ddm1 mutants. Our results revealed that the vast majority of the NB-LRR-encoding genes in Col-0 were methylated, and the DNA methylation occurred predominantly in the CG sequence context. Moreover, DNA methylation was widely distributed in both the promoters and the bodies of most NB-LRR-encoding genes. Our results also showed that the loss of AGO4, MET1, CMT3, DRM1/2 or DDM1 functions generally led to decreased cytosine methylation in the NB-LRR-encoding genes. Analysis of the available transcriptome data from the wild type and the met1, cmt3, drm1/2 and ddm1 mutants revealed that differences in the transcription levels between the wild type and mutants were statistically significant for 63 of the NB-LRR-encoding genes. Of these genes, 38 were significantly upregulated, and the other 25 were significantly downregulated. Some NB-LRR-encoding genes with differential expression levels, which were revealed by the mRNA-Seq data, were confirmed to be significantly upregulated or downregulated in the mutants compared to the wild type by using quantitative RT-PCR. These data suggest that some Arabidopsis NB-LRR-encoding genes are likely to be regulated by altered DNA methylation patterns.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , DNA Methylation , Gene Expression Regulation, Plant , Proteins/genetics , 5' Untranslated Regions/genetics , Cluster Analysis , Gene Expression Profiling/methods , Introns/genetics , Leucine-Rich Repeat Proteins , Mutation , Promoter Regions, Genetic/genetics
6.
Article in English | MEDLINE | ID: mdl-24438240

ABSTRACT

Parabramis pekinensis strenosoma belongs to the family Cyprinidae. In the present study, we obtained the complete mitochondrial genome of P. pekinensis strenosoma by PCR amplification and DNA sequencing. It is a circular double-stranded DNA molecule of 16,623 base pairs in length, with the typical structure of 22 transfer RNA genes, 13 protein-coding genes, 2 ribosomal RNA genes, and 2 main non-coding regions (the control region and the origin of light-strand replication). The genome shares 99.58% nucleotide sequence similarity with that of Parabramis pekinensis; between homologous genes of the two subspecies, the largest nucleotide sequence discrepancies are observed in ND2 among protein-coding genes and in tRNA-Ala among tRNA genes. The complete mitochondrial genome sequence data are of great use for phylogenetic analysis and studies of population genetics and germplasm resources of P. pekinensis strenosoma.
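
A nucleotide similarity figure such as the one above is commonly computed as percent identity over an alignment. The abstract does not say which alignment tool or gap treatment was used, so the sketch below is only one plausible convention: it assumes two pre-aligned sequences of equal length with "-" for gaps, counts a gap against a base as a mismatch, and uses a hypothetical function name.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)
```

Applying the same function gene by gene is one way to locate the homologous genes with the largest divergence.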


Subject(s)
Cyprinidae/genetics , Genome, Mitochondrial/genetics , Sequence Analysis, DNA , Animals , Genes, rRNA , Molecular Sequence Annotation , Molecular Sequence Data , Open Reading Frames/genetics , RNA, Transfer/genetics
7.
Int J Mol Sci ; 16(6): 11996-2013, 2015 May 26.
Article in English | MEDLINE | ID: mdl-26016504

ABSTRACT

Blunt snout bream (Megalobrama amblycephala) is an important fish species in China, prized for its delicacy and high economic value. Codon usage analysis can help in understanding its codon biology, mRNA translation and vertebrate evolution. Based on RNA-Seq data for M. amblycephala, high-frequency codons (CUG, AGA, GUG, CAG and GAG), as well as low-frequency ones (NUA and NCG codons), were identified. A total of 724 high-frequency codon pairs were observed. Meanwhile, 14 preferred and 199 avoided neighboring codon pairs were also identified, but almost no bias was observed when one or more intervening codons were inserted between the same pairs. Codon usage bias in the regions close to start and stop codons showed apparent heterogeneity, which extended even into the flanking nucleotide sequence. Codon usage bias (RSCU and SCUO) was related to GC3 (GC content of the 3rd nucleotide in a codon) bias. Six GO (Gene Ontology) categories and the number of methylation targets were influenced by GC3. A comparison of codon usage patterns among 23 vertebrates showed species specificities in GC content, codon usage and codon context. This work provides new insights into fish biology and new information for breeding projects.
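
Two of the measures named above, GC3 and RSCU (relative synonymous codon usage), can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: it uses the standard genetic code (NCBI translation table 1), expects an in-frame coding sequence, and the function names are hypothetical. SCUO is not implemented here.

```python
from collections import Counter
from itertools import product

# Standard genetic code in NCBI table 1 order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(cds):
    """Split an in-frame coding sequence (DNA or RNA) into DNA codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc3(cds):
    """Fraction of codons whose third base is G or C."""
    thirds = [codon[2] for codon in codons(cds)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def rscu(cds):
    """RSCU per codon: observed count over the count expected if all
    synonymous codons for that amino acid were used equally."""
    counts = Counter(c for c in codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values
```

An RSCU above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks an underused codon. For the toy sequence CUGCUGCUGCUA, the six-codon leucine family gives CTG an RSCU of 4.5 and CTA an RSCU of 1.5, with a GC3 of 0.75.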


Subject(s)
Codon , Cyprinidae/genetics , Sequence Analysis, RNA/methods , Animals , Cluster Analysis , Evolution, Molecular , Open Reading Frames