Results 1 - 20 of 119
1.
Sci Data ; 11(1): 540, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796485

ABSTRACT

Amongst fishes, zebrafish (Danio rerio) has gained popularity as a model system over most other species and while their value as a model is well documented, their usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as studying behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species, has a highly complex behavioral repertoire and has been the subject of many ethological investigations but lacks genomic resources. Here we report the reference genome assembly of M. opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of 483,077,705 base pairs (~483 Mb) on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein coding genes and assigned ~90% of them to orthogroups.


Subject(s)
Fishes , Genome , Animals , Fishes/genetics
2.
Res Sq ; 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38712074

ABSTRACT

Reference genomes of cattle and sheep have lacked contiguous assemblies of the sex-determining Y chromosome. We assembled complete and gapless telomere-to-telomere (T2T) Y chromosomes for these species. The pseudo-autosomal regions were similar in length, but the total chromosome size was substantially different, with the cattle Y more than twice the length of the sheep Y. The length disparity was accounted for by an expanded ampliconic region in cattle. The genic amplification in cattle contrasts with pseudogenization in sheep, suggesting opposite evolutionary mechanisms since their divergence 18 MYA. The centromeres also differed dramatically despite the close relationship between these species at the overall genome sequence level. These Y chromosomes have been added to the current reference assemblies in GenBank, opening new opportunities for the study of evolution and variation while supporting efforts to improve sustainability in these important livestock species that generally use sire-driven genetic improvement strategies.

3.
Nat Methods ; 21(6): 967-970, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38730258

ABSTRACT

Despite advances in long-read sequencing technologies, constructing a near telomere-to-telomere assembly is still computationally demanding. Here we present hifiasm (UL), an efficient de novo assembly algorithm combining multiple sequencing technologies to scale up population-wide near telomere-to-telomere assemblies. Applied to 22 human and two plant genomes, our algorithm produces better diploid assemblies at a cost an order of magnitude lower than existing methods, and it also works with polyploid genomes.


Subject(s)
Algorithms , Diploidy , Polyploidy , Telomere , Humans , Telomere/genetics , Genome, Plant , Genome, Human , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods
4.
Nature ; 629(8010): 136-145, 2024 May.
Article in English | MEDLINE | ID: mdl-38570684

ABSTRACT

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.


Subject(s)
Centromere , Evolution, Molecular , Genetic Variation , Animals , Humans , Centromere/genetics , Centromere/metabolism , Centromere Protein A/metabolism , DNA Methylation/genetics , DNA, Satellite/genetics , Kinetochores/metabolism , Macaca/genetics , Pan troglodytes/genetics , Polymorphism, Single Nucleotide/genetics , Pongo/genetics , Male , Female , Reference Standards , Chromatin Immunoprecipitation , Haplotypes , Mutation , Gene Amplification , Sequence Alignment , Chromatin/genetics , Chromatin/metabolism , Species Specificity
5.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subject(s)
Nanopores , Humans , Sequence Analysis, DNA/methods , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Software , Genomics/methods
6.
Genome Res ; 34(3): 498-513, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38508693

ABSTRACT

Hydractinia is a colonial marine hydroid that shows remarkable biological properties, including the capacity to regenerate its entire body throughout its lifetime, a process made possible by its adult migratory stem cells, known as i-cells. Here, we provide an in-depth characterization of the genomic structure and gene content of two Hydractinia species, Hydractinia symbiolongicarpus and Hydractinia echinata, placing them in a comparative evolutionary framework with other cnidarian genomes. We also generated and annotated a single-cell transcriptomic atlas for adult male H. symbiolongicarpus and identified cell-type markers for all major cell types, including key i-cell markers. Orthology analyses based on the markers revealed that Hydractinia's i-cells are highly enriched in genes that are widely shared amongst animals, a striking finding given that Hydractinia has a higher proportion of phylum-specific genes than any of the other 41 animals in our orthology analysis. These results indicate that Hydractinia's stem cells and early progenitor cells may use a toolkit shared with all animals, making it a promising model organism for future exploration of stem cell biology and regenerative medicine. The genomic and transcriptomic resources for Hydractinia presented here will enable further studies of their regenerative capacity, colonial morphology, and ability to distinguish self from nonself.


Subject(s)
Genome , Hydrozoa , Animals , Hydrozoa/genetics , Evolution, Molecular , Transcriptome , Stem Cells/metabolism , Male , Phylogeny , Single-Cell Analysis/methods
7.
bioRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38529488

ABSTRACT

The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.

8.
bioRxiv ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38529499

ABSTRACT

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.

9.
Nat Methods ; 21(1): 41-49, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38036856

ABSTRACT

Complete, telomere-to-telomere (T2T) genome assemblies promise improved analyses and the discovery of new variants, but many essential genomic resources remain associated with older reference genomes. Thus, there is a need to translate genomic features and read alignments between references. Here we describe a method called levioSAM2 that performs fast and accurate lift-over between assemblies using a whole-genome map. In addition to enabling the use of several references, we demonstrate that aligning reads to a high-quality reference (for example, T2T-CHM13) and lifting to an older reference (for example, Genome Reference Consortium (GRC)h38) improves the accuracy of the resulting variant calls on the old reference. By leveraging the quality improvements of T2T-CHM13, levioSAM2 reduces small and structural variant calling errors compared with GRC-based mapping using real short- and long-read datasets. Performance is especially improved for a set of complex medically relevant genes, where the GRC references are lower quality.


Subject(s)
Genome , Genomics , Sequence Analysis, DNA/methods , Genomics/methods , Chromosome Mapping , High-Throughput Nucleotide Sequencing
10.
Genes (Basel) ; 14(12), 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38137031

ABSTRACT

BACKGROUND: Insects are a sustainable source of protein for human food and animal feed. We present a genome assembly, CRISPR gene editing, and life stage-specific transcriptomes for the yellow mealworm, Tenebrio molitor, one of the most intensively farmed insects worldwide. METHODS: Long and short reads and long-range data were obtained from a T. molitor male pupa. Sequencing transcripts from 12 T. molitor life stages resulted in 279 million reads for gene prediction and genetic engineering. A unique plasmid delivery system containing guide RNAs targeting the eye color gene vermilion flanking the muscle actin gene promoter and EGFP marker was used in CRISPR/Cas9 transformation. RESULTS: The assembly is approximately 53% of the genome size of 756.8 ± 9.6 Mb, measured using flow cytometry. Assembly was complicated by a satellitome of at least 11 highly conserved satDNAs occupying 28% of the genome. The injection of the plasmid into embryos resulted in knock-out of Tm vermilion and knock-in of EGFP. CONCLUSIONS: The genome of T. molitor is longer than current assemblies (including ours) due to a substantial amount (26.5%) of only one highly abundant satellite DNA sequence. Genetic sequences and transformation tools for an insect important to the food and feed industries will promote the sustainable utilization of mealworms and other farmed insects.


Subject(s)
Tenebrio , Animals , Male , Humans , Tenebrio/genetics , Tenebrio/metabolism , RNA, Guide, CRISPR-Cas Systems , Eye Color , Animal Feed/analysis , Larva/metabolism
11.
bioRxiv ; 2023 Aug 27.
Article in English | MEDLINE | ID: mdl-37786714

ABSTRACT

Hydractinia is a colonial marine hydroid that exhibits remarkable biological properties, including the capacity to regenerate its entire body throughout its lifetime, a process made possible by its adult migratory stem cells, known as i-cells. Here, we provide an in-depth characterization of the genomic structure and gene content of two Hydractinia species, H. symbiolongicarpus and H. echinata, placing them in a comparative evolutionary framework with other cnidarian genomes. We also generated and annotated a single-cell transcriptomic atlas for adult male H. symbiolongicarpus and identified cell type markers for all major cell types, including key i-cell markers. Orthology analyses based on the markers revealed that Hydractinia's i-cells are highly enriched in genes that are widely shared amongst animals, a striking finding given that Hydractinia has a higher proportion of phylum-specific genes than any of the other 41 animals in our orthology analysis. These results indicate that Hydractinia's stem cells and early progenitor cells may use a toolkit shared with all animals, making it a promising model organism for future exploration of stem cell biology and regenerative medicine. The genomic and transcriptomic resources for Hydractinia presented here will enable further studies of their regenerative capacity, colonial morphology, and ability to distinguish self from non-self.

12.
bioRxiv ; 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37609174

ABSTRACT

Over the decades, a small number of model species, each representative of a larger taxon, have dominated the field of biological research. Amongst fishes, zebrafish (Danio rerio) has gained popularity over most other species and while their value as a model is well documented, their usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as studying behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species from Southeast Asia, has a highly complex behavioral repertoire and has been the subject of many ethological investigations, but lacks genomic resources. Here we report the reference genome assembly of Macropodus opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of ≈483 Mb on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein coding genes and assigned ≈90% of them to orthogroups. Completeness analysis showed that 98.5% of the Actinopterygii core gene set (ODB10) was present as a complete ortholog in our reference genome, with a further 1.2% being present in a fragmented form. Additionally, we cloned multiple genes important during early development and, using newly developed in situ hybridization protocols, we showed that they have conserved expression patterns.

13.
Nature ; 621(7978): 344-354, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.


Subject(s)
Chromosomes, Human, Y , Genomics , Sequence Analysis, DNA , Humans , Base Sequence , Chromosomes, Human, Y/genetics , DNA, Satellite/genetics , Genetic Variation/genetics , Genetics, Population , Genomics/methods , Genomics/standards , Heterochromatin/genetics , Multigene Family/genetics , Reference Standards , Segmental Duplications, Genomic/genetics , Sequence Analysis, DNA/standards , Tandem Repeat Sequences/genetics , Telomere/genetics
14.
bioRxiv ; 2023 May 30.
Article in English | MEDLINE | ID: mdl-37398417

ABSTRACT

We completely sequenced and assembled all centromeres from a second human genome and used two reference sets to benchmark genetic, epigenetic, and evolutionary variation within centromeres from a diversity panel of humans and apes. We find that centromere single-nucleotide variation can increase by up to 4.1-fold relative to other genomic regions, with the caveat that up to 45.8% of centromeric sequence, on average, cannot be reliably aligned with current methods due to the emergence of new α-satellite higher-order repeat (HOR) structures and two to threefold differences in the length of the centromeres. The extent to which this occurs differs depending on the chromosome and haplotype. Comparing the two sets of complete human centromeres, we find that eight harbor distinctly different α-satellite HOR array structures and four contain novel α-satellite HOR variants in high abundance. DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by at least 500 kbp, a property not readily associated with novel α-satellite HORs. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan, and macaque genomes. Comparative analyses reveal nearly complete turnover of α-satellite HORs, but with idiosyncratic changes in structure characteristic of each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the p- and q-arms of human chromosomes and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

15.
ArXiv ; 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37332563

ABSTRACT

Despite recent advances in the length and the accuracy of long-read data, building haplotype-resolved genome assemblies from telomere to telomere still requires considerable computational resources. In this study, we present an efficient de novo assembly algorithm that combines multiple sequencing technologies to scale up population-wide telomere-to-telomere assemblies. By utilizing twenty-two human and two plant genomes, we demonstrate that our algorithm is around an order of magnitude cheaper than existing methods, while producing better diploid and haploid assemblies. Notably, our algorithm is the only feasible solution to the haplotype-resolved assembly of polyploid genomes.

16.
Biomolecules ; 13(4), 2023 Mar 24.
Article in English | MEDLINE | ID: mdl-37189337

ABSTRACT

Background: The house cricket, Acheta domesticus, is one of the most farmed insects worldwide and the foundation of an emerging industry using insects as a sustainable food source. Edible insects present a promising alternative for protein production amid a plethora of reports on climate change and biodiversity loss largely driven by agriculture. As with other crops, genetic resources are needed to improve crickets for food and other applications. Methods: We present the first high quality annotated genome assembly of A. domesticus from long read data and scaffolded to chromosome level, providing information needed for genetic manipulation. Results: Gene groups related to immunity were annotated and will be useful for improving value to insect farmers. Metagenome scaffolds in the A. domesticus assembly, including Invertebrate Iridescent Virus 6 (IIV6), were submitted as host-associated sequences. We demonstrate both CRISPR/Cas9-mediated knock-in and knock-out of A. domesticus and discuss implications for the food, pharmaceutical, and other industries. RNAi was demonstrated to disrupt the function of the vermilion eye-color gene producing a useful white-eye biomarker phenotype. Conclusions: We are utilizing these data to develop technologies for downstream commercial applications, including more nutritious and disease-resistant crickets, as well as lines producing valuable bioproducts, such as vaccines and antibiotics.


Subject(s)
Gryllidae , Animals , Gryllidae/genetics , Gryllidae/metabolism , Agriculture , Crops, Agricultural , Allergens/metabolism , Genetic Engineering
17.
Nature ; 617(7960): 335-343, 2023 May.
Article in English | MEDLINE | ID: mdl-37165241

ABSTRACT

The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium's CHM13 assembly (T2T-CHM13), provided a model of their homology, it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain pseudo-homologous regions (PHRs) indicative of recombination between non-homologous sequences. Utilizing an all-to-all comparison of the human pangenome from the Human Pangenome Reference Consortium (HPRC), we find that contigs from all of the SAACs form a community. A variation graph constructed from centromere-spanning acrocentric contigs indicates the presence of regions in which most contigs appear nearly identical between heterologous acrocentric chromosomes in T2T-CHM13. Except on chromosome 15, we observe faster decay of linkage disequilibrium in the pseudo-homologous regions than in the corresponding short and long arms, indicating higher rates of recombination. The pseudo-homologous regions include sequences that have previously been shown to lie at the breakpoint of Robertsonian translocations, and their arrangement is compatible with crossover in inverted duplications on chromosomes 13, 14 and 21. The ubiquity of signals of recombination between heterologous acrocentric chromosomes seen in the HPRC draft pangenome suggests that these shared sequences form the basis for recurrent Robertsonian translocations, providing sequence and population-based confirmation of hypotheses first developed from cytogenetic studies 50 years ago.


Subject(s)
Centromere , Chromosomes, Human , Recombination, Genetic , Humans , Centromere/genetics , Chromosomes, Human/genetics , DNA, Ribosomal/genetics , Recombination, Genetic/genetics , Translocation, Genetic/genetics , Cytogenetics , Telomere/genetics
18.
BMC Biol ; 21(1): 67, 2023 Apr 03.
Article in English | MEDLINE | ID: mdl-37013528

ABSTRACT

BACKGROUND: Channel catfish and blue catfish are the most important aquacultured species in the USA. The species do not readily intermate naturally but F1 hybrids can be produced through artificial spawning. F1 hybrids produced by mating channel catfish female with blue catfish male exhibit heterosis and provide an ideal system to study reproductive isolation and hybrid vigor. The purpose of the study was to generate high-quality chromosome-level reference genome sequences and to determine their genomic similarities and differences. RESULTS: We present high-quality reference genome sequences for both channel catfish and blue catfish, containing only 67 and 139 total gaps, respectively. We also report three pericentric chromosome inversions between the two genomes, as evidenced by long reads across the inversion junctions from distinct individuals, genetic linkage mapping, and PCR amplicons across the inversion junctions. Recombination rates within the inversional segments, detected as double crossovers, are extremely low among backcross progenies (progenies of channel catfish female × F1 hybrid male), suggesting that the pericentric inversions interrupt postzygotic recombination or survival of recombinants. Identification of channel catfish- and blue catfish-specific genes, along with expansions of immunoglobulin genes and centromeric Xba elements, provides insights into genomic hallmarks of these species. CONCLUSIONS: We generated high-quality reference genome sequences for both blue catfish and channel catfish and identified major chromosomal inversions on chromosomes 6, 11, and 24. These pericentric inversions were validated by additional sequencing analysis, genetic linkage mapping, and PCR analysis across the inversion junctions. The reference genome sequences, as well as the contrasted chromosomal architecture, should provide guidance for the interspecific breeding programs.


Subject(s)
Ictaluridae , Humans , Animals , Male , Female , Ictaluridae/genetics , Chromosome Inversion , Genetic Linkage , Genome , Chromosome Mapping
19.
bioRxiv ; 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36865218

ABSTRACT

As a step towards simplifying and reducing the cost of haplotype-resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies' (ONT) PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.

20.
Nat Biotechnol ; 41(10): 1474-1482, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36797493

ABSTRACT

The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.


Subject(s)
Diploidy , Genomics , Humans , Sequence Analysis, DNA/methods , Genomics/methods , Genome, Human/genetics , Telomere/genetics , High-Throughput Nucleotide Sequencing/methods