Results 1 - 20 of 36
1.
PLoS Biol ; 22(6): e3002661, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38829909

ABSTRACT

Deuterostomes are a monophyletic group of animals that includes Hemichordata, Echinodermata (together called Ambulacraria), and Chordata. The diversity of deuterostome body plans has made it challenging to reconstruct their ancestral condition and to decipher the genetic changes that drove the diversification of deuterostome lineages. Here, we generate chromosome-level genome assemblies of 2 hemichordate species, Ptychodera flava and Schizocardium californicum, and use comparative genomic approaches to infer the chromosomal architecture of the deuterostome common ancestor and delineate lineage-specific chromosomal modifications. We show that hemichordate chromosomes (1N = 23) exhibit remarkable chromosome-scale macrosynteny when compared to other deuterostomes and can be derived from 24 deuterostome ancestral linkage groups (ALGs). These deuterostome ALGs in turn match previously inferred bilaterian ALGs, consistent with a relatively short transition from the last common bilaterian ancestor to the origin of deuterostomes. Based on this deuterostome ALG complement, we deduced chromosomal rearrangement events that occurred in different lineages. For example, a fusion-with-mixing event produced an Ambulacraria-specific ALG that subsequently split into 2 chromosomes in extant hemichordates, while this homologous ALG further fused with another chromosome in sea urchins. Orthologous genes distributed in these rearranged chromosomes are enriched for functions in various developmental processes. We found that the deeply conserved Hox clusters are located in highly rearranged chromosomes and that maintenance of the clusters is likely due to lower densities of transposable elements within the clusters. We also provide evidence that the deuterostome-specific pharyngeal gene cluster was established via the combination of 3 pre-assembled microsyntenic blocks. We suggest that since chromosomal rearrangement events and formation of new gene clusters may change the regulatory controls of developmental genes, these events may have contributed to the evolution of diverse body plans among deuterostomes.


Subject(s)
Chromosomes , Evolution, Molecular , Genome , Phylogeny , Animals , Chromosomes/genetics , Genome/genetics , Synteny , Genetic Linkage , Chordata/genetics
2.
Front Psychiatry ; 14: 980739, 2023.
Article in English | MEDLINE | ID: mdl-37113548

ABSTRACT

Introduction: The therapeutic relationship continues to be one of the most important factors in therapeutic outcomes. Given the place of emotion in the definition of the therapeutic relationship, as well as the demonstrated positive impact that emotional expression has on therapeutic process and outcome, it stands to reason that studying the emotional exchange between the therapist and client further would be warranted. Methods: This study used a validated observational coding system, the Specific Affect Coding System (SPAFF), and a theoretical mathematical model to analyze behaviors that make up the therapeutic relationship. Specifically, the researchers used SPAFF to codify relationship-building behaviors between an expert therapist and his client over the course of six sessions. Dynamical systems mathematical modeling was also employed to create "phase space portraits" depicting the relational dynamics between the master therapist and his client over six sessions. Results: Statistical analysis was used to compare SPAFF codes and model parameters between the expert therapist and his client. The expert therapist showed stability in affect codes over six sessions while the client's affect codes appeared to be more flexible over time, though model parameters remained stable across the six sessions. Finally, phase space portraits depicted the evolution of the affective dynamics between the master therapist and his client as the relationship matured. Discussion: The clinician's ability to stay emotionally positive and relatively stable across the six sessions (relative to the client) was noteworthy. It provided a stable base from which she could explore alternative ways of relating to others whom she had allowed to dictate her actions, which is in keeping with previous research on the role of therapist facilitation of the therapeutic relationship, emotional expression within the therapeutic relationship, and the influence of these on client outcomes. These results provide a valuable foundation for future research on emotional expression as a key component of the therapeutic relationship in psychotherapy.

3.
Psychotherapy (Chic) ; 60(3): 283-294, 2023 09.
Article in English | MEDLINE | ID: mdl-36931813

ABSTRACT

This article outlines the evidence base for the use of paradoxical interventions (PIs) in individual psychotherapy. Often misunderstood, PIs have shown long-term (distal) impacts on clinical outcomes, yet a review of the existing literature on these interventions illustrates a trending decline in consideration and use within both research and applied settings. Definitions of PIs and their constituent elements are presented along with clinical examples. We conducted one meta-analysis comparing PIs with a placebo or control and another comparing PIs to other therapeutic methods. PIs demonstrated a large effect (d = 1.1, k = 17 studies) compared to controls and a medium effect size (d = .49, k = 17 studies) compared to other therapeutic methods. We included a review of several case studies using PIs as well. Among the salient findings, there is a lack of an assessment measure to track the implementation of PIs in session and of a method to track their in-session effects. Further, there is a dearth of contemporary quantitative experimental research and development of PIs. We further advocate for the development and integration of PI training and supervision into clinical education and posteducation programs, given the current data demonstrating clinical utility. (PsycInfo Database Record (c) 2023 APA, all rights reserved).


Subject(s)
Psychotherapy , Humans , Educational Status
4.
Psychother Res ; 32(2): 223-237, 2022 02.
Article in English | MEDLINE | ID: mdl-33955816

ABSTRACT

Objective: The purpose of this paper is to describe an approach to dynamical systems (DS) using a set of differential equations, and how an application of these equations can be used to address a critical element of the therapeutic relationship. Using APA's Three Approaches to Psychotherapy with a Female Client: The Next Generation and Three Approaches to Psychotherapy with a Male Client: The Next Generation videos, DS models were created for each of the six sessions with expert clinicians (Judith Beck, Leslie Greenberg, and Nancy McWilliams) from the three theoretical approaches. Method: A second-by-second observational coding system of the emotional exchanges of the therapists and clients was used as the data for the equations. Results: DS modeling allowed for a side-by-side comparison between the three approaches as well as between the two clients. Examining the graphs created by plotting the results of the DS equations (in particular, phase-space portraits) revealed that there were similarities among the three theoretical approaches, and there were notable differences between the two clients. Conclusions: DS modelling can provide researchers and clinicians with a powerful tool to investigate the complex phenomenon that is psychotherapy.


Subject(s)
Professional-Patient Relations , Psychotherapy , Female , Humans , Male , Psychotherapy/methods
5.
Nat Commun ; 12(1): 1935, 2021 04 28.
Article in English | MEDLINE | ID: mdl-33911078

ABSTRACT

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80-91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.


Subject(s)
Contig Mapping/methods , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Animals , Cattle , Haplotypes/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Zebrafish/genetics
6.
Plant Genome ; 14(1): e20072, 2021 03.
Article in English | MEDLINE | ID: mdl-33605092

ABSTRACT

Hop (Humulus lupulus L. var Lupulus) is a diploid, dioecious plant with a history of cultivation spanning more than one thousand years. Hop cones are valued for their use in brewing and contain compounds of therapeutic interest including xanthohumol. Efforts to determine how biochemical pathways responsible for desirable traits are regulated have been challenged by the large (2.8 Gb), repetitive, and heterozygous genome of hop. We present a draft haplotype-phased assembly of the Cascade cultivar genome. Our draft assembly and annotation of the Cascade genome is the most extensive representation of the hop genome to date. PacBio long-read sequences from hop were assembled with FALCON and partially phased with FALCON-Unzip. Comparative analysis of haplotype sequences provides insight into selective pressures that have driven evolution in hop. We discovered genes with greater sequence divergence enriched for stress-response, growth, and flowering functions in the draft phased assembly. With improved resolution of long terminal retrotransposons (LTRs) due to long-read sequencing, we found that hop is over 70% repetitive. We identified a homolog of cannabidiolic acid synthase (CBDAS) that is expressed in multiple tissues. The approaches we developed to analyze the draft phased assembly serve to deepen our understanding of the genomic landscape of hop and may have broader applicability to the study of other large, complex genomes.


Subject(s)
Humulus , Diploidy , Genome, Plant , Genomics , Haplotypes , Humulus/genetics
7.
Nat Biotechnol ; 39(3): 309-312, 2021 03.
Article in English | MEDLINE | ID: mdl-33288905

ABSTRACT

Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.
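
The NG50 statistic quoted in this abstract, the minimum contig length needed to cover 50% of the known genome, can be illustrated with a minimal sketch. The function name `ng50` and the toy contig lengths are assumptions made for illustration only; they are not taken from the DipAsm paper or its software.

```python
def ng50(contig_lengths, genome_size):
    """Return NG50 for a list of contig lengths and a known genome size.

    Contigs are taken from longest to shortest; NG50 is the length of the
    contig at which the running total first reaches half the genome size.
    Returns 0 if the contigs together cover less than half the genome.
    """
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return 0


# Toy values (hypothetical, arbitrary units): five contigs, genome size 200.
toy_contigs = [50, 40, 30, 20, 10]
print(ng50(toy_contigs, 200))
```

Unlike N50, which is measured against the total assembly length, NG50 is measured against the known genome size, so an incomplete assembly cannot raise the figure simply by being short.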


Subject(s)
Chromosomes, Human , Genome, Human , Haplotypes , Algorithms , Heterozygote , Humans , Polymorphism, Single Nucleotide
8.
Sci Data ; 7(1): 399, 2020 11 17.
Article in English | MEDLINE | ID: mdl-33203859

ABSTRACT

The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10-25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample data sets to both evaluate the benefits of these long accurate reads as well as for development of bioinformatic tools including genome assemblers, variant callers, and haplotyping algorithms. We present deep coverage HiFi datasets for five complex samples including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.


Subject(s)
High-Throughput Nucleotide Sequencing , Mice/genetics , Zea mays/genetics , Animals , Fragaria/genetics , Genome, Plant , Metagenome , Ranidae/genetics , Sequence Analysis, DNA
9.
G3 (Bethesda) ; 10(9): 2911-2925, 2020 09 02.
Article in English | MEDLINE | ID: mdl-32631951

ABSTRACT

In recent years, improved sequencing technology and computational tools have made de novo genome assembly more accessible. Many approaches, however, generate either an unphased or only partially resolved representation of a diploid genome, in which polymorphisms are detected but not assigned to one or the other of the homologous chromosomes. Yet chromosomal phase information is invaluable for the understanding of phenotypic trait inheritance in the cases of compound heterozygosity, allele-specific expression or cis-acting variants. Here we use a combination of tools and sequencing technologies to generate a de novo diploid assembly of the human primary cell line WI-38. First, data from PacBio single molecule sequencing and Bionano Genomics optical mapping were combined to generate an unphased assembly. Next, 10x Genomics linked reads were combined with the hybrid assembly to generate a partially phased assembly. Lastly, we developed and optimized methods to use short-read (Illumina) sequencing of flow cytometry-sorted metaphase chromosomes to provide phase information. The final genome assembly was almost fully (94%) phased with the addition of approximately 2.5-fold coverage of Illumina data from the sequenced metaphase chromosomes. The diploid nature of the final de novo genome assembly improved the resolution of structural variants between the WI-38 genome and the human reference genome. The phased WI-38 sequence data are available for browsing and download at wi38.research.calicolabs.com. Our work shows that assembling a completely phased diploid genome de novo from the DNA of a single individual is now readily achievable.


Subject(s)
Diploidy , Genome, Human , DNA , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA
10.
Ann Hum Genet ; 84(2): 125-140, 2020 03.
Article in English | MEDLINE | ID: mdl-31711268

ABSTRACT

The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of tandem repeats, as validated with orthogonal analyses. As a result, an additional 5 Mbp of pericentromeric sequences are recovered in the HiFi assembly, resulting in a 2.5-fold increase in the NG50 within 1 Mbp of the centromere (HiFi 480.6 kbp, CLR 191.5 kbp). Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Despite these shortcomings, our results suggest that HiFi may be the most effective standalone technology for de novo assembly of human genomes.


Subject(s)
Biomarkers/analysis , Genetic Variation , Genome, Human , Haploidy , Hydatidiform Mole/genetics , Sequence Analysis, DNA/methods , Single-Cell Analysis/methods , Female , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Pregnancy
11.
Nat Biotechnol ; 37(10): 1155-1162, 2019 10.
Article in English | MEDLINE | ID: mdl-31406327

ABSTRACT

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.


Subject(s)
DNA, Circular/genetics , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Base Sequence , Genetic Variation , Haplotypes , Humans
12.
Nature ; 563(7732): 501-507, 2018 11.
Article in English | MEDLINE | ID: mdl-30429615

ABSTRACT

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.


Subject(s)
Aedes/genetics , Arbovirus Infections/virology , Arboviruses , Genome, Insect/genetics , Genomics/standards , Insect Control , Mosquito Vectors/genetics , Mosquito Vectors/virology , Aedes/virology , Animals , Arbovirus Infections/transmission , Arboviruses/isolation & purification , DNA Copy Number Variations/genetics , Dengue Virus/isolation & purification , Female , Genetic Variation/genetics , Genetics, Population , Glutathione Transferase/genetics , Insecticide Resistance/drug effects , Male , Molecular Sequence Annotation , Multigene Family/genetics , Pyrethrins/pharmacology , Reference Standards , Sex Determination Processes/genetics
13.
Psychotherapy (Chic) ; 55(4): 461-472, 2018 12.
Article in English | MEDLINE | ID: mdl-30335458

ABSTRACT

Although emotion has long been considered important to psychotherapeutic process, empirical assessment of its impact has emerged only recently. The present study applied two meta-analyses to explore the association between therapist expression of emotion and psychotherapy outcome, and client expression of emotion and psychotherapy outcome. Overall, 66 studies (13 for the therapist meta-analysis and 43 for the client meta-analysis) were included. A significant medium effect size was found between the therapist's emotional expression and outcomes (d = 0.56) and a significant medium-to-large effect size between the client's emotional expression and outcomes (d = 0.85). Third-party rating of emotional expression emerged as a significant moderator of outcomes. Limitations of the research, diversity considerations, and therapeutic practices that conclude the article are then presented. (PsycINFO Database Record (c) 2018 APA, all rights reserved).


Subject(s)
Attitude of Health Personnel , Emotions , Mental Disorders/psychology , Mental Disorders/therapy , Professional-Patient Relations , Psychotherapy/methods , Humans , Treatment Outcome
14.
Sci Rep ; 8(1): 525, 2018 01 11.
Article in English | MEDLINE | ID: mdl-29323202

ABSTRACT

There is a need to clarify relationships within the actinobacterial genus Micromonospora, the type genus of the family Micromonosporaceae, given its biotechnological and ecological importance. Here, draft genomes of 40 Micromonospora type strains and two non-type strains are made available through the Genomic Encyclopedia of Bacteria and Archaea project and used to generate a phylogenomic tree which showed they could be assigned to well supported phyletic lines that were not evident in corresponding trees based on single and concatenated sequences of conserved genes. DNA G+C ratios derived from genome sequences showed that corresponding data from species descriptions were imprecise. Emended descriptions include precise base composition data and approximate genome sizes of the type strains. antiSMASH analyses of the draft genomes show that micromonosporae have a previously unrealised potential to synthesize novel specialized metabolites. Close to one thousand biosynthetic gene clusters were detected, including NRPS, PKS, terpene and siderophore clusters that were discontinuously distributed, thereby opening up the prospect of prioritising gifted strains for natural product discovery. The distribution of key stress related genes provides an insight into how micromonosporae adapt to key environmental variables. Genes associated with plant interactions highlight the potential use of micromonosporae in agriculture and biotechnology.


Subject(s)
Genome, Bacterial , Industrial Microbiology/methods , Micromonospora/genetics , Phylogeny , Base Composition , Micromonospora/classification , Micromonospora/metabolism
16.
Nat Commun ; 8(1): 1899, 2017 12 01.
Article in English | MEDLINE | ID: mdl-29196618

ABSTRACT

Crassulacean acid metabolism (CAM) is a water-use efficient adaptation of photosynthesis that has evolved independently many times in diverse lineages of flowering plants. We hypothesize that convergent evolution of protein sequence and temporal gene expression underpins the independent emergences of CAM from C3 photosynthesis. To test this hypothesis, we generate a de novo genome assembly and genome-wide transcript expression data for Kalanchoë fedtschenkoi, an obligate CAM species within the core eudicots with a relatively small genome (~260 Mb). Our comparative analyses identify signatures of convergence in protein sequence and re-scheduling of diel transcript expression of genes involved in nocturnal CO2 fixation, stomatal movement, heat tolerance, circadian clock, and carbohydrate metabolism in K. fedtschenkoi and other CAM species in comparison with non-CAM species. These findings provide new insights into molecular convergence and building blocks of CAM and will facilitate CAM-into-C3 photosynthesis engineering to enhance water-use efficiency in crops.


Subject(s)
Acids/metabolism , Evolution, Molecular , Genome, Plant , Kalanchoe/genetics , Carbon Dioxide/metabolism , Gene Duplication , Kalanchoe/classification , Kalanchoe/metabolism , Photosynthesis , Phylogeny , Plants/classification , Plants/genetics , Plants/metabolism , Water/metabolism
17.
Nature ; 546(7659): 524-527, 2017 06 22.
Article in English | MEDLINE | ID: mdl-28605751

ABSTRACT

Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.


Subject(s)
Genome, Plant/genetics , High-Throughput Nucleotide Sequencing/methods , Single Molecule Imaging/methods , Zea mays/genetics , Centromere/genetics , Chromosomes, Plant/genetics , Contig Mapping , Crops, Agricultural/genetics , DNA Transposable Elements/genetics , DNA, Intergenic/genetics , Genes, Plant/genetics , Molecular Sequence Annotation , Optics and Photonics , Phylogeny , RNA, Messenger/analysis , RNA, Messenger/genetics , Reference Standards , Sorghum/genetics
18.
Genome Res ; 27(5): 849-864, 2017 05.
Article in English | MEDLINE | ID: mdl-28396521

ABSTRACT

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.


Subject(s)
Contig Mapping/methods , Genome, Human , Genomics/methods , Sequence Analysis, DNA/methods , Software , Contig Mapping/standards , Genomics/standards , Haploidy , Haplotypes , Humans , Polymorphism, Genetic , Reference Standards , Sequence Analysis, DNA/standards
19.
BMC Genomics ; 18(1): 95, 2017 01 18.
Article in English | MEDLINE | ID: mdl-28100185

ABSTRACT

BACKGROUND: The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. RESULTS: By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. CONCLUSIONS: The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.


Subject(s)
Gadus morhua/genetics , Genomics/methods , Tandem Repeat Sequences/genetics , Animals , Heterozygote , Molecular Sequence Annotation , Promoter Regions, Genetic , Sequence Analysis, DNA
20.
Genome Res ; 27(5): 677-685, 2017 05.
Article in English | MEDLINE | ID: mdl-27895111

ABSTRACT

In an effort to more fully understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ∼16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ∼59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection.


Subject(s)
Contig Mapping/methods , Genome, Human , Genomic Structural Variation , Haploidy , Sequence Analysis, DNA/methods , Contig Mapping/standards , Human Genome Project , Humans , Sequence Analysis, DNA/standards