Search | BVS Regional Portal (test)

Inverted triplications formed by iterative template switches generate structural variant diversity at genomic disorder loci.

Grochowski, Christopher M; Bengtsson, Jesse D; Du, Haowei; Gandhi, Mira; Lun, Ming Yin; Mehaffey, Michele G; Park, KyungHee; Höps, Wolfram; Benito, Eva; Hasenfeld, Patrick; Korbel, Jan O; Mahmoud, Medhat; Paulin, Luis F; Jhangiani, Shalini N; Hwang, James Paul; Bhamidipati, Sravya V; Muzny, Donna M; Fatih, Jawid M; Gibbs, Richard A; Pendleton, Matthew; Harrington, Eoghan; Juul, Sissel; Lindstrand, Anna; Sedlazeck, Fritz J; Pehlivan, Davut; Lupski, James R; Carvalho, Claudia M B.

Cell Genom ; 4(7): 100590, 2024 Jul 10.

Article in English | MEDLINE | ID: mdl-38908378

ABSTRACT

The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ~2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.

Subjects

Haplotypes, Humans, Haplotypes/genetics, Comparative Genomic Hybridization, Genomic Structural Variation/genetics, Human Genome/genetics, Gene Duplication/genetics

Break-induced replication underlies formation of inverted triplications and generates unexpected diversity in haplotype structures.

Grochowski, Christopher M; Bengtsson, Jesse D; Du, Haowei; Gandhi, Mira; Lun, Ming Yin; Mehaffey, Michele G; Park, KyungHee; Höps, Wolfram; Benito-Garagorri, Eva; Hasenfeld, Patrick; Korbel, Jan O; Mahmoud, Medhat; Paulin, Luis F; Jhangiani, Shalini N; Muzny, Donna M; Fatih, Jawid M; Gibbs, Richard A; Pendleton, Matthew; Harrington, Eoghan; Juul, Sissel; Lindstrand, Anna; Sedlazeck, Fritz J; Pehlivan, Davut; Lupski, James R; Carvalho, Claudia M B.

bioRxiv ; 2023 Oct 03.

Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping and StrandSeq, the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples, revealing a DNA segment of ~2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers, which contributes to generating diverse SV haplotypes in susceptible loci.

Assembly of 43 human Y chromosomes reveals extensive complexity and variation.

Hallast, Pille; Ebert, Peter; Loftus, Mark; Yilmaz, Feyza; Audano, Peter A; Logsdon, Glennis A; Bonder, Marc Jan; Zhou, Weichen; Höps, Wolfram; Kim, Kwondo; Li, Chong; Hoyt, Savannah J; Dishuck, Philip C; Porubsky, David; Tsetsos, Fotios; Kwon, Jee Young; Zhu, Qihui; Munson, Katherine M; Hasenfeld, Patrick; Harvey, William T; Lewis, Alexandra P; Kordosky, Jennifer; Hoekzema, Kendra; O'Neill, Rachel J; Korbel, Jan O; Tyler-Smith, Chris; Eichler, Evan E; Shi, Xinghua; Beck, Christine R; Marschall, Tobias; Konkel, Miriam K; Lee, Charles.

Nature ; 621(7978): 355-364, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date¹ and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes². Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established¹ boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.

Subjects

Human Y Chromosomes, Molecular Evolution, Humans, Male, Human Y Chromosomes/genetics, Human Genome/genetics, Genomics, Mutation Rate, Phenotype, Euchromatin/genetics, Pseudogenes, Genetic Variation/genetics, Human X Chromosomes/genetics, Pseudoautosomal Regions/genetics

Inversion polymorphism in a complete human genome assembly.

Porubsky, David; Harvey, William T; Rozanski, Allison N; Ebler, Jana; Höps, Wolfram; Ashraf, Hufsah; Hasenfeld, Patrick; Paten, Benedict; Sanders, Ashley D; Marschall, Tobias; Korbel, Jan O; Eichler, Evan E.

Genome Biol ; 24(1): 100, 2023 04 30.

Article in English | MEDLINE | ID: mdl-37122002

ABSTRACT

The telomere-to-telomere (T2T) complete human reference has significantly improved our ability to characterize genome structural variation. To understand its impact on inversion polymorphisms, we remapped data from 41 genomes against the T2T reference genome and compared it to the GRCh38 reference. We find a ~21% increase in sensitivity improving mapping of 63 inversions on the T2T reference. We identify 26 misorientations within GRCh38 and show that the T2T reference is three times more likely to represent the correct orientation of the major human allele. Analysis of 10 additional samples reveals novel rare inversions at chromosomes 15q25.2, 16p11.2, 16q22.1-23.1, and 22q11.21.

Subjects

Human Genome, Genetic Polymorphism, Humans, Genomic Structural Variation, Chromosome Inversion

The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms.

Walker, Kimberly; Kalra, Divya; Lowdon, Rebecca; Chen, Guangyi; Molik, David; Soto, Daniela C; Dabbaghie, Fawaz; Khleifat, Ahmad Al; Mahmoud, Medhat; Paulin, Luis F; Raza, Muhammad Sohail; Pfeifer, Susanne P; Agustinho, Daniel Paiva; Aliyev, Elbay; Avdeyev, Pavel; Barrozo, Enrico R; Behera, Sairam; Billingsley, Kimberley; Chong, Li Chuin; Choubey, Deepak; De Coster, Wouter; Fu, Yilei; Gener, Alejandro R; Hefferon, Timothy; Henke, David Morgan; Höps, Wolfram; Illarionova, Anastasia; Jochum, Michael D; Jose, Maria; Kesharwani, Rupesh K; Kolora, Sree Rohit Raj; Kubica, Jedrzej; Lakra, Priya; Lattimer, Damaris; Liew, Chia-Sin; Lo, Bai-Wei; Lo, Chunhsuan; Lötter, Anneri; Majidian, Sina; Mendem, Suresh Kumar; Mondal, Rajarshi; Ohmiya, Hiroko; Parvin, Nasrin; Peralta, Carolina; Poon, Chi-Lam; Prabhakaran, Ramanandan; Saitou, Marie; Sammi, Aditi; Sanio, Philippe; Sapoval, Nicolae.

F1000Res ; 11: 530, 2022.

Article in English | MEDLINE | ID: mdl-36262335

ABSTRACT

In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.

Subjects

COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, Genomics, Software

Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders.

Porubsky, David; Höps, Wolfram; Ashraf, Hufsah; Hsieh, PingHsun; Rodriguez-Martin, Bernardo; Yilmaz, Feyza; Ebler, Jana; Hallast, Pille; Maria Maggiolini, Flavia Angela; Harvey, William T; Henning, Barbara; Audano, Peter A; Gordon, David S; Ebert, Peter; Hasenfeld, Patrick; Benito, Eva; Zhu, Qihui; Lee, Charles; Antonacci, Francesca; Steinrücken, Matthias; Beck, Christine R; Sanders, Ashley D; Marschall, Tobias; Eichler, Evan E; Korbel, Jan O.

Cell ; 185(11): 1986-2005.e26, 2022 05 26.

Article in English | MEDLINE | ID: mdl-35525246

ABSTRACT

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.

Subjects

Chromosome Inversion, Genomic Segmental Duplications, Chromosome Inversion/genetics, DNA Copy Number Variations/genetics, Human Genome, Genomics, Humans

Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Ebert, Peter; Audano, Peter A; Zhu, Qihui; Rodriguez-Martin, Bernardo; Porubsky, David; Bonder, Marc Jan; Sulovari, Arvis; Ebler, Jana; Zhou, Weichen; Serra Mari, Rebecca; Yilmaz, Feyza; Zhao, Xuefang; Hsieh, PingHsun; Lee, Joyce; Kumar, Sushant; Lin, Jiadong; Rausch, Tobias; Chen, Yu; Ren, Jingwen; Santamarina, Martin; Höps, Wolfram; Ashraf, Hufsah; Chuang, Nelson T; Yang, Xiaofei; Munson, Katherine M; Lewis, Alexandra P; Fairley, Susan; Tallon, Luke J; Clarke, Wayne E; Basile, Anna O; Byrska-Bishop, Marta; Corvelo, André; Evani, Uday S; Lu, Tsung-Yu; Chaisson, Mark J P; Chen, Junjie; Li, Chong; Brand, Harrison; Wenger, Aaron M; Ghareghani, Maryam; Harvey, William T; Raeder, Benjamin; Hasenfeld, Patrick; Regier, Allison A; Abel, Haley J; Hall, Ira M; Flicek, Paul; Stegle, Oliver; Gerstein, Mark B; Tubio, Jose M C.

Science ; 372(6537), 2021 04 02.

Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.

Subjects

Genetic Variation, Human Genome, Haplotypes, Female, Genotype, High-Throughput Nucleotide Sequencing, Humans, INDEL Mutation, Interspersed Repetitive Sequences, Male, Population Groups/genetics, Quantitative Trait Loci, Retroelements, DNA Sequence Analysis, Sequence Inversion, Whole Genome Sequencing

Recurrent inversion toggling and great ape genome evolution.

Porubsky, David; Sanders, Ashley D; Höps, Wolfram; Hsieh, PingHsun; Sulovari, Arvis; Li, Ruiyang; Mercuri, Ludovica; Sorensen, Melanie; Murali, Shwetha C; Gordon, David; Cantsilieris, Stuart; Pollen, Alex A; Ventura, Mario; Antonacci, Francesca; Marschall, Tobias; Korbel, Jan O; Eichler, Evan E.

Nat Genet ; 52(8): 849-858, 2020 08.

Article in English | MEDLINE | ID: mdl-32541924

ABSTRACT

Inversions play an important role in disease and evolution but are difficult to characterize because their breakpoints map to large repeats. We increased by sixfold the number (n = 1,069) of previously reported great ape inversions by using single-cell DNA template strand and long-read sequencing. We find that the X chromosome is most enriched (2.5-fold) for inversions, on the basis of its size and duplication content. There is an excess of differentially expressed primate genes near the breakpoints of large (>100 kilobases (kb)) inversions but not smaller events. We show that when great ape lineage-specific duplications emerge, they preferentially (approximately 75%) occur in an inverted orientation compared to that at their ancestral locus. We construct megabase-pair scale haplotypes for individual chromosomes and identify 23 genomic regions that have recurrently toggled between a direct and an inverted state over 15 million years. The direct orientation is most frequently the derived state for human polymorphisms that predispose to recurrent copy number variants associated with neurodevelopmental disease.

Subjects

Chromosome Inversion/genetics, Genome/genetics, Hominidae/genetics, Animals, Chromosomes/genetics, DNA Copy Number Variations/genetics, Molecular Evolution, Female, Haplotypes/genetics, Humans, Male

Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

Höps, Wolfram; Jeffryes, Matt; Bateman, Alex.

F1000Res ; 7: 261, 2018.

Article in English | MEDLINE | ID: mdl-29721311

ABSTRACT

We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools, there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code are available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
