Results 1 - 20 of 29
1.
J Immunol ; 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39007649

ABSTRACT

The expressed Ab repertoire is a critical determinant of immune-related phenotypes. Ab-encoding transcripts are distinct from other expressed genes because they are transcribed from somatically rearranged gene segments. Human Abs are composed of two identical H and L chain polypeptides derived from genes in the IGH locus and one of two L chain loci. The combinatorial diversity that results from Ab gene rearrangement and the pairing of different H and L chains contributes to the immense diversity of the baseline Ab repertoire. During rearrangement, Ab gene selection is mediated by factors that influence chromatin architecture, promoter/enhancer activity, and V(D)J recombination. Interindividual variation in the composition of the Ab repertoire associates with germline variation in IGH, implicating polymorphism in Ab gene regulation. Determining how IGH variants directly mediate gene regulation will require integration of these variants with other functional genomic datasets. In this study, we argue that standard approaches using short reads have limited utility for characterizing regulatory regions in IGH at haplotype resolution. Using simulated and chromatin immunoprecipitation sequencing reads, we define features of IGH that limit the use of short reads and a single reference genome, namely 1) the highly duplicated nature of the DNA sequence in IGH and 2) structural polymorphisms that are frequent in the population. We demonstrate that personalized diploid references enhance the performance of short-read data for characterizing mappable portions of the locus, while also showing that long-read profiling tools will ultimately be needed to fully resolve the functional impacts of IGH germline variation on expressed Ab repertoires.

3.
Genes Immun ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844673

ABSTRACT

Immunoglobulins (IGs), critical components of the human immune system, are composed of heavy and light protein chains encoded at three genomic loci. The IG Kappa (IGK) chain locus consists of two large, inverted segmental duplications. The complexity of the IG loci has hindered use of standard high-throughput methods for characterizing genetic variation within these regions. To overcome these limitations, we use long-read sequencing to create haplotype-resolved IGK assemblies in an ancestrally diverse cohort (n = 36), representing the first comprehensive description of IGK haplotype variation. We identify extensive locus polymorphism, including novel single nucleotide variants (SNVs) and novel structural variants harboring functional IGKV genes. Among 47 functional IGKV genes, we identify 145 alleles, 67 of which were not previously curated. We report inter-population differences in allele frequencies for 10 IGKV genes, including alleles unique to specific populations within this dataset. We identify haplotypes carrying signatures of gene conversion that associate with SNV enrichment in the IGK distal region, and a haplotype with an inversion spanning the proximal and distal regions. These data provide a critical resource of curated genomic reference information from diverse ancestries, laying a foundation for advancing our understanding of population-level genetic variation in the IGK locus.

4.
Nat Immunol ; 25(6): 1073-1082, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38816615

ABSTRACT

A key barrier to the development of vaccines that induce broadly neutralizing antibodies (bnAbs) against human immunodeficiency virus (HIV) and other viruses of high antigenic diversity is the design of priming immunogens that induce rare bnAb-precursor B cells. The high neutralization breadth of the HIV bnAb 10E8 makes elicitation of 10E8-class bnAbs desirable; however, the recessed epitope within gp41 makes envelope trimers poor priming immunogens and requires that 10E8-class bnAbs possess a long heavy chain complementarity determining region 3 (HCDR3) with a specific binding motif. We developed germline-targeting epitope scaffolds with affinity for 10E8-class precursors and engineered nanoparticles for multivalent display. Scaffolds exhibited epitope structural mimicry and bound bnAb-precursor human naive B cells in ex vivo screens, protein nanoparticles induced bnAb-precursor responses in stringent mouse models and rhesus macaques, and mRNA-encoded nanoparticles triggered similar responses in mice. Thus, germline-targeting epitope scaffold nanoparticles can elicit rare bnAb-precursor B cells with predefined binding specificities and HCDR3 features.


Subjects
AIDS Vaccines , Neutralizing Antibodies , HIV Antibodies , HIV Envelope Protein gp41 , HIV Infections , HIV-1 , Macaca mulatta , Animals , Humans , HIV Envelope Protein gp41/immunology , HIV Antibodies/immunology , Mice , AIDS Vaccines/immunology , Neutralizing Antibodies/immunology , HIV-1/immunology , HIV Infections/immunology , HIV Infections/prevention & control , HIV Infections/virology , Vaccination , Broadly Neutralizing Antibodies/immunology , B-Lymphocytes/immunology , Nanoparticles/chemistry , Female , Complementarity Determining Regions/immunology , Epitopes/immunology
5.
Science ; 384(6697): eadj8321, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38753769

ABSTRACT

Germline-targeting immunogens hold promise for initiating the induction of broadly neutralizing antibodies (bnAbs) to HIV and other pathogens. However, antibody-antigen recognition is typically dominated by heavy chain complementarity determining region 3 (HCDR3) interactions, and vaccine priming of HCDR3-dominant bnAbs by germline-targeting immunogens has not been demonstrated in humans or outbred animals. In this work, immunization with N332-GT5, an HIV envelope trimer designed to target precursors of the HCDR3-dominant bnAb BG18, primed bnAb-precursor B cells in eight of eight rhesus macaques to substantial frequencies and with diverse lineages in germinal center and memory B cells. We confirmed bnAb-mimicking, HCDR3-dominant, trimer-binding interactions with cryo-electron microscopy. Our results demonstrate proof of principle for HCDR3-dominant bnAb-precursor priming in outbred animals and suggest that N332-GT5 holds promise for the induction of similar responses in humans.


Subjects
AIDS Vaccines , Broadly Neutralizing Antibodies , Complementarity Determining Regions , Germinal Center , HIV Antibodies , Animals , Humans , AIDS Vaccines/immunology , B-Lymphocytes/immunology , Broadly Neutralizing Antibodies/immunology , Complementarity Determining Regions/immunology , Cryoelectron Microscopy , Human Immunodeficiency Virus env Gene Products/immunology , Germinal Center/immunology , HIV Antibodies/immunology , HIV Infections/immunology , HIV Infections/prevention & control , HIV-1/immunology , Immunoglobulin Heavy Chains/immunology , Immunoglobulin Heavy Chains/genetics , Macaca mulatta , Memory B Cells/immunology
7.
bioRxiv ; 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-38014266

ABSTRACT

Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), has been shown to be of critical importance for immune responses to pathogens and vaccines. In recent years, B cell and T cell receptor repertoire sequencing (Rep-Seq) has become widespread in immunology research, making it the most readily available source of information about allelic diversity in immunoglobulin (IG) and T cell receptor (TR) loci in different populations. Here we present a novel algorithm for extra-sensitive and specific variable (V) and joining (J) gene allele inference and genotyping, allowing reconstruction of individual high-quality gene segment libraries. The approach can be applied for inferring allelic variants from peripheral blood lymphocyte BCR and TCR repertoire sequencing data, including hypermutated isotype-switched BCR sequences, thus allowing high-throughput genotyping and novel allele discovery from a wide variety of existing datasets. The developed algorithm is a part of the MiXCR software (https://mixcr.com) and can be incorporated into any pipeline utilizing upstream processing with MiXCR. We demonstrate the accuracy of this approach using Rep-Seq paired with long-read genomic sequencing data, comparing it to a widely used algorithm, TIgGER. We applied the algorithm to a large set of IG heavy chain (IGH) Rep-Seq data from 450 donors of ancestrally diverse population groups, and to the largest reported full-length TCR alpha and beta chain (TRA; TRB) Rep-Seq dataset, representing 134 individuals. This allowed us to assess the genetic diversity of genes within the IGH, TRA and TRB loci in different populations and demonstrate the connection between antibody repertoire gene usage and the number of allelic variants present in the population. Finally, we established a database of allelic variants of V and J genes inferred from Rep-Seq data and their population frequencies, with free public access at https://vdj.online.

8.
Nucleic Acids Res ; 51(16): e86, 2023 09 08.
Article in English | MEDLINE | ID: mdl-37548401

ABSTRACT

In adaptive immune receptor repertoire analysis, determining the germline variable (V) allele associated with each T- and B-cell receptor sequence is a crucial step. This process is highly impacted by allele annotations. Aligning sequences, assigning them to specific germline alleles, and inferring individual genotypes are challenging when the repertoire is highly mutated, or sequence reads do not cover the whole V region. Here, we propose an alternative naming scheme for the V alleles, as well as a novel method to infer individual genotypes. We demonstrate the strengths of the two by comparing their outcomes to other genotype inference methods. We validate the genotype approach with independent genomic long-read data. The naming scheme is compatible with current annotation tools and pipelines. Analysis results can be converted from the proposed naming scheme to the nomenclature determined by the International Union of Immunological Societies (IUIS). Both the naming scheme and the genotype procedure are implemented in a freely available R package (PIgLET https://bitbucket.org/yaarilab/piglet). To allow researchers to further explore the approach on real data and to adapt it for their uses, we also created an interactive website (https://yaarilab.github.io/IGHV_reference_book).


Subjects
Genomics , Immunoglobulin Heavy Chains , B-Cell Antigen Receptors , Alleles , Genotype , B-Cell Antigen Receptors/genetics , Immunoglobulin Heavy Chains/genetics
9.
Nat Commun ; 14(1): 4419, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37479682

ABSTRACT

Variation in the antibody response has been linked to differential outcomes in disease, and suboptimal vaccine and therapeutic responsiveness, the determinants of which have not been fully elucidated. Countering models that presume antibodies are generated largely by stochastic processes, we demonstrate that polymorphisms within the immunoglobulin heavy chain locus (IGH) impact the naive and antigen-experienced antibody repertoire, indicating that genetics predisposes individuals to mount qualitatively and quantitatively different antibody responses. We pair recently developed long-read genomic sequencing methods with antibody repertoire profiling to comprehensively resolve IGH genetic variation, including novel structural variants, single nucleotide variants, and genes and alleles. We show that IGH germline variants determine the presence and frequency of antibody genes in the expressed repertoire, including those enriched in functional elements linked to V(D)J recombination, and overlapping disease-associated variants. These results illuminate the power of leveraging IGH genetics to better understand the regulation, function, and dynamics of the antibody response in disease.


Subjects
Immunoglobulin Heavy Chain Genes , Immunoglobulin Genes , Humans , Immunoglobulin Heavy Chain Genes/genetics , Alleles , Germ-Line Mutation , Immunoglobulin Heavy Chains/genetics
10.
J Immunol ; 210(10): 1607-1619, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37027017

ABSTRACT

Current Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) methods using short-read sequencing strategies resolve expressed Ab transcripts with limited resolution of the C region. In this article, we present the near-full-length AIRR-seq (FLAIRR-seq) method that uses targeted amplification by 5' RACE, combined with single-molecule, real-time sequencing to generate highly accurate (99.99%) human Ab H chain transcripts. FLAIRR-seq was benchmarked by comparing H chain V (IGHV), D (IGHD), and J (IGHJ) gene usage, complementarity-determining region 3 length, and somatic hypermutation to matched datasets generated with standard 5' RACE AIRR-seq using short-read sequencing and full-length isoform sequencing. Together, these data demonstrate robust FLAIRR-seq performance using RNA samples derived from PBMCs, purified B cells, and whole blood, which recapitulated results generated by commonly used methods, while additionally resolving H chain gene features not documented in IMGT at the time of submission. FLAIRR-seq data provide, for the first time, to our knowledge, simultaneous single-molecule characterization of IGHV, IGHD, IGHJ, and IGHC region genes and alleles, allele-resolved subisotype definition, and high-resolution identification of class switch recombination within a clonal lineage. In conjunction with genomic sequencing and genotyping of IGHC genes, FLAIRR-seq of the IgM and IgG repertoires from 10 individuals resulted in the identification of 32 unique IGHC alleles, 28 (87%) of which were previously uncharacterized. Together, these data demonstrate the capabilities of FLAIRR-seq to characterize IGHV, IGHD, IGHJ, and IGHC gene diversity for the most comprehensive view of bulk-expressed Ab repertoires to date.


Subjects
Complementarity Determining Regions , Humans , Complementarity Determining Regions/genetics , Base Sequence
11.
Genes Immun ; 24(1): 21-31, 2023 02.
Article in English | MEDLINE | ID: mdl-36539592

ABSTRACT

Immunoglobulins (IGs), crucial components of the adaptive immune system, are encoded by three genomic loci. However, the complexity of the IG loci severely limits the effective use of short read sequencing, limiting our knowledge of population diversity in these loci. We leveraged existing long read whole-genome sequencing (WGS) data, fosmid technology, and IG targeted single-molecule, real-time (SMRT) long-read sequencing (IG-Cap) to create haplotype-resolved assemblies of the IG Lambda (IGL) locus from 6 ethnically diverse individuals. In addition, we generated 10 diploid assemblies of IGL from a diverse cohort of individuals utilizing IG-Cap. From these 16 individuals, we identified significant allelic diversity, including 36 novel IGLV alleles. In addition, we observed highly elevated single nucleotide variation (SNV) in IGLV genes relative to IGL intergenic and genomic background SNV density. By comparing SNV calls between our high quality assemblies and existing short read datasets from the same individuals, we show a high propensity for false-positives in the short read datasets. Finally, for the first time, we nucleotide-resolved common 5-10 Kb duplications in the IGLC region that contain functional IGLJ and IGLC genes. Together these data represent a significant advancement in our understanding of genetic variation and population diversity in the IGL locus.


Subjects
Immunoglobulin Genes , Immunoglobulin lambda-Chains , Humans , Immunoglobulin lambda-Chains/genetics , Genomics , Genetic Variation , Nucleotides
12.
Trends Immunol ; 44(1): 7-21, 2023 01.
Article in English | MEDLINE | ID: mdl-36470826

ABSTRACT

The recombination between immunoglobulin (IG) gene segments determines an individual's naïve antibody repertoire and, consequently, (auto)antigen recognition. Emerging evidence suggests that mammalian IG germline variation impacts humoral immune responses associated with vaccination, infection, and autoimmunity, from the molecular level of epitope specificity up to profound changes in the architecture of antibody repertoires. These links between IG germline variants and immunophenotype raise the question of the evolutionary causes and consequences of diversity within IG loci. We discuss why the extreme diversity in IG loci remains a mystery, why resolving this is important for the design of more effective vaccines and therapeutics, and how recent evidence from multiple lines of inquiry may help us do so.


Subjects
Immunoglobulin Genes , Germ-Line Mutation , Animals , Humans , Immunoglobulin Genes/genetics , Humoral Immunity/genetics , Biological Evolution , Germ Cells , Mammals
13.
Front Immunol ; 14: 1330153, 2023.
Article in English | MEDLINE | ID: mdl-38406579

ABSTRACT

Introduction: Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets only contain alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate, and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated. Methods: The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are only represented once in each set, though alternative sequence names are noted in accompanying metadata. Results and discussion: The Reference Sets include fewer than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated, and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.
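
The handling of exact paralogs described in this abstract (one representative sequence per set, with alternative names kept as metadata) can be illustrated with a minimal Python sketch. This is not the AIRR-C tooling: the function name `collapse_exact_paralogs`, the metadata layout, and the toy sequences are all assumptions made for illustration, and only the two allele names IGHV1-69*01 and IGHV1-69D*01 come from the abstract.

```python
from collections import defaultdict


def collapse_exact_paralogs(alleles):
    """Keep one entry per distinct sequence and record the other names as aliases.

    `alleles` maps an allele name to its nucleotide sequence.
    (Hypothetical helper; the returned layout is an assumption.)
    """
    names_by_sequence = defaultdict(list)
    for name, sequence in alleles.items():
        # Compare case-insensitively so "acgt" and "ACGT" count as identical.
        names_by_sequence[sequence.upper()].append(name)

    collapsed = {}
    for sequence, names in names_by_sequence.items():
        names.sort()  # deterministic choice of the representative name
        primary, *aliases = names
        collapsed[primary] = {"sequence": sequence, "aliases": aliases}
    return collapsed


# Toy sequences (invented, not real germline sequences).
example = {
    "IGHV1-69*01": "caggtgcagctg",
    "IGHV1-69D*01": "CAGGTGCAGCTG",
    "IGHV1-69*02": "CAGGTCCAGCTG",
}
result = collapse_exact_paralogs(example)
```

In this toy input the two identical sequences collapse into a single entry named IGHV1-69*01, with IGHV1-69D*01 listed as its alias, while IGHV1-69*02 remains a separate entry.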


Subjects
Immunoglobulin Genes , Immunoglobulins , Humans , Immunoglobulins/genetics , Alleles , V(D)J Recombination/genetics , Germ Cells
14.
Am J Hum Genet ; 109(6): 1065-1076, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35609568

ABSTRACT

The human genome contains tens of thousands of large tandem repeats and hundreds of genes that show common and highly variable copy-number changes. Due to their large size and repetitive nature, these variable number tandem repeats (VNTRs) and multicopy genes are generally recalcitrant to standard genotyping approaches and, as a result, this class of variation is poorly characterized. However, several recent studies have demonstrated that copy-number variation of VNTRs can modify local gene expression, epigenetics, and human traits, indicating that many have a functional role. Here, using read depth from whole-genome sequencing to profile copy number, we report results of a phenome-wide association study (PheWAS) of VNTRs and multicopy genes in a discovery cohort of ∼35,000 samples, identifying 32 traits associated with copy number of 38 VNTRs and multicopy genes at 1% FDR. We replicated many of these signals in an independent cohort and observed that VNTRs showing trait associations were significantly enriched for expression QTLs with nearby genes, providing strong support for our results. Fine-mapping studies indicated that in the majority (∼90%) of cases, the VNTRs and multicopy genes we identified represent the causal variants underlying the observed associations. Furthermore, several lie in regions where prior SNV-based GWASs have failed to identify any significant associations with these traits. Our study indicates that copy number of VNTRs and multicopy genes contributes to diverse human traits and suggests that complex structural variants potentially explain some of the so-called "missing heritability" of SNV-based GWASs.
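
The general idea of profiling copy number from read depth, which this abstract relies on, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' pipeline: the function name, the choice of a diploid baseline region, and the example depths are assumptions, and real analyses typically also correct for factors such as GC content and mappability.

```python
def estimate_copy_number(region_depth, baseline_depth, baseline_copies=2):
    """Estimate copy number of a region from mean read depth.

    Depth is assumed to scale linearly with copy number, so the region's
    depth is compared with a baseline region of known copy number
    (diploid by default). Hypothetical helper for illustration only.
    """
    if baseline_depth <= 0:
        raise ValueError("baseline_depth must be positive")
    return baseline_copies * region_depth / baseline_depth


# Invented example: a repeat region at 45x mean depth against a 30x diploid baseline.
copies = estimate_copy_number(region_depth=45.0, baseline_depth=30.0)
```

Under these invented depths the estimate is 3.0 copies, that is, one copy more than the diploid baseline.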


Subjects
DNA Copy Number Variations , Minisatellite Repeats , DNA Copy Number Variations/genetics , Human Genome , Genome-Wide Association Study , Humans , Minisatellite Repeats/genetics , Phenotype
15.
Genome Med ; 14(1): 2, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34991709

ABSTRACT

BACKGROUND: T and B cell receptor (TCR, BCR) repertoires constitute the foundation of adaptive immunity. Adaptive immune receptor repertoire sequencing (AIRR-seq) is a common approach to study immune system dynamics. Understanding the genetic factors influencing the composition and dynamics of these repertoires is of major scientific and clinical importance. The chromosomal loci encoding the variable regions of TCRs and BCRs are challenging to decipher due to repetitive elements and undocumented structural variants. METHODS: To confront this challenge, AIRR-seq-based methods have recently been developed for B cells, enabling genotype and haplotype inference and discovery of undocumented alleles. However, this approach relies on complete coverage of the receptors' variable regions, whereas most T cell studies sequence a small fraction of that region. Here, we adapted a B cell pipeline for undocumented alleles, genotype, and haplotype inference for full and partial AIRR-seq TCR data sets. The pipeline also deals with gene assignment ambiguities, which is especially important in the analysis of data sets of partial sequences. RESULTS: From the full and partial AIRR-seq TCR data sets, we identified 39 undocumented polymorphisms in T cell receptor Beta V (TRBV) and 31 undocumented 5' UTR sequences. A subset of these inferences was also observed using independent genomic approaches. We found that a single nucleotide polymorphism differentiating between the two documented T cell receptor Beta D2 (TRBD2) alleles is strongly associated with dramatic changes in the expressed repertoire. CONCLUSIONS: We reveal a rich picture of germline variability and demonstrate how a single nucleotide polymorphism dramatically affects the composition of the whole repertoire. Our findings provide a basis for annotation of TCR repertoires for future basic and clinical studies.


Subjects
High-Throughput Nucleotide Sequencing , alpha-beta T-Cell Antigen Receptors , Alleles , Germ Cells , High-Throughput Nucleotide Sequencing/methods , Humans , T-Cell Antigen Receptors/genetics , alpha-beta T-Cell Antigen Receptors/genetics
16.
Cell Genom ; 2(12): 100228, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36778049

ABSTRACT

T cell receptors (TCRs) recognize peptide fragments presented by the major histocompatibility complex (MHC) and are critical to T cell-mediated immunity. Recent data have indicated that genetic diversity within TCR-encoding gene regions is underexplored, limiting understanding of the impact of TCR loci polymorphisms on TCR function in disease, even though TCR repertoire signatures (1) are heritable and (2) associate with disease phenotypes. To address this, we developed a targeted long-read sequencing approach to generate highly accurate haplotype resolved assemblies of the TCR beta (TRB) and alpha/delta (TRA/D) loci, facilitating the genotyping of all variant types, including structural variants. We validate our approach using two mother-father-child trios and 5 unrelated donors representing multiple populations. This resulted in improved genotyping accuracy and the discovery of 84 undocumented V, D, J, and C alleles, demonstrating the utility of this framework for improving our understanding of TCR diversity and function in disease.

17.
PLoS One ; 16(12): e0261374, 2021.
Article in English | MEDLINE | ID: mdl-34898642

ABSTRACT

Lymphoblastoid cell lines (LCLs) have been critical to establishing genetic resources for biomedical science. They have been used extensively to study human genetic diversity and genome function, and to inform the development of tools and methodologies for augmenting disease genetics research. While the validity of variant callsets from LCLs has been demonstrated for most of the genome, previous work has shown that DNA extracted from LCLs is modified by V(D)J recombination within the immunoglobulin (IG) loci, regions that harbor antibody genes critical to immune system function. However, the impacts of V(D)J recombination on short read sequencing data generated from LCLs have not been extensively investigated. In this study, we used LCL-derived short read sequencing data from the 1000 Genomes Project (n = 2,504) to identify signatures of V(D)J recombination. Our analyses revealed sample-level impacts of V(D)J recombination that varied depending on the degree of inferred monoclonality. We showed that V(D)J-associated somatic deletions impacted genotyping accuracy, leading to adulterated population-level estimates of allele frequency and linkage disequilibrium. These findings illuminate limitations of using LCLs and short read data for building genetic resources in the IG loci, with implications for interpreting previous disease association studies in these regions.


Subjects
Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , V(D)J Recombination/genetics , Alleles , B-Lymphocytes/immunology , Tumor Cell Line/immunology , Tumor Cell Line/metabolism , Genetic Databases , Gene Frequency/genetics , Immunoglobulin Genes/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Immunoglobulin Variable Region/genetics , Immunoglobulins/genetics , Immunoglobulins/immunology , DNA Sequence Analysis/methods , V(D)J Recombination/immunology
19.
Am J Hum Genet ; 108(5): 809-824, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33794196

ABSTRACT

Variable number tandem repeats (VNTRs) are composed of large tandemly repeated motifs, many of which are highly polymorphic in copy number. However, because of their large size and repetitive nature, they remain poorly studied. To investigate the regulatory potential of VNTRs, we used read-depth data from Illumina whole-genome sequencing to perform association analysis between copy number of ∼70,000 VNTRs (motif size ≥ 10 bp) with both gene expression (404 samples in 48 tissues) and DNA methylation (235 samples in peripheral blood), identifying thousands of VNTRs that are associated with local gene expression (eVNTRs) and DNA methylation levels (mVNTRs). Using an independent cohort, we validated 73%-80% of signals observed in the two discovery cohorts, while allelic analysis of VNTR length and CpG methylation in 30 Oxford Nanopore genomes gave additional support for mVNTR loci, thus providing robust evidence to support that these represent genuine associations. Further, conditional analysis indicated that many eVNTRs and mVNTRs act as QTLs independently of other local variation. We also observed strong enrichments of eVNTRs and mVNTRs for regulatory features such as enhancers and promoters. Using the Human Genome Diversity Panel, we define sets of VNTRs that show highly divergent copy numbers among human populations and show that these are enriched for regulatory effects and preferentially associate with genes that have been linked with human phenotypes through GWASs. Our study provides strong evidence supporting functional variation at thousands of VNTRs and defines candidate sets of VNTRs, copy number variation of which potentially plays a role in numerous human phenotypes.


Subjects
DNA Copy Number Variations/genetics , DNA Methylation , Gene Expression Regulation , Minisatellite Repeats/genetics , Quantitative Trait Loci/genetics , Adolescent , Adult , Algorithms , Child , Preschool Child , Human X Chromosomes/genetics , Cohort Studies , CpG Islands/genetics , Genetic Enhancer Elements/genetics , Female , Genome-Wide Association Study , Genotype , Humans , Infant , Newborn Infant , Male , Middle Aged , Phenotype , Genetic Promoter Regions/genetics , Young Adult
20.
Front Immunol ; 11: 2136, 2020.
Article in English | MEDLINE | ID: mdl-33072076

ABSTRACT

An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody-mediated processes. Due to locus complexity, standard high-throughput approaches have failed to accurately and comprehensively capture IGH polymorphism. As a result, the locus has only been fully characterized two times, severely limiting our knowledge of human IGH diversity. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize IGH variation in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, identifying 2 novel structural variants and 15 novel IGH alleles. We show multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a desperately needed foundation for leveraging IG genomic data to study population-level variation in antibody-mediated immunity, critical for bettering our understanding of disease risk, and responses to vaccines and therapeutics.


Subjects
Computational Biology/methods , Immunoglobulin Genes , Genetic Variation , Genotyping Techniques , Haplotypes/genetics , Immunoglobulin Heavy Chains/genetics , Genetic Polymorphism , Cell Line , Data Display , Datasets as Topic , Family , Gene Library , Genomic Structural Variation , Humans , Molecular Sequence Annotation , Sequence Alignment , DNA Sequence Analysis , Nucleic Acid Sequence Homology , User-Computer Interface , Workflow