Results 1 - 20 of 40
1.
Int J Mol Sci ; 25(9)2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38731932

ABSTRACT

The serious drawback underlying the biological annotation of whole-genome sequence data is the p >> n problem, which means that the number of polymorphic variants (p) is much larger than the number of available phenotypic records (n). We propose a way to circumvent the problem by combining a LASSO logistic regression with deep learning to classify cows as susceptible or resistant to mastitis, based on single nucleotide polymorphism (SNP) genotypes. Among several architectures, the one with 204,642 SNPs was selected as the best. This architecture was composed of two layers with, respectively, 7 and 46 units per layer implementing respective drop-out rates of 0.210 and 0.358. The classification of the test data resulted in AUC = 0.750, accuracy = 0.650, sensitivity = 0.600, and specificity = 0.700. Significant SNPs were selected based on the SHapley Additive exPlanation (SHAP). As a final result, one GO term related to the biological process and thirteen GO terms related to molecular function were significantly enriched in the gene set that corresponded to the significant SNPs. Our findings revealed that the optimal approach can correctly predict susceptibility or resistance status for approximately 65% of cows. Genes marked by the most significant SNPs are related to the immune response and protein synthesis.


Subjects
Deep Learning , Bovine Mastitis , Single Nucleotide Polymorphism , Whole Genome Sequencing , Cattle , Bovine Mastitis/genetics , Animals , Female , Whole Genome Sequencing/methods , Genetic Predisposition to Disease , Genotype
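
The reported sensitivity (0.600), specificity (0.700) and accuracy (0.650) are mutually consistent if the test set contained equal numbers of susceptible and resistant cows, which the abstract does not state. A minimal Python sketch under that assumption, using hypothetical confusion-matrix counts (100 cows per class, chosen only for illustration and not taken from the paper):

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of susceptible cows detected
    specificity = tn / (tn + fp)          # share of resistant cows detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 susceptible and 100 resistant cows (an assumption).
sens, spec, acc = classification_metrics(tp=60, fn=40, tn=70, fp=30)
```

With balanced classes, accuracy is simply the mean of sensitivity and specificity, which is why 0.600 and 0.700 give 0.650.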
2.
NAR Genom Bioinform ; 6(2): lqae040, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38686136

ABSTRACT

This study compared computational approaches to parallelization of an SNP calling workflow. The data comprised DNA from five Holstein-Friesian cows sequenced with the Illumina platform. The pipeline consisted of quality control, alignment to the reference genome, post-alignment, and SNP calling. Three approaches to parallelization were compared: (i) a plain Bash script in which a pipeline for each cow was executed as separate processes invoked at the same time, (ii) a Bash script wrapped in a single Nextflow process and (iii) a Nextflow script with each component of the pipeline defined as a separate process. The results demonstrated that on average, the multi-process Nextflow script performed 15-27% faster depending on the number of assigned threads, with the biggest execution time advantage over the plain Bash approach observed with 10 threads. In terms of RAM usage, the most substantial variation was observed for the multi-process Nextflow, for which it increased with the number of assigned threads, while RAM consumption of the other setups did not depend much on the number of threads assigned for computations. Due to intermediate and log files generated, disk usage was markedly higher for the multi-process Nextflow than for the plain Bash and for the single-process Nextflow.

3.
J Appl Genet ; 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539022

ABSTRACT

Recently, numerous studies of long non-coding RNAs (lncRNAs) have been carried out in various tissues, but their variability is still not fully understood. In this study, we characterised the inter-individual variability of lncRNAs in pigs in terms of number, length and expression. Transcriptomes collected from muscle tissue of six Polish Landrace boars (PL1-PL6), including half-brothers (PL1-PL3), were investigated using bioinformatics (lncRNA identification and functional analysis) and statistical (lncRNA variability) methods. The number of lncRNAs ranged from 1289 to 3500 per animal, and 232 lncRNAs were common to all boars. The number, length and expression of lncRNAs varied significantly between individuals, and no consistent pattern was found between pairs of half-brothers. In detail, PL5 exhibited lower expression than the others, while PL4 had significantly higher expression than PL2-PL3 and PL5-PL6. Notably, the inter-individual variability of lncRNA and mRNA expression showed concordant patterns. The enrichment analysis of common lncRNA target genes identified a variety of biological processes that play fundamental roles in cell biology, mostly related to whole-body homeostasis maintenance, energy and protein synthesis, and the dynamics of multiple nucleoprotein complexes. This study revealed high variability of the lncRNA landscape in the porcine genome. Inter-individual differences were found in three aspects: the number, length and expression of lncRNAs, which contributes to a better understanding of their complex nature.

4.
Pol Arch Intern Med ; 134(3)2024 03 27.
Article in English | MEDLINE | ID: mdl-38165228

ABSTRACT

INTRODUCTION: Genome sequencing technologies reveal molecular mechanisms of differentiated thyroid cancer (DTC). Unlike somatic mutation analysis from thyroidectomy samples, germline mutations showing genetic susceptibility to DTC are less understood. OBJECTIVES: The study aimed to assess the prevalence of germline mutations predisposing to DTC in a cohort of Polish individuals based on their whole genome sequencing data. PATIENTS AND METHODS: We analyzed sequencing data from 1076 unrelated individuals totaling over 1018 billion read pairs and yielding an average 35.26 × read depth per genome, released openly for academic and clinical research as the Thousand Polish Genomes database (https://1000polishgenomes.com). The list of genes chosen for further analysis was based on the review of previous studies. RESULTS: The cohort contained 104 variants located within the coding and noncoding DNA sequences of 90 genes selected by ClinVar classification as pathogenic and potentially pathogenic. The frequency of variants in the Polish cohort was compared with the frequency estimated for the non-Finnish European population obtained from the gnomAD database (gnomad.broadinstitute.org). Significant differences in variant frequency were found for the APC, ARSB, ATM, BRCA1, CHEK2, DICER1, GPD1L, INSR, KCNJ10, MYH9, PALB2, PLCB1, PLEKHG5, PTEN, RET, SEC23B, SERPINA1, SLC26A4, SMAD3, STK11, TERT, TOE1, and WRN genes. CONCLUSIONS: Even though the Polish population is genetically similar to the other European populations, there are significant differences in variant frequencies contributing to the disease development and progression, such as those in the RET, CHEK2, BRCA1, SLC26A4, or TERT genes. Further studies are needed to identify genomic variants associated directly with DTC.


Subjects
Adenocarcinoma , Thyroid Neoplasms , Humans , Poland , Genetic Predisposition to Disease , Germ-Line Mutation , Ribonuclease III/genetics , DEAD-box RNA Helicases/genetics , Nuclear Proteins/genetics
5.
Genet Sel Evol ; 55(1): 82, 2023 Nov 23.
Article in English | MEDLINE | ID: mdl-37996798

ABSTRACT

BACKGROUND: The single-step model is becoming increasingly popular for national genetic evaluations of dairy cattle due to the benefits that it offers such as joint breeding value estimation for genotyped and ungenotyped animals. However, the complexity of the model due to a large number of correlated effects can lead to significant computational challenges, especially in terms of accuracy and efficiency of the preconditioned conjugate gradient method used for the estimation. The aim of this study was to investigate the effect of pedigree depth on the model's overall convergence rate as well as on the convergence of different components of the model, in the context of the single-step single nucleotide polymorphism best linear unbiased prediction (SNP-BLUP) model. RESULTS: The results demonstrate that the dataset with a truncated pedigree converged twice as fast as the full dataset. Still, both datasets showed very high Pearson correlations between predicted breeding values. In addition, by comparing the top 50 bulls between the two datasets we found a high correlation between their rankings. We also analysed the specific convergence patterns underlying different animal groups and model effects, which revealed heterogeneity in convergence behaviour. Effects of SNPs converged the fastest while those of genetic groups converged the slowest, which reflects the difference in information content available in the dataset for those effects. Pre-selection criteria for the SNP set based on minor allele frequency had no impact on either the rate or pattern of their convergence. Among different groups of individuals, genotyped animals with phenotype data converged the fastest, while non-genotyped animals without own records required the largest number of iterations. CONCLUSIONS: We conclude that pedigree structure markedly impacts the convergence rate of the optimisation which is more efficient for the truncated than for the full dataset.


Subjects
Genomics , Single Nucleotide Polymorphism , Humans , Male , Cattle/genetics , Animals , Genomics/methods , Genetic Models , Genotype , Phenotype , Pedigree
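
For readers unfamiliar with the preconditioned conjugate gradient (PCG) method named in the abstract, the following is a minimal pure-Python sketch for a small dense symmetric positive-definite system. The Jacobi (diagonal) preconditioner, the function name `pcg` and the 2 × 2 example are assumptions made for illustration; the abstract does not say which preconditioner or software the authors used, and real single-step SNP-BLUP systems are far larger and sparse.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (list of lists).

    Uses a Jacobi (diagonal) preconditioner, chosen here only for simplicity.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    m_inv = [1.0 / A[i][i] for i in range(n)]    # inverse of diag(A)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x at x = 0
    z = [m_inv[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if dot(r, r) ** 0.5 < tol:               # converged on residual norm
            return x, iteration
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Small illustrative system (hypothetical values).
solution, n_iter = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The iteration count returned by such a solver is the quantity the study compares between the truncated-pedigree and full datasets.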
6.
Front Oncol ; 13: 1045817, 2023.
Article in English | MEDLINE | ID: mdl-36845707

ABSTRACT

Introduction: Population-based cancer screening has raised many controversies in recent years, regarding not only the costs but also its ethical aspects and issues related to variant interpretation. Currently, genetic cancer screening standards differ between countries and usually encompass only individuals with a personal or family history of relevant cancer. Methods: Here we performed a broad genetic screening for cancer-related rare germline variants on population data from the Thousand Polish Genomes database, based on 1076 unrelated Polish individuals who underwent whole genome sequencing (WGS). Results: We identified 19,551 rare variants in 806 genes related to oncological diseases, 89% of which were located in non-coding regions. The combined allele frequency of BRCA1/BRCA2 variants classified as pathogenic or likely pathogenic in ClinVar was 0.42% in the unselected population of 1076 Poles, corresponding to nine carriers. Discussion: Altogether, on the population level, we found the assessment of variant pathogenicity and the relation of ACMG guidelines to population frequency especially problematic. Some of the variants may be overinterpreted as disease-causing due to their rarity or lack of annotation in the databases. On the other hand, some relevant variants may have been overlooked, given that there is little pooled population whole genome data on oncology. Before population WGS screening becomes a standard, further studies are needed to assess the frequency of variants suspected to be pathogenic on the population level, including the reporting of likely benign variants.
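
The relation between the nine carriers and the 0.42% allele frequency can be checked with a short calculation. The sketch below assumes that each carrier is heterozygous for a single pathogenic or likely pathogenic allele at an autosomal locus, which the abstract does not state explicitly; the function name is illustrative.

```python
def allele_frequency(n_alt_alleles, n_individuals, ploidy=2):
    """Frequency of an allele among all chromosomes sampled in a cohort."""
    return n_alt_alleles / (ploidy * n_individuals)


# Nine heterozygous carriers among 1076 diploid individuals (2152 chromosomes).
freq = allele_frequency(n_alt_alleles=9, n_individuals=1076)
percent = round(100 * freq, 2)
```

Under this assumption, 9 / 2152 rounds to the reported 0.42%.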

7.
Cancers (Basel) ; 15(3)2023 Jan 27.
Article in English | MEDLINE | ID: mdl-36765737

ABSTRACT

The number of cases of pancreatic cancer in Poland in 2019 was 3852 (approx. 2% of all cancers). The course of the disease is very fast, and the average survival time from diagnosis is 6 months. Fewer than 2% of patients live for 5 years from diagnosis, 8% live for 2 years, and almost half live for only about 3 months. A family predisposition to pancreatic cancer occurs in about 10% of cases. Several oncogenes in which somatic changes lead to the development of tumours, including genes BRCA1/2 and PALB2, TP53, CDKN2A, SMAD4, MLL3, TGFBR2, ARID1A and SF3B1, are involved in pancreatic cancer. Between 4% and 10% of individuals with pancreatic cancer will have a mutation in one of these genes. Six percent of patients with pancreatic cancer have a pathogenic NTRK fusion. The pathogenesis of pancreatic cancer can in many cases be characterised by homologous recombination deficiency (HRD), the inability of cells to effectively repair DNA. It is estimated that from 24% to as many as 44% of pancreatic cancers show HRD. The most common causes of HRD are inactivating mutations in the genes regulating this DNA repair system, mainly BRCA1 and BRCA2, but also PALB2, RAD51C and several dozen others.

8.
PLoS One ; 18(1): e0279356, 2023.
Article in English | MEDLINE | ID: mdl-36662838

ABSTRACT

Undoubtedly, genetic factors play an important role in susceptibility and resistance to COVID-19. In this study, we conducted a GWAS analysis. Out of 15,489,173 SNPs, we identified 18,191 significant SNPs for the severe and 11,799 SNPs for the resistant phenotype, showing that a great number of loci were significant in different COVID-19 representations. The majority of variants were synonymous (60.56% for severe, 58.46% for resistant phenotype) or located in introns (55.77% for severe, 59.83% for resistant phenotype). We identified the most significant SNPs for a severe outcome (in an AJAP1 intron) and for COVID resistance (in a FIG4 intron). We found no missense variants with a potential causal function in resistance to COVID-19; however, two missense variants were determined to be significant for a severe phenotype (in PM20D1 and LRP4 exons). None of the aforementioned SNPs and missense variants found in this study have been previously associated with COVID-19.


Subjects
COVID-19 , Genome-Wide Association Study , Humans , COVID-19/genetics , Phenotype , Missense Mutation , Exons , Single Nucleotide Polymorphism , Genetic Predisposition to Disease , Flavoproteins/genetics , Phosphoric Monoester Hydrolases/genetics
9.
BMC Public Health ; 23(1): 148, 2023 01 21.
Article in English | MEDLINE | ID: mdl-36681790

ABSTRACT

BACKGROUND: One of the seminal events since 2019 has been the outbreak of the SARS-CoV-2 pandemic. Countries have adopted various policies to deal with it, but they also differ in their socio-geographical characteristics and public health care facilities. Our study aimed to investigate differences between epidemiological parameters across countries. METHOD: The analysed data came from the SARS-CoV-2 repository provided by Johns Hopkins University. Separately for each country, we estimated recovery and mortality rates using the SIRD model applied to the first 30, 60, 150, and 300 days of the pandemic. Moreover, a mixture of normal distributions was fitted to the number of confirmed cases and deaths during the first 300 days. The estimates of the peaks' means and variances were used to identify countries with outlying parameters. RESULTS: For 300 days, Belgium, Cyprus, France, the Netherlands, Serbia, and the UK were classified as outliers by all three outlier detection methods. Yemen was classified as an outlier for each of the four considered timeframes, due to high mortality rates. During the first 300 days of the pandemic, the majority of countries underwent three peaks in the number of confirmed cases, except Australia and Kazakhstan with two peaks. CONCLUSIONS: Considering recovery and mortality rates, we observed heterogeneity between countries. Liechtenstein was the "positive" outlier with low mortality rates and high recovery rates; conversely, Yemen represented a "negative" outlier with high mortality for all four considered periods and low recovery for 30 and 60 days.


Subjects
COVID-19 , SARS-CoV-2 , Humans , COVID-19/epidemiology , Pandemics , Disease Outbreaks , France
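
The SIRD model mentioned in the abstract divides a population into susceptible (S), infected (I), recovered (R) and deceased (D) compartments, with a recovery rate and a mortality rate governing the flows out of I. The sketch below is a forward simulation with one-day Euler steps, not the estimation procedure used in the study; all parameter values and the function name are hypothetical, and the total population (including deaths) is used as the denominator of the infection term for simplicity.

```python
def simulate_sird(beta, gamma, mu, s0, i0, days, r0=0.0, d0=0.0):
    """Simulate a SIRD model with daily Euler steps.

    beta: transmission rate, gamma: recovery rate, mu: mortality rate (per day).
    Returns a list of (S, I, R, D) tuples, one per day including day 0.
    """
    s, i, r, d = float(s0), float(i0), float(r0), float(d0)
    n = s + i + r + d                      # total population, kept constant
    trajectory = [(s, i, r, d)]
    for _ in range(days):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        new_deaths = mu * i
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        trajectory.append((s, i, r, d))
    return trajectory


# Hypothetical parameters for a population of 10,000 over 300 days.
trajectory = simulate_sird(beta=0.3, gamma=0.1, mu=0.01, s0=9990, i0=10, days=300)
```

In an estimation setting, gamma and mu would be fitted to the observed recovered and deceased counts for each country and time window rather than fixed in advance.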
10.
Genet Sel Evol ; 54(1): 80, 2022 Dec 16.
Article in English | MEDLINE | ID: mdl-36526979

ABSTRACT

Genome-wide association studies (GWAS) help identify polymorphic sites or genes linked to phenotypic variance, but a few identified genes and/or single nucleotide polymorphisms (SNPs) are unlikely to explain a large part of the phenotypic variability of complex traits. In this study, the focus was moved from single loci to functional units, expressed by the metabolic pathways as defined in the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. Consequently, the aim of this study was to estimate KEGG effects on stature in three Nordic dairy cattle breeds using SNP effects from GWAS as the dependent variable. The SNPs were annotated to genes, then the genes to KEGG pathways. The effects of KEGG pathways were estimated separately for each breed using a mixed linear model incorporating the similarity between pathways expressed by common genes. The KEGG pathway D-amino acid metabolism (map00473) was estimated to be significant for stature in two of the analysed breeds and revealed a borderline significance in the third breed. Thus, we demonstrate that the approach to statistical modelling of higher order functional effects on complex traits is useful, and provides evidence of the importance of D-amino acids for growth in cattle.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Cattle/genetics , Animals , Genome-Wide Association Study/veterinary , Linear Models , Quantitative Trait Loci , Multifactorial Inheritance
11.
Front Microbiol ; 13: 998093, 2022.
Article in English | MEDLINE | ID: mdl-36504790

ABSTRACT

Climate change affects animal physiology. In particular, rising ambient temperatures reduce animal vitality due to heat stress, and this can be observed at various levels, including the genome, transcriptome, and microbiome. In a previous study, microbiota highly associated with changes in cattle physiology, which included rectal temperature, drooling score and respiratory score, were identified under heat stress conditions. In the present study, genes differentially expressed between individuals representing different additive genetic effects toward the heat stress response were selected in cattle in their production condition. Moreover, a correlation network analysis was performed to identify interactions between the transcriptome and microbiome for 71 Chinese Holstein cows sequenced for mRNA from blood samples and for 16S rRNA genes from fecal samples. Bioinformatics analysis comprised: i) clustering and classification of 16S rRNA sequence reads, ii) mapping cows' transcripts to the reference genome and their expression quantification, and iii) statistical analysis of both data types, including differential gene expression analysis and gene set enrichment analysis. A weighted co-expression network analysis was carried out to assess changes in the association between gene expression and microbiota abundance as well as to find hub genes/microbiota responsible for the regulation of gene expression under heat stress. In total, 1,851 differentially expressed genes were shared by the three heat stress phenotypes. These genes were predominantly associated with the cytokine-cytokine receptor interaction pathway. The interaction analysis revealed three modules of genes and microbiota associated with rectal temperature, in which two hubs were bacterial species, demonstrating the importance of the microbiome in the regulation of gene expression during heat stress. Genes and microbiota from the significant modules can be used as biomarkers of heat stress in cattle.

12.
Funct Integr Genomics ; 23(1): 19, 2022 Dec 23.
Article in English | MEDLINE | ID: mdl-36564645

ABSTRACT

Since copy number variants (CNVs) have been recognized as an important source of genetic and transcriptomic variation, we aimed to characterize the impact of CNVs located within coding, intergenic, upstream, and downstream gene regions on the expression of transcripts. Deletions occurred most often in introns, while duplications occurred most often in coding regions. The transcript expression was lower for deleted coding (P = 0.008) and intronic regions (P = 1.355 × 10⁻¹⁰), but it was not changed in the case of upstream and downstream gene regions (P = 0.085). Moreover, the expression was decreased if a duplication occurred in the coding region (P = 8.318 × 10⁻⁵). Furthermore, a negative correlation (r = -0.27) between transcript length and its expression was observed. The correlation between the percent of deleted/duplicated transcript and transcript expression level was not significant for all concerned genomic regions in five out of six animals. The exceptions were deletions in coding regions (P = 0.004) and duplications in introns (P = 0.01) in one individual. CNVs in coding (deletions, duplications) and intronic (deletions) regions are important modulators of transcripts by reducing their expression level. We hypothesize that deletions imply severe consequences by interrupting genes. The negative correlation between the size of the transcript and its expression level found in this study is consistent with the hypothesis that selection favours shorter introns and a moderate number of exons in highly expressed genes. This may explain the transcript expression reduction by duplications. We did not find a correlation between the size of deletions/duplications and transcript expression level, suggesting that expression is modulated by CNVs regardless of their size.


Subjects
DNA Copy Number Variations , Genome , Animals , Genomics , Introns , Exons
13.
Int J Mol Sci ; 23(15)2022 Aug 04.
Article in English | MEDLINE | ID: mdl-35955824

ABSTRACT

Background: Severe outcomes of COVID-19 account for up to 15% of all cases. The study aims to check if any gene variants related to cardiovascular (CVD) and pulmonary diseases (PD) are correlated with a severe outcome of COVID-19 in a Polish cohort of COVID-19 patients. Methods: In this study, a subset of 747 samples from unrelated individuals collected across Poland in 2020 and 2021 was used and whole-genome sequencing was performed. Results: The GWAS analysis of SNPs and short indels located in genes related to CVD identified one variant significant in COVID-19 severe outcome in the HADHA gene, while for the PD gene panel, we found two significant variants in the DRC1 gene. In this study, both potentially protective and risk variants were identified, of which variants in the HADHA gene deserve the most attention. Conclusions: This is the first study reporting the association between the HADHA and DRC1 genetic variants and COVID-19 severe outcome based on the cohort WGS analysis. Although all the identified variants are localised in introns, they may be correlated and therefore inherited along with other risk variants, potentially causative to severe outcome of COVID-19 but not discovered yet.


Subjects
COVID-19 , Cardiovascular Diseases , COVID-19/genetics , Cardiovascular Diseases/genetics , Genome-Wide Association Study , Humans , INDEL Mutation , Lung , Single Nucleotide Polymorphism
14.
BMC Microbiol ; 22(1): 171, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35790909

ABSTRACT

BACKGROUND: Humans have been influencing climate change by burning fossil fuels, farming livestock, and cutting down rainforests, which has led to a rise in global temperature. Global warming affects animals by causing heat stress, which negatively affects their health, biological functions, and reproduction. On the molecular level, it has been shown that heat stress changes the expression level of genes and therefore causes changes in the proteome and metabolome. Many studies have underlined the importance of the microbiome, which is considered an individual's "second genome". Physiological changes caused by heat stress may impact the microbiome composition. RESULTS: In this study, we identified fecal microbiota associated with heat stress that was quantified by three metrics - rectal temperature, drooling, and respiratory scores represented by their Estimated Breeding Values. We analyzed the microbiota from 136 fecal samples of Chinese Holstein cows through a 16S rRNA gene sequencing approach. Statistical modeling was performed using a negative binomial regression. The analysis revealed a total of 24 genera and 12 phyla associated with heat stress metrics. Rhizobium and Pseudobutyrivibrio turned out to be the most significant genera, while Acidobacteria and Gemmatimonadetes were the most significant phyla. Phylogenetic analysis revealed that the three heat stress indicators quantify different metabolic ways of animals' reaction to heat stress. Other studies already identified that those genera had significantly increased abundance in mice exposed to stressor-induced changes. CONCLUSIONS: This study provides insights into the analysis of microbiome composition in cattle using heat stress measured as a continuous variable. The bacteria highly associated with heat stress were highlighted and can be used as biomarkers in further microbiological studies.


Subjects
Biodiversity , Microbiota , Animals , Cattle , Female , Heat-Shock Response , Mice , Phylogeny , 16S Ribosomal RNA/genetics , Temperature
15.
Int J Mol Sci ; 23(11)2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35682950

ABSTRACT

COVID-19 infections pose a serious global health concern so it is crucial to identify the biomarkers for the susceptibility to and resistance against this disease that could help in a rapid risk assessment and reliable decisions being made on patients' treatment and their potential hospitalisation. Several studies investigated the factors associated with severe COVID-19 outcomes that can be either environmental, population based, or genetic. It was demonstrated that the genetics of the host plays an important role in the various immune responses and, therefore, there are different clinical presentations of COVID-19 infection. In this study, we aimed to use variant descriptive statistics from GWAS (Genome-Wide Association Study) and variant genomic annotations to identify metabolic pathways that are associated with a severe COVID-19 infection as well as pathways related to resistance to COVID-19. For this purpose, we applied a custom-designed mixed linear model implemented into custom-written software. Our analysis of more than 12.5 million SNPs did not indicate any pathway that was significant for a severe COVID-19 infection. However, the Allograft rejection pathway (hsa05330) was significant (p = 0.01087) for resistance to the infection. The majority of the 27 SNP marking genes constituting the Allograft rejection pathway were located on chromosome 6 (19 SNPs) and the remainder were mapped to chromosomes 2, 3, 10, 12, 20, and X. This pathway comprises several immune system components crucial for the self versus non-self recognition, but also the components of antiviral immunity. Our study demonstrated that not only single variants are important for resistance to COVID-19, but also the cumulative impact of several SNPs within the same pathway matters.


Subjects
COVID-19 , Genome-Wide Association Study , Allografts , COVID-19/genetics , Genetic Predisposition to Disease , Humans , Innate Immunity , Single Nucleotide Polymorphism
16.
Animals (Basel) ; 12(9)2022 Apr 27.
Article in English | MEDLINE | ID: mdl-35565554

ABSTRACT

The goal of our study was to identify the SNPs, metabolic pathways (KEGG), and gene ontology (GO) terms significantly associated with calving and workability traits in dairy cattle. We analysed direct (DCE) and maternal (MCE) calving ease, direct (DSB) and maternal (MSB) stillbirth, milking speed (MSP), and temperament (TEM) based on a Holstein-Friesian dairy cattle population consisting of 35,203 individuals. The number of animals, depending on the trait, ranged from 22,301 bulls for TEM to 30,603 for DCE. We estimated the SNP effects (based on 46,216 polymorphisms from Illumina BovineSNP50 BeadChip Version 2) using a multi-SNP mixed model. The SNP positions were mapped to genes and the GO terms/KEGG pathways of the corresponding genes were assigned. The estimation of the GO term/KEGG pathway effects was based on a mixed model using the SNP effects as dependent variables. The number of significant SNPs comprised 59 for DCE, 25 for DSB and MSP, 17 for MCE and MSB, and 7 for TEM. Significant KEGG pathways were found for MSB (2), TEM (2), and MSP (1) and 11 GO terms were significant for MSP, 10 for DCE, 8 for DSB and TEM, 5 for MCE, and 3 for MSB. From the perspective of a better understanding of the genomic background of the phenotypes, traits with low heritabilities suggest that the focus should be moved from single genes to the metabolic pathways or gene ontologies significant for the phenotype.

17.
Int J Mol Sci ; 23(9)2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35562925

ABSTRACT

Although Slavic populations account for over 4.5% of world inhabitants, no centralised, open-source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for clinical genetics, biomedical research, as well as archeological and historical studies. The Polish population, which is homogenous and sedentary in its nature but influenced by many migrations of the past, is unique and could serve as a genetic reference for the Slavic nations. In this study, we analysed whole genomes of 1222 Poles to identify and genotype a wide spectrum of genomic variation, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups, and de novo variants. Common variant analyses showed that the Polish cohort is highly homogenous and shares ancestry with other European populations. In rare variant analyses, we identified 32 autosomal-recessive genes with significantly different frequencies of pathogenic alleles in the Polish population as compared to the non-Finnish Europeans, including C2, TGM5, NUP93, C19orf12, and PROP1. The allele frequencies for small and structural variants, calculated for 1076 unrelated individuals, are released publicly as The Thousand Polish Genomes database, and will contribute to the worldwide genomic resources available to researchers and clinicians.


Subjects
Population Genetics , Human Genome , Alleles , Gene Frequency , Humans , Mitochondrial Proteins , Poland
18.
Sci Rep ; 12(1): 7671, 2022 05 10.
Article in English | MEDLINE | ID: mdl-35538164

ABSTRACT

Since global temperature is expected to rise by 2 °C in 2050 heat stress may become the most severe environmental factor. In the study, we illustrate the application of mixed linear models for the analysis of whole transcriptome expression in livers and adrenal tissues of Sprague-Dawley rats obtained by a heat stress experiment. By applying those models, we considered four sources of variation in transcript expression, comprising transcripts (1), genes (2), Gene Ontology terms (3), and Reactome pathways (4) and focussed on accounting for the similarity within each source, which was expressed as a covariance matrix. Models based on transcripts or genes levels explained a larger proportion of log2 fold change than models fitting the functional components of Gene Ontology terms or Reactome pathways. In the liver, among the most significant genes were PNKD and TRIP12. In the adrenal tissue, one transcript of the SUCO gene was expressed more strongly in the control group than in the heat-stress group. PLEC had two transcripts, which were significantly overexpressed in the heat-stress group. PER3 was significant only on gene level. Moving to the functional scale, five Gene Ontologies and one Reactome pathway were significant in the liver. They can be grouped into ontologies related to DNA repair, histone ubiquitination, the regulation of embryonic development and cytoplasmic translation. Linear mixed models are valuable tools for the analysis of high-throughput biological data. Their main advantages are the possibility to incorporate information on covariance between observations and circumventing the problem of multiple testing.


Subjects
Gene Expression Profiling , Heat Stress Disorders , Animals , Biodiversity , Heat-Shock Response/genetics , Linear Models , Rats , Sprague-Dawley Rats , Temperature , Transcriptome
19.
J Appl Genet ; 63(3): 527-533, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35590085

ABSTRACT

Copy number variants (CNVs) may cover up to 12% of the whole genome and have a substantial impact on phenotypes. We used 5867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characteristic of those regions. The GC content was lower for deletions and higher for duplications than for randomly selected regions. In regions flanking deletions and downstream of duplications, GC content was higher than in the random sequences, but upstream of duplications it was lower. In duplications and in regions downstream of deletions, the percentage of low-complexity sequences did not differ from the randomised data. In deletions and upstream of CNVs, it was higher, while downstream of duplications it was lower than in random sequences. The majority of CNVs intersected with genic regions, mainly with introns. GC content may be associated with CNV formation, and CNVs, especially duplications, are initiated in low-complexity regions. Moreover, CNVs located within or overlapping introns indicate their role in shaping intron variability. Genic CNV regions were enriched in many essential biological processes such as cell adhesion, synaptic transmission, transport, cytoskeleton organization, immune response, and metabolic mechanisms, which indicates that these large-scale variants play important biological roles.


Subjects
DNA Copy Number Variations , Genome-Wide Association Study , Base Sequence , DNA Copy Number Variations/genetics , Genome , Genomics , Humans
20.
Poult Sci ; 100(11): 101433, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34551372

ABSTRACT

Our study aimed to identify single nucleotide polymorphisms (SNPs) with a significant impact on innate immunity, represented by the antibody response against lipopolysaccharide (LPS) and lipoteichoic acid (LTA), and on the adaptive immune response, represented by the antibody response toward keyhole limpet hemocyanin (KLH), using the SNP prioritization method. The data set consisted of 288 F2 experimental individuals, created by crossing Green-legged Partridgelike and White Leghorn. The analyzed SNPs were located within 24 short genomic regions of GGA1, GGA2, GGA3, GGA4, GGA9, GGA10, GGA14, GGA18, and GGZ, pre-targeted based on literature references and database information. For the specific antibody response toward KLH at d 0, the most highly prioritized SNP for additive and dominance effects was located on GGA2 in the 3'UTR of MYD88. For the response at d 7, the most highly prioritized SNP pointed at the 3'UTR of MYD88, but potential causal additive variants were located within ADIPOQ and one in PROCR. The highest priority for additive and dominance effects in the antibody response toward lipoteichoic acid at d 0 was attributed to the same SNP, located on GGA2 in the 3'UTR of MYD88. Two SNPs among the top 10 for additive effect were located in the exon of NOCT. SNPs selected for their additive effect on the antibody response toward lipopolysaccharide at d 0 marked 3 genes (NOCT, MYD88, and SNX8), while SNPs selected for their dominance effect marked NOCT, ADIPOQ, and MYD88. The top 10 variants identified in our study were located in different functional parts of the genome. In the context of causality, three groups can be distinguished: variants located in exons of protein-coding genes (ADIPOQ, NOCT, PROCR, SNX8), variants within exons of non-coding transcripts, and variants located in the UTRs of genes. Variants from the first group influence protein structure, and variants from the latter two groups exhibit regulatory roles on DNA (UTR) or RNA (lncRNA).


Subjects
Chickens , Humoral Immunity , Adaptive Immunity , Animals , Antibody Formation , Chickens/genetics , Humoral Immunity/genetics , Single Nucleotide Polymorphism