Results 1 - 20 of 34
1.
Eur J Sport Sci ; 23(10): 2098-2108, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36680346

ABSTRACT

We developed a Biomedical Knowledge Graph model that is phenotype and biological function-aware through integrating knowledge from multiple domains in a Neo4j graph database. All known human genes were assessed through the model to identify potential new risk genes for anterior cruciate ligament (ACL) ruptures and Achilles tendinopathy (AT). Genes were prioritised and explored in a case-control study comparing participants with ACL ruptures (ACL-R), including a sub-group with non-contact mechanism injuries (ACL-NON), to uninjured control individuals (CON). After gene filtering, 3376 genes, including 411 genes identified through previous whole exome sequencing, were found to be potentially linked to AT and ACL ruptures. Four variants were prioritised: HSPG2:rs2291826A/G, HSPG2:rs2291827G/A, ITGB2:rs2230528C/T and FGF9:rs2274296C/T. The rs2230528 CC genotype was over-represented in the CON group compared to the ACL-R (p < 0.001) and ACL-NON (p < 0.001) groups, and the TT genotype and T allele were over-represented in the ACL-R and ACL-NON groups compared to the CON group (p < 0.001). Several significant differences in distributions were noted for the gene-gene interactions: (HSPG2:rs2291826, rs2291827 and ITGB2:rs2230528) and (ITGB2:rs2230528 and FGF9:rs2297429). This study substantiates the efficiency of using a prior knowledge-driven in silico approach to identify candidate genes linked to tendon and ACL injuries. Our biomedical knowledge graph identified and, with further testing, highlighted novel associations with ACL rupture risk of the ITGB2 gene, which has not been explored in a genetic case-control association study. 
We thus recommend a multistep approach including bioinformatics in conjunction with next generation sequencing technology to improve the discovery potential of genomics technologies in musculoskeletal soft tissue injuries.

Highlights

A biomedical knowledge graph was modelled for musculoskeletal soft tissue injuries to efficiently identify candidate genes for genetic susceptibility analyses.
The biomedical knowledge graph and sequencing data identified potential biologically relevant variants to explore susceptibility to common tendon and ligament injuries. Specifically, genetic variants within the ITGB2 and FGF9 genes were associated with ACL risk.
Novel allele combinations (HSPG2-ITGB2 and ITGB2-FGF9) showcase the potential effect of ITGB2 in influencing risk of ACL rupture.


Subject(s)
Achilles Tendon , Anterior Cruciate Ligament Injuries , Tendinopathy , Humans , Anterior Cruciate Ligament Injuries/genetics , Anterior Cruciate Ligament , Genetic Predisposition to Disease , Case-Control Studies , Tendinopathy/genetics , Genetic Loci , Rupture/genetics , Fibroblast Growth Factor 9/genetics
2.
Leuk Lymphoma ; 63(8): 1897-1906, 2022 08.
Article in English | MEDLINE | ID: mdl-35249471

ABSTRACT

Chromosomal translocations and gene mutations are characteristics of the genomic profile of acute myeloid leukemia (AML). We aim to identify a gene signature associated with poor prognosis in AML patients with FLT3-ITD compared to AML patients with NPM1/CEBPA mutations. RNA-sequencing (RNA-Seq) count data were downloaded from the UCSC Xena browser. Samples were grouped by their mutation status into high and low-risk groups. Differential gene expression (DGE), machine learning (ML) and survival analyses were performed. A total of 471 differentially expressed genes (DEGs) were identified, of which 16 DEGs were used as features for the prediction of mutation status. An accuracy of 92% was obtained from the ML model. FHL1, SPNS3, and MPZL2 were found to be associated with overall survival in FLT3-ITD samples. FLT3-ITD mutation confers an indicative gene expression profile different from NPM1/CEBPA mutation, and the expression of FHL1, SPNS3, and MPZL2 can serve as prognostic indicators of unfavorable disease.


Subject(s)
Leukemia, Myeloid, Acute , Nuclear Proteins , Child , Humans , Cell Adhesion Molecules/genetics , fms-Like Tyrosine Kinase 3/genetics , Intracellular Signaling Peptides and Proteins , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/genetics , LIM Domain Proteins/genetics , Muscle Proteins/genetics , Mutation , Nuclear Proteins/genetics , Nucleophosmin , Prognosis , Up-Regulation
3.
Future Oncol ; 17(34): 4769-4783, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34751044

ABSTRACT

Background: Neuroblastoma is the most common extracranial solid tumor in childhood. Amplification of MYCN in neuroblastoma is a predictor of poor prognosis. Materials and methods: DNA methylation data from the TARGET data matrix were stratified into MYCN amplified and non-amplified groups. Differential methylation analysis, clustering, recursive feature elimination (RFE), machine learning (ML), Cox regression analysis and Kaplan-Meier estimates were performed. Results and Conclusion: 663 CpGs were differentially methylated between the two groups. A total of 25 CpGs were selected by RFE for clustering and ML, and a 100% clustering accuracy was obtained. ML validation on three external datasets produced high accuracy scores of 100%, 97% and 93%. Eight survival-associated CpGs were also identified. Therapeutic interventions may need to be targeted to patient subgroups.


Lay abstract: Neuroblastoma is the most common extracranial solid tumor in childhood. Elevated levels of the MYCN protein in neuroblastoma are a predictor of poor prognosis. MYCN is the most relevant prognostic factor in neuroblastoma, and predicting MYCN gene amplification (which leads to increased gene expression and more protein) from epigenetic data rather than genetic testing might be useful in the oncology clinic. This study was designed to identify a DNA methylation (epigenetic) signature that can be used to diagnose MYCN amplification without actually testing for the gene. The authors also aimed to correlate this DNA methylation signature with patient survival and poorer prognosis. Based on statistical and computational methods applied to DNA methylation data for neuroblastoma, signatures that are predictive of MYCN amplification and poor prognosis were found, which clinicians can use for early patient diagnosis and selection of the best therapies for patients at high risk.


Subject(s)
Biomarkers, Tumor/genetics , DNA Methylation , Epigenesis, Genetic , N-Myc Proto-Oncogene Protein/genetics , Neuroblastoma/mortality , Child , CpG Islands/genetics , Datasets as Topic , Gene Amplification , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Machine Learning , Neuroblastoma/genetics , Prognosis , Progression-Free Survival , Risk Assessment/methods
4.
Oncotarget ; 11(46): 4293-4305, 2020 Nov 17.
Article in English | MEDLINE | ID: mdl-33245713

ABSTRACT

Neuroblastoma is the most common extracranial solid tumor in childhood. Patients in the high-risk group often have poor outcomes with low survival rates despite several treatment options. This study aimed to identify a genetic signature from gene expression profiles that can serve as a prognostic indicator of survival time in patients with high-risk neuroblastoma, and that could provide potential therapeutic targets. RNA-seq count data were downloaded from the UCSC Xena browser and samples were grouped into Short Survival (SS) and Long Survival (LS) groups. Differential gene expression (DGE) analysis, enrichment analyses, regulatory network analysis and machine learning (ML) prediction of survival group were performed. Forty differentially expressed genes (DEGs) were identified, including genes involved in molecular function activities essential for tumor proliferation. DEGs used as features for prediction of survival groups included EVX2, NHLH2, PRSS12, POU6F2, HOXD10, MAPK15, RTL1, LGR5, CYP17A1, OR10AB1P, MYH14, LRRTM3, GRIN3A, HS3ST5, CRYAB and NXPH3. An accuracy score of 82% was obtained by the ML classification models. SMIM28 was revealed to possibly have a role in tumor proliferation and aggressiveness. Our results indicate that these DEGs can serve as prognostic indicators of survival in high-risk neuroblastoma patients and will assist clinicians in making better therapeutic and patient management decisions.

5.
J Orthop Res ; 38(8): 1856-1865, 2020 08.
Article in English | MEDLINE | ID: mdl-31922278

ABSTRACT

Variants within genes encoding structural and regulatory elements of ligaments have been associated with musculoskeletal soft tissue injury risk. The role of intron 4-exon 5 variants within the α1 chain of type V collagen (COL5A1) gene and genes of the transforming growth factor-β (TGF-β) family, TGFBR3 and TGFBI, was investigated on the risk of anterior cruciate ligament (ACL) ruptures. A case-control genetic association study was performed on 210 control participants (CON) and 249 participants with surgically diagnosed ruptures (ACL), of which 147 reported a noncontact mechanism of injury (NON). Whole-exome sequencing data were used to prioritize variants of potential functional relevance. Genotyping for COL5A1 (rs3922912 G>A, rs4841926 C>T, and rs3124299 C>T), TGFBR3 (rs1805113 G>A and rs1805117 T>C), and TGFBI (rs1442 G>C) was performed using TaqMan SNP genotyping assays. Significant overrepresentation of the G allele of TGFBR3 rs1805113 was observed in CON vs ACL (P = .014) and NON groups (P = .021). Similar results were obtained in females for the G allele (CON vs ACL: P = .029; CON vs NON: P = .016). The TGFBI rs1442 CC genotype was overrepresented in the female ACL vs CON groups (P = .013). Associations of inferred allele combinations were observed in line with the above results. The COL5A1 intron 4-exon 5 genomic interval was not associated with the risk of ACL ruptures. Instead, this novel study is the first to use this approach to identify variants within the TGF-β signaling pathway to be implicated in the risk of ACL ruptures. A genetic susceptibility interval was identified to be explored in the context of extracellular matrix remodeling.


Subject(s)
Anterior Cruciate Ligament Injuries/genetics , Collagen Type V/genetics , Extracellular Matrix Proteins/genetics , Proteoglycans/genetics , Receptors, Transforming Growth Factor beta/genetics , Transforming Growth Factor beta/genetics , Adolescent , Adult , Female , Gene Frequency , Genetic Predisposition to Disease , Haplotypes , Humans , Male , Young Adult
6.
BMC Med Genomics ; 12(Suppl 2): 46, 2019 03 13.
Article in English | MEDLINE | ID: mdl-30871540

ABSTRACT

BACKGROUND: The fat mass and obesity-associated (FTO) gene has been under close investigation since a range of publications in 2007 reported its high impact on obesity status. A recent report on its role in adipocytes underscored its molecular and functional mechanics in pathology. Still, the population-specific features of the locus structure have not been examined in detail. METHODS: We analyzed the population-specific haplotype profiles of the FTO genomic locus identified by genome-wide association studies (GWAS) for high obesity risk by examining eighteen 1000G populations from 4 continental groups. The GWAS SNP cluster is located in intron 1 of the FTO gene, spanning around 70 kb. RESULTS: We reconstructed the ancestral state of the locus, which comprised a low-risk major allele found in all populations, and two minor risk-associated alleles, specific to African and European populations, respectively. The locus structure and its allele frequency distribution underscore the high risk allele frequency specifically for the European population. South Asian populations have the second highest frequency of risk alleles, while East Asian populations have the lowest. The African population-specific minor allele was only partially risk-associated. For all of the GWAS SNPs considered, the low-risk alleles are the reference (major) alleles (p > 0.5) in each of the continental groups. Strikingly, rs1421085, recently reported as a causal SNP, was found to be monomorphic in ancestral (African) populations, implying a possible selective sweep in the course of its rapid fixation, as reported previously. CONCLUSION: The observations underscore varying FTO-linked risk in the manifestation of population-specific epidemiology of genetically bound obesity. The results imply that the FTO locus is one of the major genetic determinants of obesity risk among the GWAS SNP set.


Subject(s)
Alpha-Ketoglutarate-Dependent Dioxygenase FTO/genetics , Obesity/pathology , White People/genetics , Alleles , Gene Frequency , Genetics, Population , Genome-Wide Association Study , Haplotypes , Humans , Introns , Obesity/genetics , Polymorphism, Single Nucleotide , Principal Component Analysis , Risk Factors
7.
PLoS One ; 13(10): e0205860, 2018.
Article in English | MEDLINE | ID: mdl-30359423

ABSTRACT

Musculoskeletal soft tissue injuries are complex phenotypes with genetics being one of many proposed risk factors. Case-control association studies using the candidate gene approach have predominantly been used to identify risk loci for these injuries. However, this approach alone is unlikely to identify all risk-conferring variants. Therefore, this study aimed to further define the genetic profile of these injuries using an integrated omics approach involving whole exome sequencing and a customised analysis pipeline. The exomes of ten exemplar asymptomatic controls and ten exemplar cases with Achilles tendinopathy were individually sequenced using a platform that included the coverage of the untranslated regions and miRBase miRNA genes. Approximately 200 000 variants were identified in the sequenced samples. Previous research was used to guide a targeted analysis of the genes encoding the tenascin-C (TNC) glycoprotein and the α1 chain of type XXVII collagen (COL27A1) located on chromosome 9. Selection of variants within these genes was, however, not predetermined but based on a tiered filtering strategy. Four variants in TNC (rs1061494, rs1138545, rs2104772 and rs1061495) and three variants in the upstream COL27A1 gene (rs2567706, rs2241671 and rs2567705) were genotyped in larger Achilles tendinopathy and anterior cruciate ligament (ACL) rupture sample groups. The CC genotype of TNC rs1061494 (C/T) was associated with the risk of Achilles tendinopathy (p = 0.018, OR: 2.5, 95% CI: 1.2-5.1). Furthermore, the AA genotype of the TNC rs2104772 (A/T) variant was significantly associated with ACL ruptures in the female subgroup (p = 0.035, OR: 2.3, 95% CI: 1.1-5.5). An inferred haplotype in the TNC gene was also associated with the risk of Achilles tendinopathy. These results provide a proof of concept for the use of a customised pipeline for the exploration of a larger genomic dataset. 
This approach, using previous research to guide a targeted analysis of the data has generated new genetic signatures in the biology of musculoskeletal soft tissue injuries.
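
The odds ratios and 95% confidence intervals quoted in this abstract are standard summaries of a 2x2 genotype-by-status table. As a minimal sketch, assuming the common Woolf (log) method rather than whatever software the authors actually used, the calculation can be written as below. The function name `odds_ratio_ci` and the counts in the example are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-method) confidence interval for a 2x2 table.

    a: cases carrying the genotype      b: cases not carrying it
    c: controls carrying the genotype   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        # The log method is undefined when any cell is zero or negative.
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
    print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level; exact or regression-based methods may give slightly different bounds for small samples.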


Subject(s)
Achilles Tendon/pathology , Anterior Cruciate Ligament Injuries/genetics , Exome , Fibrillar Collagens/genetics , Tenascin/genetics , Tendinopathy/genetics , Adult , Alleles , Anterior Cruciate Ligament/pathology , Anterior Cruciate Ligament Injuries/pathology , Case-Control Studies , Female , Fibrillar Collagens/blood , Genetic Predisposition to Disease , Genotype , Haplotypes , Humans , Male , Middle Aged , Phenotype , Risk , Rupture/pathology , South Africa , Tenascin/blood , Tendinopathy/pathology , Exome Sequencing
8.
BMC Med Genet ; 19(1): 95, 2018 06 07.
Article in English | MEDLINE | ID: mdl-29879922

ABSTRACT

BACKGROUND: We investigated a South African family of admixed ancestry in which the first generation (G1) developed insidious progressive distal to proximal weakness in their twenties, while their offspring (G2) experienced severe unexpected symptoms of myalgia and cramps since adolescence. Our aim was to identify deleterious mutations that segregate with the affected individuals in this family. METHODS: Exome sequencing was performed on five cases, which included three affected G1 siblings and two pauci-symptomatic G2 offspring. As controls we included an unaffected G1 sibling and a spouse of one of the G1 affected individuals. Homozygous or potentially compound heterozygous variants that were predicted to be functional and segregated with the affected G1 siblings, were further evaluated. Additionally, we considered variants in all genes segregating exclusively with the affected (G1) and pauci-symptomatic (G2) individuals to address the possibility of a pseudo-autosomal dominant inheritance pattern in this family. RESULTS: All affected G1 individuals were homozygous for a novel truncating p.Tyr1433Ter DYSF (dysferlin) mutation, with their asymptomatic sibling and both pauci-symptomatic G2 offspring carrying only a single mutant allele. Sanger sequencing confirmed segregation of the variant. No additional potentially contributing variant was found in the DYSF or any other relevant gene in the pauci-symptomatic carriers. CONCLUSION: Our finding of a truncating dysferlin mutation confirmed dysferlinopathy in this family and we propose that the single mutant allele is the primary contributor to the neuromuscular symptoms seen in the second-generation pauci-symptomatic carriers.


Subject(s)
Dysferlin/genetics , Exome/genetics , Muscular Dystrophies, Limb-Girdle/genetics , Muscular Dystrophies, Limb-Girdle/pathology , Mutation , Neuromuscular Diseases/genetics , Neuromuscular Diseases/pathology , Adolescent , Adult , Female , Follow-Up Studies , Heterozygote , Homozygote , Humans , Male , Middle Aged , Pedigree , Prognosis , Siblings , Exome Sequencing , Young Adult
10.
BMC Cancer ; 18(1): 377, 2018 04 03.
Article in English | MEDLINE | ID: mdl-29614978

ABSTRACT

BACKGROUND: Gene expression can be employed for the discovery of prognostic gene or multigene signatures in cancer. In this study, we assessed the prognostic value of a 35-gene expression signature selected by pathway and machine learning based methods in adjuvant therapy-linked glioblastoma multiforme (GBM) patients from The Cancer Genome Atlas (TCGA). METHODS: Genes with high expression variance were subjected to pathway enrichment analysis and those having roles in chemoradioresistance pathways were used in expression-based feature selection. A modified Support Vector Machine Recursive Feature Elimination algorithm was employed to select a subset of these genes that discriminated between rapidly-progressing and slowly-progressing patients. RESULTS: Survival analysis on TCGA samples not used in feature selection and samples from four GBM subclasses, as well as from an entirely independent study, showed that the 35-gene signature discriminated between the survival groups in all cases (p<0.05) and could accurately predict survival irrespective of the subtype. In a multivariate analysis, the signature predicted progression-free and overall survival independently of other factors considered. CONCLUSION: We propose that the performance of the signature makes it an attractive candidate for further studies to assess its utility as a clinical prognostic and predictive biomarker in GBM patients. Additionally, the signature genes may also be useful therapeutic targets to improve both progression-free and overall survival in GBM patients.


Subject(s)
Brain Neoplasms/genetics , Brain Neoplasms/pathology , Glioblastoma/genetics , Glioblastoma/pathology , Transcriptome , Biomarkers , Brain Neoplasms/mortality , Databases, Genetic , Disease Progression , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Glioblastoma/mortality , Glioblastoma/therapy , Humans , Prognosis , Signal Transduction , Survival Analysis
11.
Nat Commun ; 8(1): 2062, 2017 12 12.
Article in English | MEDLINE | ID: mdl-29233967

ABSTRACT

The Southern African Human Genome Programme is a national initiative that aspires to unlock the unique genetic character of southern African populations for a better understanding of human genetic diversity. In this pilot study the Southern African Human Genome Programme characterizes the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole-genome sequencing. A total of ~16 million unique variants are identified. Despite the shallow time depth since divergence between the two main southeastern Bantu-speaking groups (Nguni and Sotho-Tswana), principal component analysis and structure analysis reveal significant (p < 10⁻⁶) differentiation, and FST analysis identifies regions with high divergence. The Coloured individuals show evidence of varying proportions of admixture with Khoesan, Bantu-speakers, Europeans, and populations from the Indian sub-continent. Whole-genome sequencing data reveal extensive genomic diversity, increasing our understanding of the complex and region-specific history of African populations and highlighting its potential impact on biomedical research and genetic susceptibility to disease.
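
For readers unfamiliar with FST, the sketch below illustrates the idea for a single biallelic site and two populations, using Wright's heterozygosity-based definition FST = (HT - HS) / HT. This is an illustrative assumption: the abstract does not say which FST estimator was used, the function names are hypothetical, and equal population weights are assumed with no sample-size correction.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Wright's FST for one biallelic site and two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Returns 0.0 when the site is monomorphic overall (HT == 0), a convention
    chosen here for illustration.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    # HS: mean within-population expected heterozygosity.
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    # HT: expected heterozygosity of the pooled population.
    h_t = heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    print(fst_two_populations(0.2, 0.8))
```

Values near 0 indicate similar allele frequencies in the two groups and values near 1 indicate near-complete differentiation; genome scans typically average such per-site values over windows to locate highly diverged regions.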


Subject(s)
Black People/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome, Human , DNA Mutational Analysis/methods , Healthy Volunteers , Humans , Male , Mutation/genetics , Pilot Projects , Principal Component Analysis , South Africa
12.
Neuromuscul Disord ; 27(9): 816-825, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28673556

ABSTRACT

Treatment-resistant ophthalmoplegia (OP-MG) is not uncommon in individuals with African genetic ancestry and myasthenia gravis (MG). To identify OP-MG susceptibility genes, extended whole exome sequencing was performed using extreme phenotype sampling (11 OP-MG vs 4 control-MG) all with acetylcholine receptor-antibody positive MG. This approach identified 356 variants that were twice as frequent in OP-MG compared to control-MG individuals. After performing probability test estimates and filtering variants according to those 'suggestive' of association with OP-MG (p < 0.05), only three variants remained which were expressed in extraocular muscles. Validation in 25 OP-MG and 50 control-MG cases supported the association of DDX17delG (p = 0.014) and SPTLC3insACAC (p = 0.055) with OP-MG, but ST8SIA1delCCC could not be verified by Sanger sequencing. A parallel approach, using a semantic model informed by current knowledge of MG-pathways, identified an African-specific interleukin-6 receptor (IL6R) variant, IL6R c.*3043 T>C, that was more frequent in OP-MG compared to control-MG cases (p = 0.069) and population controls (p = 0.043). A weighted genetic risk score, derived from the odds ratios of association of these variants with OP-MG, correlated with the OP-MG phenotype as opposed to control MG. This unbiased approach implicates several potentially functional gene variants in the gangliosphingolipid and myogenesis pathways in the development of the OP-MG subphenotype.
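
A weighted genetic risk score of the kind mentioned in this abstract is often computed by summing each individual's risk-allele counts weighted by the natural log of the variant's odds ratio. The sketch below assumes that convention; the abstract does not give the exact formula, and the function name, variant labels and odds ratios are hypothetical placeholders rather than values from the study.

```python
import math


def weighted_risk_score(allele_counts, odds_ratios):
    """Sum of risk-allele counts weighted by log(odds ratio).

    allele_counts: dict mapping variant label -> risk-allele count (0, 1 or 2)
    odds_ratios:   dict mapping variant label -> odds ratio (> 0)
    Raises KeyError if a genotyped variant has no odds ratio.
    """
    score = 0.0
    for variant, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant}: {count}")
        score += math.log(odds_ratios[variant]) * count
    return score


if __name__ == "__main__":
    # Hypothetical odds ratios and genotypes, for illustration only.
    example_odds_ratios = {"variant_A": 2.0, "variant_B": 3.0, "variant_C": 1.5}
    individual = {"variant_A": 2, "variant_B": 0, "variant_C": 1}
    print(weighted_risk_score(individual, example_odds_ratios))
```

Log weighting means a variant with an odds ratio of 1 contributes nothing and protective variants (odds ratio below 1) lower the score.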


Subject(s)
DEAD-box RNA Helicases/genetics , Genetic Predisposition to Disease , Mutation/genetics , Myasthenia Gravis/genetics , Ophthalmoplegia/genetics , Serine C-Palmitoyltransferase/genetics , Adolescent , Child , Child, Preschool , Computational Biology , Computer Simulation , Female , Humans , Infant , Male , Myasthenia Gravis/complications , Ophthalmoplegia/complications , Phenotype , Receptors, Interleukin-6/genetics , Exome Sequencing
13.
Psychiatr Genet ; 27(4): 139-151, 2017 08.
Article in English | MEDLINE | ID: mdl-28574862

ABSTRACT

OBJECTIVES: Post-traumatic stress disorder is characterized by impaired fear extinction and excessive anxiety. D-Cycloserine (DCS) has previously been shown to facilitate fear extinction and decrease anxiety in animal and human studies. This study utilized a contextual fear-conditioning animal model to investigate the involvement of microRNAs (miRNAs) in fear extinction and the reduction of anxiety, as mediated by the co-administration of DCS and behavioural fear extinction. METHODS: Fear conditioning consisted of an electric foot shock; fear extinction consisted of behavioural fear extinction co-administered with either DCS or saline. The light/dark avoidance test was used to evaluate anxiety-related behaviour following fear conditioning and to subsequently group animals into well-adapted and maladapted subgroups. These subgroups also showed significant differences in terms of fear extinction. Small RNAs extracted from the left dorsal hippocampus were sequenced using next-generation sequencing to identify differentially expressed miRNAs associated with DCS-induced fear extinction and reduction of anxiety. In-silico prediction analyses identified mRNA targets (from data of the same animals) of the differentially expressed miRNAs. Two of the predicted mRNA-miRNA interactions were functionally investigated. RESULTS: Overall, 32 miRNAs were differentially expressed between rats that were fear conditioned, received DCS and were well adapted and rats that were fear conditioned, received saline and were maladapted. Nineteen of these miRNAs were predicted to target and regulate the expression of 63 genes differentially expressed between the fear-conditioned, DCS-administered, well-adapted and the fear-conditioned, saline-administered, maladapted groups (several of which are associated with neuronal inflammation, learning and memory). 
Functional luciferase assays indicated that rno-mir-31a-5p may have regulated the expression of interleukin 1 receptor antagonist (Il1rn) and metallothionein 1a (Mt1a). CONCLUSION: These differentially expressed miRNAs may be mediators of gene expression changes that facilitated decreased neuronal inflammation, optimum learning and memory and contributed towards effective fear extinction and reduction of anxiety following the co-administration of DCS and behavioural fear extinction.


Subject(s)
MicroRNAs/therapeutic use , Stress Disorders, Post-Traumatic/drug therapy , Stress Disorders, Post-Traumatic/genetics , Animals , Anxiety/genetics , Anxiety/physiopathology , Anxiety Disorders , Conditioning, Classical/physiology , Cycloserine/pharmacology , Cycloserine/therapeutic use , Disease Models, Animal , Extinction, Psychological/physiology , Fear/drug effects , Fear/physiology , High-Throughput Nucleotide Sequencing/methods , Hippocampus/physiopathology , Male , Memory , MicroRNAs/metabolism , Rats , Rats, Sprague-Dawley , Receptors, N-Methyl-D-Aspartate/metabolism
14.
Biotechniques ; 62(1): 18-30, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28118812

ABSTRACT

Next-generation sequencing (NGS) of whole genomes and exomes is a powerful tool in biomedical research and clinical diagnostics. However, the vast amount of data produced by NGS introduces new challenges and opportunities, many of which require novel computational and theoretical approaches when it comes to identifying the causal variant(s) for a disease of interest. While workflows and associated software to process raw data and produce high-confidence variant calls have significantly improved, filtering tens of thousands of candidates to identify a subset relevant to a specific study is still a complex exercise best left to bioinformaticists. However, as this prioritization procedure requires biological/biomedical reasoning, biologists and clinicians are increasingly motivated to handle the task themselves. Here, we describe a set of guidelines, tools, and online resources that can be used to identify functional variants from whole-genome and whole-exome variant calls and then prioritize these variants with potential associations to phenotypes of interest. Insights gained from a recently published analysis of protein-coding gene variation in >60,000 humans by the Exome Aggregation Consortium (ExAC) are also taken into account.


Subject(s)
Databases, Genetic , Genetic Variation/genetics , Genome/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Animals , Exome/genetics , Humans , Software
15.
Int J Endocrinol ; 2016: 3172093, 2016.
Article in English | MEDLINE | ID: mdl-27555869

ABSTRACT

Aims. To conduct a genome-wide DNA methylation analysis in individuals with type 2 diabetes, individuals with prediabetes, and control mixed ancestry individuals from South Africa. Methods. We used peripheral blood to perform genome-wide DNA methylation analysis in 3 individuals with screen-detected diabetes, 3 individuals with prediabetes, and 3 individuals with normoglycaemia from the Bellville South Community, Cape Town, South Africa, who were age-, gender-, body mass index-, and duration of residency-matched. Methylated DNA immunoprecipitation (MeDIP) was performed by Arraystar Inc. (Rockville, MD, USA). Results. There were 1160 (81.97%) and 124 (43.20%) hypermethylated DMRs in individuals with diabetes and prediabetes, respectively, when both were compared to subjects with normoglycaemia. Our data show that genes related to the immune system, signal transduction, glucose transport, and pancreas development have altered DNA methylation in subjects with prediabetes and diabetes. Pathway analysis based on the functional analysis mapping of genes to KEGG pathways suggested that the linoleic acid metabolism and arachidonic acid metabolism pathways are hypomethylated in prediabetes and diabetes. Conclusions. Our study suggests that epigenetic changes are likely to be an early process that occurs before the onset of overt diabetes. Detailed analysis of DMRs that show gradual methylation differences from control versus prediabetes to prediabetes versus diabetes in a larger sample size is required to confirm these findings.

16.
BMC Genomics ; 17: 561, 2016 08 08.
Article in English | MEDLINE | ID: mdl-27503259

ABSTRACT

BACKGROUND: Iron metabolism and regulation is an indispensable part of species survival, most importantly for blood feeding insects. Iron regulatory proteins are central regulators of iron homeostasis, whose binding to iron response element (IRE) stem-loop structures within the UTRs of genes regulates expression at the post-transcriptional level. Despite the extensive literature on the mechanism of iron regulation in humans, less attention has been given to insects and more specifically the blood feeding insects, where research has mainly focused on the characterization of ferritin and transferrin. We thus examined the mechanism of iron homeostasis through a genome-wide computational identification of IREs and other enriched motifs in the UTRs of Glossina morsitans with a view to identifying new IRE-regulated genes. RESULTS: We identified 150 genes, of which two are known to contain IREs, namely the ferritin heavy chain and the MRCK-alpha. The remainder of the identified genes are considered novel, including 20 hypothetical proteins, for which an iron-regulatory mechanism of action was inferred. Forty-three genes were found with IRE-signatures of regulation in two or more insects, while 46 were only found to be IRE-regulated in two species. Notably, 39% of the identified genes exclusively shared IRE-signatures in other Glossina species, which are potentially Glossina-specific adaptive measures in addressing its unique reproductive biology and blood meal-induced iron overload. In line with previous findings, we found no evidence pertaining to an IRE regulation of Transferrin, which highlights the importance of ferritin heavy chain and the other proposed transporters in the tsetse fly. In the context of iron-sequestration, key players of tsetse immune defence against trypanosomes have been introduced, namely 14 stress and immune response genes, while 28 cell-envelope, transport, and binding genes were assigned a putative role in iron trafficking. 
Additionally, we identified and annotated enriched motifs in the UTRs of the putative IRE-regulated genes to arrive at a co-regulatory network that maintains iron homeostasis in tsetse flies. Three putative microRNA-binding sites, namely the Gy-box, Brd-box and K-box motifs, were identified among the regulatory motifs enriched in the UTRs of the putative IRE-regulated genes. CONCLUSION: Beyond our current view of iron metabolism in insects, with ferritin and transferrin as its key players, this study provides a comprehensive catalogue of genes with possible roles in the acquisition, transport and storage of iron, and hence iron homeostasis, in the tsetse fly.


Subject(s)
Iron/metabolism , Models, Biological , Response Elements , Tsetse Flies/genetics , Tsetse Flies/metabolism , Animals , Biological Transport , Disease Vectors , Genes, Insect , Iron-Regulatory Proteins/genetics , Iron-Regulatory Proteins/metabolism
17.
Dev Comp Immunol ; 65: 321-329, 2016 12.
Article in English | MEDLINE | ID: mdl-27497873

ABSTRACT

Although pulmonary epithelial cells are integral to innate and adaptive immune responses during Mycobacterium tuberculosis infection, global transcriptomic changes in these cells remain largely unknown. Changes in gene expression induced in pulmonary epithelial cells infected with M. tuberculosis F15/LAM4/KZN, F11, F28, Beijing and Unique genotypes were investigated by RNA sequencing (RNA-Seq). The Illumina HiSeq 2000 platform generated 50 bp reads that were mapped to the human genome (hg19) using TopHat (2.0.10). Differential gene expression induced by the different strains in infected relative to uninfected cells was quantified using Cufflinks (2.1.0) and compared using MeV (4.0.9). The total number of differentially expressed genes varied among the strains: F15/LAM4/KZN (1187), Beijing (1252), F11 (1639), F28 (870), Unique (886) and H37Rv (1179). A subset of 292 genes was commonly induced by all strains, of which 52 were down-regulated and 240 were up-regulated. Differentially expressed genes were compared among the strains, and the numbers of induced strain-specific gene signatures were as follows: F15/LAM4/KZN (138), Beijing (52), F11 (255), F28 (55), Unique (186) and H37Rv (125). Strain-specific molecular gene signatures associated with functional pathways were observed only for the Unique and H37Rv strains, while certain biological functions may be associated with other strain signatures. This study demonstrated that strains of M. tuberculosis induce differential gene expression and strain-specific molecular signatures in pulmonary epithelial cells. Specific signatures induced by clinical strains of M. tuberculosis can be further explored for novel host-associated biomarkers and adjunctive immunotherapies.
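
The distinction between genes induced by all strains and strain-specific signatures can be expressed with plain set operations. The sketch below is a minimal illustration under assumed inputs: the function name `shared_and_specific`, the strain labels and the gene identifiers are hypothetical toy data, and the abstract does not state that the comparison was computed this way.

```python
def shared_and_specific(de_genes):
    """Split per-strain differentially expressed genes into shared and strain-specific sets.

    de_genes maps a strain name to the set of gene IDs called differentially
    expressed for that strain (relative to uninfected cells).
    """
    # Genes present in every strain's list.
    shared = set.intersection(*de_genes.values())
    specific = {}
    for strain, genes in de_genes.items():
        # Union of the genes seen in all other strains.
        others = set().union(*(g for s, g in de_genes.items() if s != strain))
        specific[strain] = genes - others
    return shared, specific


# Toy input (hypothetical strains and gene IDs, not data from the study).
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g3", "g4"},
    "strain_C": {"g1", "g5"},
}
shared, specific = shared_and_specific(toy)
print(shared, specific)
```

Note that a gene seen in two of three strains (here `g3`) falls into neither category, which is why the shared and strain-specific counts do not add up to the per-strain totals.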


Subject(s)
Epithelial Cells/microbiology , Gene Expression Regulation , Lung/pathology , Mycobacterium tuberculosis/genetics , Tuberculosis/immunology , Adaptive Immunity , Cell Line , Epithelial Cells/immunology , Genotype , High-Throughput Nucleotide Sequencing , Humans , Mycobacterium tuberculosis/immunology , Species Specificity , Transcriptome , Tuberculosis/microbiology
18.
J Bioinform Comput Biol ; 14(5): 1650022, 2016 10.
Article in English | MEDLINE | ID: mdl-27411306

ABSTRACT

Microarray-based transcriptomics experiments often suffer from limited statistical power due to small sample size. Quantile discretization (QD) maps expression values for a sample into a series of equivalently sized 'bins' that represent a discrete numerical range, e.g. [Formula: see text]4 to [Formula: see text]4, which enables normalized data from multiple experiments and/or expression platforms to be combined for re-analysis. We found, however, that informal selection of bin numbers often resulted in loss of the underlying correlation structure in the data, because the same numerical value was assigned to genes that are in reality expressed at significantly different levels within a sample. Here we report a procedure for determining an optimal bin number for a dataset. Applying this to integrated public breast cancer datasets enabled statistical identification of several differentially expressed tumorigenesis-related genes that were not found when analyzing the individual datasets, and also of several cancer biomarkers not previously indicated as having utility in the disease. Notably, differential modulation of translational control and protein synthesis via multiple pathways was found to potentially have a central role in breast cancer development and progression. These findings suggest that our protocol has significant utility in making meaningful novel biomedical discoveries by leveraging the large public expression data repositories.
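
To make the binning step concrete, the sketch below ranks the expression values of one sample and assigns them to equally populated integer bins. The function name `quantile_discretize`, the zero-centred labels and the sample values are assumptions for illustration; this shows only the generic discretization idea, not the authors' procedure for choosing the optimal bin number.

```python
def quantile_discretize(values, n_bins):
    """Map one sample's expression values to equally populated integer bins.

    Labels are centred on zero (an assumption of this sketch), so n_bins=9
    yields labels from -4 to 4.
    """
    n = len(values)
    # Indices of the values ordered from lowest to highest expression.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // n - n_bins // 2
    return labels


# Hypothetical expression values for nine genes in a single sample.
sample = [5.1, 0.2, 3.3, 9.8, 7.4, 1.0, 2.2, 8.6, 4.5]
fine = quantile_discretize(sample, 9)    # one gene per bin
coarse = quantile_discretize(sample, 3)  # three genes per bin
print(fine)
print(coarse)
```

With three bins, the genes measured at 3.3 and 5.1 receive the same label, which is the kind of resolution loss the abstract attributes to an informally chosen bin number.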


Subject(s)
Algorithms , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Oligonucleotide Array Sequence Analysis/methods , Biomarkers, Tumor/genetics , Databases, Genetic , Female , Humans , Male , Models, Theoretical , Phenotype , Prostatic Neoplasms/genetics
19.
Source Code Biol Med ; 11: 10, 2016.
Article in English | MEDLINE | ID: mdl-27375772

ABSTRACT

BACKGROUND: Whole exome sequencing (WES) has given researchers access to a highly enriched subset of the human genome in which to search for variants that are likely to be pathogenic and may provide important insights into disease mechanisms. In developing countries, bioinformatics capacity and expertise are severely limited, and wet bench scientists are required to take on the challenging task of understanding and implementing the barrage of bioinformatics tools that are available to them. RESULTS: We designed a novel method for the filtration of WES data called TAPER™ (Tool for Automated selection and Prioritization for Efficient Retrieval of sequence variants). CONCLUSIONS: TAPER™ implements a set of logical steps by which to prioritize candidate variants that could be associated with disease, and is intended for use in biomedical laboratories with limited bioinformatics capacity. TAPER™ is free, can be set up on a Windows operating system (Windows 7 and above) and does not require any programming knowledge. In summary, we have developed a freely available tool that simplifies variant prioritization from WES data in order to facilitate discovery of disease-causing genes.

20.
Tuberculosis (Edinb) ; 97: 73-85, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26980499

ABSTRACT

Limited knowledge exists on the pathways, networks and transcription factors regulated within epithelial cells by diverse Mycobacterium tuberculosis genotypes. This study aimed to elucidate these mechanisms induced in A549 epithelial cells by dominant clinical strains in KwaZulu-Natal, South Africa. RNA for sequencing was extracted from epithelial cells at 48 h post-infection with 5 strains at a multiplicity of infection of approximately 10:1. Bioinformatics analysis performed with the RNA-Seq Tuxedo pipeline identified differentially expressed genes. Changes in pathways, networks and transcription factors were identified using Ingenuity Pathway Analysis (IPA). The interferon signalling and hepatic fibrosis/hepatic stellate cell activation pathways were among the top 5 canonical pathways in all strains. Hierarchical clustering for enrichment of cholesterol biosynthesis and immune-associated pathways revealed similar patterns for Beijing and Unique; F15/LAM4/KZN and F11; and F28 and H37Rv strains, respectively. However, the induction of top-scoring networks varied among the strains. Among the transcription factors, only EHL, IRF7, PML, STAT1, STAT2 and VDR were induced by all clinical strains. Activation of the different pathways, networks and transcription factors revealed in the current study may be an underlying mechanism that results in the differential host response to clinical strains of M. tuberculosis.


Subject(s)
Epithelial Cells/microbiology , Mycobacterium tuberculosis/pathogenicity , Protein Interaction Maps , Pulmonary Alveoli/microbiology , Signal Transduction , Transcription Factors/metabolism , Cell Line, Tumor , Cluster Analysis , Computational Biology , Databases, Genetic , Epithelial Cells/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Host-Pathogen Interactions , Humans , Pulmonary Alveoli/metabolism , Signal Transduction/genetics , Time Factors , Transcription Factors/genetics