ABSTRACT
Objective: To analyze the expression characteristics of hypoxia-related differentially expressed genes (HRDEGs) in ischemic stroke using bioinformatics methods and data from the Gene Expression Omnibus (GEO) database, and to screen key genes, in order to support a deeper understanding of ischemic stroke.
Methods: The GSE16561 and GSE58294 datasets were downloaded from the GEO database, and Python software was used for data integration. The ComBat method was employed to eliminate batch effects while retaining disease grouping characteristics. Principal component analysis was conducted to reduce the dimensionality of the data before and after batch effect removal, and intraclass correlation coefficient (ICC) testing was performed on the ischemic stroke and normal control groups. Gene set enrichment analysis (GSEA) and single-sample GSEA were conducted on the merged, batch-corrected dataset, with a nominal P-value (NOM P-val) <0.05 and a false discovery rate (FDR) <0.25 as the criteria for selecting significantly different gene sets. Differentially expressed genes (DEGs) between ischemic stroke samples and normal control samples in the merged, batch-corrected dataset were identified using R software, with an absolute log2 fold change (|log2 FC|) ≥0.58 and an adjusted P-value (Padj) <0.05 as selection criteria. The DEGs were intersected with hypoxia-related genes obtained from the US National Center for Biotechnology Information (NCBI) to yield the HRDEGs. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on the HRDEGs, and the STRING database was used to construct a protein-protein interaction network of the differentially expressed genes. The top 10 key genes were selected using Cytoscape 3.8 software.
Results: The ICC analysis showed excellent consistency in the ischemic stroke and normal control samples after batch effect removal, with ICC values of 0.94 and 0.98 for the GSE16561 and GSE58294 datasets, respectively. GSEA demonstrated significant enrichment of 34 gene sets in the stroke samples of the merged, batch-corrected dataset. Differential expression analysis identified 404 DEGs (all with Padj <0.05), including 354 upregulated and 50 downregulated genes. Intersection with hypoxia-related genes yielded 64 HRDEGs. GO enrichment analysis indicated that the HRDEGs were significantly enriched in the cellular components vesicle lumen, cytoplasmic vesicle lumen, and secretory granule lumen, and in molecular functions such as amide binding, peptide binding, phospholipid binding, and enzyme inhibitor activity. These genes are primarily involved in biological processes including positive regulation of cytokine production, regulation of immune response, response to bacterium-derived molecules, and response to lipopolysaccharide. KEGG enrichment analysis revealed enrichment of the HRDEGs in pathways related to lipid and atherosclerosis, Salmonella infection, neutrophil extracellular trap formation, the nucleotide-binding oligomerization domain-like receptor signaling pathway, protein glycosylation in cancer, tuberculosis, and necroptosis. Based on the protein-protein interaction network, 10 key genes were identified: arginase 1 (ARG1), caspase 1 (CASP1), interleukin 1 receptor type 1 (IL-1R1), integrin subunit alpha M (ITGAM), matrix metalloproteinase 9 (MMP9), prostaglandin-endoperoxide synthase 2 (PTGS2), signal transducer and activator of transcription 3 (STAT3), Toll-like receptor 2 (TLR2), TLR4, and TLR8.
Conclusion: Through bioinformatics mining, this study identified 10 key genes associated with ischemic stroke and hypoxia, which may provide potential targets for subsequent research and for diagnostic and therapeutic interventions.
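
The selection criteria stated in the Methods (|log2 FC| ≥0.58, which corresponds to roughly a 1.5-fold change, and Padj <0.05, followed by intersection with a hypoxia-related gene list) can be illustrated with a minimal Python sketch. This is not the study's own code, which used R for the differential expression step; the function name `select_degs`, the gene labels, and all numeric values below are hypothetical toy data chosen only to show how the thresholds and the intersection operate.

```python
# Thresholds taken from the abstract; everything else here is illustrative.
LOG2FC_CUTOFF = 0.58  # |log2 FC| >= 0.58, i.e. about a 1.5-fold change
PADJ_CUTOFF = 0.05    # adjusted P-value must be below this


def select_degs(results, lfc_cutoff=LOG2FC_CUTOFF, padj_cutoff=PADJ_CUTOFF):
    """Split genes into up- and downregulated sets.

    `results` maps a gene symbol to a (log2_fold_change, adjusted_p) tuple.
    """
    up, down = set(), set()
    for gene, (log2fc, padj) in results.items():
        # Discard genes that miss either criterion.
        if padj >= padj_cutoff or abs(log2fc) < lfc_cutoff:
            continue
        if log2fc > 0:
            up.add(gene)
        else:
            down.add(gene)
    return up, down


# Hypothetical toy input (placeholder gene names, made-up statistics).
toy_results = {
    "GENE_A": (1.20, 0.001),   # passes, upregulated
    "GENE_B": (-0.75, 0.010),  # passes, downregulated
    "GENE_C": (0.30, 0.001),   # fails the fold-change cutoff
    "GENE_D": (0.90, 0.200),   # fails the adjusted P-value cutoff
}
# Hypothetical stand-in for the hypoxia-related gene list.
toy_hypoxia_genes = {"GENE_A", "GENE_C", "GENE_E"}

up, down = select_degs(toy_results)
# HRDEGs are the DEGs that also appear in the hypoxia-related list.
hrdegs = (up | down) & toy_hypoxia_genes
print(sorted(up), sorted(down), sorted(hrdegs))
```

In this toy case only GENE_A is both differentially expressed and in the hypoxia list; in the study, the same kind of intersection reduced 404 DEGs to 64 HRDEGs.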