Acta Anatomica Sinica ›› 2024, Vol. 55 ›› Issue (3): 311-318. doi: 10.16098/j.issn.0529-1356.2024.03.008

• Cancer Biology •

Screening of key characteristic genes and analysis of immune cell infiltration in metastatic nasopharyngeal carcinoma based on machine learning

LU Jin1,2,3  CHEN Yun-fan1,2  ZHANG Hao-xuan2,3  HUANG Xue-ying1*

  1. Department of Human Anatomy, Anhui Medical University, Hefei 230032, China; 2. Department of Human Anatomy, Bengbu Medical College, Bengbu, Anhui 233030, China; 3. The Key Laboratory of Digital Medicine and Wisdom Health in Anhui Province, Bengbu, Anhui 233030, China
  • Received: 2023-03-27  Revised: 2023-11-20  Online: 2024-06-06  Published: 2024-06-11
  • Contact: HUANG Xue-ying  E-mail: 15395250832@163.com
  • Supported by:
    Open Project of the Provincial Key Laboratory of Bengbu Medical College

Abstract:

Objective  To screen the key characteristic genes of metastatic nasopharyngeal carcinoma (mNPC) using machine learning algorithms and to analyze immune cell infiltration in its tumor microenvironment.

Methods  First, the training set GSE103611 was downloaded from the GEO database and subjected to differentially expressed gene (DEG) screening, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, and immune cell infiltration analysis. Second, predictor genes among the DEGs were screened by least absolute shrinkage and selection operator (LASSO) regression, and characteristic genes were then selected from the predictor genes according to their expression levels and receiver operating characteristic (ROC) curves. Third, the correlation between the characteristic genes and immune cells was analyzed to identify the key characteristic genes. Finally, the expression levels and ROC curves of the key characteristic genes were verified using the reverse validation set GSE1245.

Results  A total of 136 DEGs were obtained. In the KEGG analysis they were mainly enriched in cytochrome P450, the tumor necrosis factor (TNF) signaling pathway, prion disease, Epstein-Barr virus infection, and other pathways. In the GO analysis they were mainly enriched in negative regulatory processes of peptidyl-tyrosine phosphorylation, viral gene expression, and B cell and leukocyte activation. The infiltration levels of the 22 immune cell types did not differ significantly between nasopharyngeal carcinoma (NPC) and mNPC. LASSO regression finally yielded two key characteristic genes of mNPC, deleted in azoospermia 1 (DAZ1) and saccharopine dehydrogenase (SCCPDH), both of which were significantly correlated with immune cells in the mNPC microenvironment (P<0.05). In the reverse validation data set, the differential expression of DAZ1 and SCCPDH between the non-NPC (nNPC) and NPC groups was not significant (P>0.05), and the area under the ROC curve (AUC) of each gene was less than 0.6.

Conclusion  DAZ1 and SCCPDH are key characteristic genes of mNPC and can serve as important markers for mNPC and its immunotherapy.

Key words: Metastatic nasopharyngeal carcinoma, Immune cell infiltration, Bioinformatics, Machine learning
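
The two computational steps named in the Methods, LASSO-based gene selection and ROC/AUC evaluation, can be illustrated with a minimal sketch. The abstract does not state which software, loss function, or penalty value the authors used, so everything below is an assumption made for illustration: the function names (`soft_threshold`, `lasso_coordinate_descent`, `auc`) are hypothetical, the LASSO uses a squared-error loss without an intercept, and the data are synthetic toy values rather than anything from GSE103611 or GSE1245.

```python
from typing import List, Sequence


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero by penalty; values within [-penalty, penalty] become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    penalty: float,
    n_sweeps: int = 100,
) -> List[float]:
    """Minimise (1/(2n)) * ||y - Xb||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    Assumption: squared-error loss and no intercept, chosen for brevity.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred = sum(X[i][k] * coef[k] for k in range(p))
                partial_residual = y[i] - pred + X[i][j] * coef[j]
                rho += X[i][j] * partial_residual
                z += X[i][j] ** 2
            rho /= n
            z /= n
            coef[j] = soft_threshold(rho, penalty) / z if z > 0 else 0.0
    return coef


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for s_pos in pos:
        for s_neg in neg:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic toy design: feature 0 tracks the response strongly, feature 1 only weakly.
X_toy = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y_toy = [2.0, 0.1, -2.0, -0.1]
coef = lasso_coordinate_descent(X_toy, y_toy, penalty=0.1)
selected = [j for j, b in enumerate(coef) if b != 0.0]  # features that survive the penalty

# Synthetic scores for six samples; label 1 marks the positive class.
toy_auc = auc([0.9, 0.8, 0.4, 0.35, 0.3, 0.1], [1, 1, 0, 1, 0, 0])
```

In this toy run the L1 penalty shrinks the weak feature's coefficient exactly to zero, which is the property that lets LASSO act as a feature selector. The `auc` helper returns 0.5 for a score that carries no class information, which gives a reference point for reading the AUC values below 0.6 reported for the reverse validation set.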

CLC number: