A melanoma diagnosis method based on large-scale vision-language models

doi:10.16098/j.issn.0529-1356.2025.01.003

Abstract

Objective To develop a melanoma diagnosis framework based on large-scale vision-language models and to evaluate its feasibility and accuracy for melanoma diagnosis. Methods The publicly available Derm7pt dataset was used, divided into a training set (346 cases), a validation set (161 cases), and a test set (320 cases). The proposed framework comprises two text branches and one visual branch. One text branch processes fixed clinical prompts and the other processes learnable prompts, so that the fixed clinical prompts guide the optimization of the learnable prompts. The visual branch processes dermoscopic images, and its image encoder is fine-tuned to improve the recognition of melanoma features. Results On the Derm7pt dataset, the proposed method outperformed other existing methods, achieving an area under the receiver operating characteristic curve (AUC) of 87.35%, an accuracy of 84.17%, and an F1-score of 84.01%. Conclusion With appropriate fine-tuning strategies, methods based on large-scale vision-language pre-trained models can be adapted effectively to melanoma diagnosis. The approach can serve as an auxiliary tool that helps doctors make more accurate diagnostic decisions.
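The interaction between the two text branches and the visual branch can be illustrated with a minimal sketch. The abstract does not give the loss function, so everything below is an assumption: the classification term is a CLIP-style cosine-similarity cross-entropy, the guidance term is a squared distance between learnable-prompt and fixed-prompt text features (in the spirit of knowledge-guided context optimization), and the names `cosine_logits`, `guidance_penalty`, `total_loss`, `scale`, and `lam` are hypothetical. Feature vectors stand in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_logits(image_feat, text_feats, scale=100.0):
    """Scaled cosine similarity between one image and each class text feature.

    `scale` is an assumed temperature, not a value from the paper.
    """
    img = l2_normalize(image_feat)
    return [scale * sum(a * b for a, b in zip(img, l2_normalize(t)))
            for t in text_feats]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def guidance_penalty(learned_feats, fixed_feats):
    """Mean squared distance between learnable- and fixed-prompt features.

    Assumed form of the guidance from the fixed clinical prompts.
    """
    total = 0.0
    for learned, fixed in zip(learned_feats, fixed_feats):
        l, f = l2_normalize(learned), l2_normalize(fixed)
        total += sum((a - b) ** 2 for a, b in zip(l, f))
    return total / len(learned_feats)


def total_loss(image_feat, learned_feats, fixed_feats, label, lam=1.0):
    """Classification loss plus weighted guidance term (lam is hypothetical)."""
    ce = cross_entropy(cosine_logits(image_feat, learned_feats), label)
    return ce + lam * guidance_penalty(learned_feats, fixed_feats)


if __name__ == "__main__":
    # Toy 2-D features for two classes (index 0: nevus, index 1: melanoma).
    fixed = [[1.0, 0.0], [0.0, 1.0]]      # from fixed clinical prompts
    learned = [[0.9, 0.1], [0.2, 0.8]]    # from learnable prompts
    image = [0.1, 0.9]                    # from the fine-tuned image encoder
    print(total_loss(image, learned, fixed, label=1))
```

In this sketch the guidance term is zero when the learnable-prompt features point in the same direction as the fixed clinical-prompt features and grows as they drift apart, which is one plausible way for fixed prompts to constrain learnable ones.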

Key words: Melanoma, Large-scale vision-language model, Fine-tuning, Diagnosis, Deep learning, Human

CLC Number:

TP391

ZHAO Jia-yue, LI Shi-man, ZHANG Chen-xi. A melanoma diagnosis method based on large-scale vision-language models[J]. Acta Anatomica Sinica, 2025, 56(1): 22-29.

