Acta Anatomica Sinica ›› 2020, Vol. 51 ›› Issue (5): 653-658. doi: 10.16098/j.issn.0529-1356.2020.05.003

• Anatomy and Otorhinolaryngology •

Automated deep learning segmentation of the facial nerve, labyrinth and ossicular structures in temporal bone CT

柯嘉1 吕弈2 杜雅丽1 王君臣2 王江1 孙世龙1 马芙蓉1*   

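  1. Department of Otorhinolaryngology Head and Neck Surgery, Peking University Third Hospital, Beijing 100191; 2. School of Mechanical Engineering and Automation, Beihang University, Beijing 100191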
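  • Received:2020-04-07 Revised:2020-05-20 Online:2020-10-06 Published:2020-10-06
  • Corresponding author: MA Fu-rong E-mail:furongma@126.com
  • Funding:
    Key Clinical Project of Peking University Third Hospital; Capital Health Development Research Special Project; National Natural Science Foundation of China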

Automatic segmentation of the facial nerve, labyrinth and ossicles in temporal bone CT by deep learning

KE Jia1 LÜ Yi2 DU Ya-li1 WANG Jun-chen2 WANG Jiang1 SUN Shi-long1 MA Fu-rong1*

  1. Department of Otorhinolaryngology Head and Neck Surgery, Third Hospital, Peking University, Beijing 100191, China; 2. School of Mechanical Engineering and Automation, Beihang University, Beijing 100191, China
  • Received:2020-04-07 Revised:2020-05-20 Online:2020-10-06 Published:2020-10-06
  • Contact: MA Fu-rong E-mail:furongma@126.com

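Abstract:

Objective To investigate the feasibility and accuracy of a neural-network deep learning method for automatic segmentation of the facial nerve, labyrinth and ossicles in temporal bone CT. Methods Data from patients who underwent routine temporal bone CT were randomly divided into two groups: a training set (20 cases) and a test set (5 cases). The labyrinth, ossicles and facial nerve were segmented manually in these scans. The three-dimensional convolutional neural network 3D U-Net was chosen as the network architecture, and the average accuracy of the network was obtained by training on the training set. The model's automatic segmentations of each anatomical landmark in the 5 test cases were compared with the manual segmentations to obtain the test accuracy for the facial nerve, labyrinth and ossicles. These accuracies were then compared with those of V-Net, another network model based on a three-dimensional convolutional architecture. Results In the temporal bone CT specimens, the 3D U-Net-plus and V-Net networks were each trained to segment the facial nerve, labyrinth and ossicles automatically. On the training samples, the mean error was 0.016 for 3D U-Net-plus and 0.035 for V-Net, a statistically significant difference (P<0.05). The Dice similarity coefficients between automatic and manual segmentation of the labyrinth, ossicles and facial nerve were 0.618±0.107, 0.584±0.089 and 0.313±0.069 with 3D U-Net-plus, and 0.322±0.089, 0.176±0.100 and 0.128±0.077 with V-Net, a statistically significant difference (P<0.001). Conclusion The 3D U-Net-plus neural network is feasible for automatic identification and segmentation of the ossicles, labyrinth and facial nerve in the temporal bone, and it outperforms the V-Net neural network. With optimization of the network architecture and a larger training sample, its results should come closer to those of manual segmentation.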
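The random division into 20 training cases and 5 test cases described in Methods can be illustrated with a minimal sketch. The abstract does not say how the randomization was performed, so the function `split_cases`, the case identifiers and the seed below are all assumptions made for illustration only.

```python
import random


def split_cases(case_ids, n_test, seed=None):
    """Randomly partition case IDs into (training, test) lists."""
    if not 0 < n_test < len(case_ids):
        raise ValueError("n_test must be between 1 and len(case_ids) - 1")
    rng = random.Random(seed)      # local generator, so the split is reproducible
    shuffled = list(case_ids)      # copy: the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# 25 hypothetical case identifiers standing in for the temporal bone CT scans.
case_ids = [f"case_{i:02d}" for i in range(1, 26)]
train_ids, test_ids = split_cases(case_ids, n_test=5, seed=0)
print(len(train_ids), len(test_ids))
```

Fixing the seed makes the partition repeatable, which matters when two networks are to be compared on the same test cases.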

Key words: Deep learning, Convolutional neural network, Temporal bone, Medical image recognition, Automatic segmentation

Abstract:

Objective To study the effect of neural-network-based deep learning on automatic segmentation of the facial nerve, labyrinth and ossicles in temporal bone CT. Methods Data from patients who underwent conventional temporal bone CT examination were randomly divided into two groups: a training set (20 cases) and a test set (5 cases). The labyrinth, ossicles and facial nerve were segmented manually. The convolutional neural network 3D U-Net was selected as the network architecture, and the average accuracy of the network was obtained by training on the training set. For both network models, the automatic segmentations of the 3 anatomical landmarks in the 5 test cases were compared with the manual segmentations, and the accuracy for the facial nerve, labyrinth and ossicles was obtained for each. The accuracy was compared with that obtained by another 3D convolutional neural network model, V-Net. Results In temporal bone CT, the 3D U-Net-plus and V-Net networks were each trained to segment the facial nerve, labyrinth and ossicles automatically. In the training samples, the mean error was 0.016 for the 3D U-Net-plus network and 0.035 for the V-Net network; the difference was significant (P<0.05). The Dice similarity coefficients between automatic and manual segmentation of the labyrinth, ossicles and facial nerve were 0.618±0.107, 0.584±0.089 and 0.313±0.069 for the 3D U-Net-plus network, and 0.322±0.089, 0.176±0.100 and 0.128±0.077 for the V-Net network. Segmentation by the 3D U-Net-plus network was significantly better than that by the V-Net network (P<0.001). Conclusion Using the 3D U-Net-plus neural network, the ossicles, labyrinth and facial nerve in temporal bone CT can be recognized and automatically segmented quickly and effectively. This method is better than the V-Net neural network and closer to manual segmentation. With optimization of the network structure and expansion of the training samples, it will come closer to the effect of manual segmentation.
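The Dice similarity coefficient reported above measures voxel overlap between two segmentations A and B as 2|A∩B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The following is a minimal sketch of that standard formula, not the authors' evaluation code: the function names, the integer label encoding in `LABELS`, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def dice_per_structure(pred_labels, truth_labels, label_ids):
    """Dice for each structure in a multi-label volume (name -> integer label)."""
    return {name: dice_coefficient(pred_labels == k, truth_labels == k)
            for name, k in label_ids.items()}


# Hypothetical label encoding; the abstract does not specify one.
LABELS = {"labyrinth": 1, "ossicles": 2, "facial_nerve": 3}

# Toy 4x4x4 volumes standing in for a manual and an automatic segmentation.
manual = np.zeros((4, 4, 4), dtype=np.uint8)
auto = np.zeros((4, 4, 4), dtype=np.uint8)
manual[0, 0, :4] = 1   # 4 labyrinth voxels in the manual mask
auto[0, 0, :2] = 1     # 2 of them found automatically
manual[1, 1, :2] = 2   # 2 ossicle voxels
auto[1, 1, :2] = 2     # both found
manual[2, 2, :2] = 3   # 2 facial nerve voxels, none found

scores = dice_per_structure(auto, manual, LABELS)
print(scores)
```

In this toy case the labyrinth scores 2·2/(2+4) ≈ 0.667, the ossicles 1.0 and the facial nerve 0.0. Because the coefficient is normalized by the total voxel count of both masks, a thin structure such as the facial nerve loses a large fraction of its score for each misplaced voxel.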

Key words: Deep learning, Convolutional neural network, Temporal bone, Medical image recognition, Automatic segmentation, Human

CLC number: