Acta Anatomica Sinica, 2020, Vol. 51, Issue (5): 653-658. doi: 10.16098/j.issn.0529-1356.2020.05.003


Automatic segmentation of facial nerve, labyrinthine and ossicles in temporal CT by deep learning

KE Jia1 Lü Yi2 DU Ya-li1 WANG Jun-chen2 WANG Jiang1 SUN Shi-long1 MA Fu-rong1*   

  1. Department of Otorhinolaryngology Head and Neck Surgery, Third Hospital, Peking University, Beijing 100191, China; 2. School of Mechanical Engineering and Automation, Beihang University, Beijing 100191, China
  • Received: 2020-04-07 Revised: 2020-05-20 Online: 2020-10-06 Published: 2020-10-06
  • Contact: MA Fu-rong, E-mail: furongma@126.com

Abstract:

Objective To study the performance of neural-network-based deep learning in the automatic segmentation of the facial nerve, labyrinth and ossicles in temporal CT.  Methods Data from patients who underwent conventional temporal bone CT examination were randomly divided into a training set (20 cases) and a test set (5 cases). The labyrinth, ossicles and facial nerve were segmented manually. The convolutional neural network 3D U-Net was selected as the network architecture, and the network was trained on the training set to obtain its average accuracy. Automatic and manual segmentations of these three anatomical landmarks in the 5 test cases were compared for the two network models, and the accuracy for the facial nerve, labyrinth and ossicles was obtained separately. This accuracy was compared with that of another 3D convolutional neural network model, V-Net.  Results The 3D U-Net-plus and V-Net networks were each trained to automatically segment the facial nerve, labyrinth and ossicles in temporal CT. On the training samples, the mean error was 0.016 for the 3D U-Net-plus network and 0.035 for the V-Net network, a significant difference (P<0.05). The Dice similarity coefficients between automatic and manual segmentation for the labyrinth, ossicles and facial nerve were 0.618±0.107, 0.584±0.089 and 0.313±0.069 with the 3D U-Net-plus network, and 0.322±0.089, 0.176±0.100 and 0.128±0.077 with the V-Net network. Segmentation by the 3D U-Net-plus network was significantly better than that by the V-Net network (P<0.001).  Conclusion With the 3D U-Net-plus neural network, the ossicles, labyrinth and facial nerve in temporal CT can be recognized and automatically segmented quickly and effectively. The method performs better than the V-Net neural network and comes closer to manual segmentation. With optimization of the network structure and a larger number of learning samples, its results should approach those of manual segmentation even more closely.
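
The Dice similarity coefficient reported in the Results measures the overlap between an automatic and a manual segmentation: twice the number of shared voxels divided by the total number of voxels in both masks, so 1 means identical masks and 0 means no overlap. The following is a minimal sketch of that calculation only, not the authors' implementation. The function name `dice_coefficient`, the representation of a mask as a set of (x, y, z) voxel indices, and the handling of two empty masks are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, int, int]  # hypothetical (x, y, z) index of a foreground voxel


def dice_coefficient(pred: Iterable[Voxel], truth: Iterable[Voxel]) -> float:
    """Return Dice = 2|A ∩ B| / (|A| + |B|) for two sets of foreground voxels."""
    a, b = set(pred), set(truth)
    if not a and not b:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy masks (illustrative values, not data from the study).
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}

print(round(dice_coefficient(auto, manual), 3))
```

In this toy case the masks share 2 voxels and contain 4 and 3 voxels respectively, giving 2·2/(4+3) = 4/7 ≈ 0.571. Small structures such as the facial nerve are penalized heavily by this metric, because a few misplaced voxels make up a large fraction of the mask.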

Key words: Deep learning, Convolutional neural network, Temporal bone, Medical image recognition, Automatic segmentation, Human
