Accuracy and Precision of Mandible Segmentation and Its Clinical Implications: Virtual Reality, Desktop Screen and Artificial Intelligence

Lennart Johannes Gruber, Jan Egger, Andrea Bönsch, Joep Kraeima, Max Ulbrich, Vincent van den Bosch, Ila Motmaen, Caroline Wilpert, Mark Ooms, Peter Isfort, Frank Hölzle, Behrus Puladi*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



Objective: 3D modeling is a major challenge in computer-assisted surgery (CAS). Manual segmentation, the gold standard, is tedious, time consuming, and particularly challenging for the mandible, while artificial intelligence (AI)-based segmentation is a promising and time-saving alternative. However, little is known about the clinical implications of the various segmentation methods.

Method: In this cross-over study, ten mandibles were segmented by five experts in virtual reality (VR) and on a desktop screen (DS), and by five AI models. The exported mandible models were evaluated using metrics, a public reference (PUBDS), and blinded assessments by two radiologists.

Results: Average segmentation-to-volume accuracy (1 = poor, 5 = perfect) was comparable for human segmentation (VR: 4.56; DS: 4.33; PUBDS: 4.55) and significantly better than for AI-based segmentation (AI: 3.80), while the average segmentation-to-segmentation accuracy revealed that DS (91.4 %/0.37 mm [Dice coefficient/average Hausdorff distance]) was more comparable to PUBDS than to VR (90.1 %/0.44 mm). The precision of VR (96.8 %/0.14 mm) and DS (96.6 %/0.15 mm) was superior to that of PUBDS (94.1 %/0.21 mm) and the AI method (89.2 %/0.60 mm). Among the manual segmentation methods, VR was significantly faster than DS and PUBDS (p = 0.007/< 0.001), whereas the AI method is not time sensitive owing to its possible hardware scalability.

Conclusion: Accuracy and precision of mandible segmentation depend primarily on CT quality and anatomical site, which should be considered in clinical applications and in the generation of AI training data, and which could negatively impact CAS. Although current AI models have perfect intra-model reliability, they show higher inter-model variability and produce invalid outliers, so human review remains necessary. In summary, VR-based manual segmentation showed high accuracy and precision overall while saving time, and its good usability makes it preferable to DS.
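
The two segmentation-to-segmentation metrics reported above, the Dice coefficient and the average Hausdorff distance, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names, the `spacing` parameter, and the choice to compare all foreground voxels (rather than extracted surfaces) are assumptions made for illustration, and the average Hausdorff distance is taken here as the mean of the two directed average nearest-neighbour distances, which is one of several definitions in use.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks: 2 * |A and B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Convention chosen here (an assumption): two empty masks agree fully.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def average_hausdorff_distance(mask_a, mask_b, spacing=None):
    """Symmetric average nearest-neighbour distance between two binary masks.

    Assumed definition: the mean of the two directed average distances.
    `spacing` is the voxel size per axis (e.g. in mm); it defaults to 1.
    The pairwise distance matrix is built by brute force, so this is only
    suitable for small masks.
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if spacing is None:
        spacing = np.ones(a.ndim)
    pts_a = np.argwhere(a) * np.asarray(spacing, dtype=float)
    pts_b = np.argwhere(b) * np.asarray(spacing, dtype=float)
    if len(pts_a) == 0 or len(pts_b) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    # dists[i, j] is the Euclidean distance between point i of A and point j of B.
    dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).mean()
    b_to_a = dists.min(axis=0).mean()
    return 0.5 * (a_to_b + b_to_a)


if __name__ == "__main__":
    # Toy example: a 3x3x3 cube and the same cube shifted by one voxel.
    reference = np.zeros((5, 5, 5), dtype=bool)
    reference[1:4, 1:4, 1:4] = True
    shifted = np.zeros((5, 5, 5), dtype=bool)
    shifted[2:5, 1:4, 1:4] = True
    print(f"Dice: {dice_coefficient(reference, shifted):.3f}")
    print(f"Average Hausdorff distance: {average_hausdorff_distance(reference, shifted):.3f}")
```

In the toy example, the one-voxel shift leaves 18 of 27 voxels overlapping, so the Dice coefficient is 2/3, and one third of the voxels in each mask lie one voxel away from the other mask, so the average distance is 1/3 of a voxel. Passing the CT voxel size as `spacing` would express the distance in millimetres.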

Original language: English
Article number: 122275
Number of pages: 12
Journal: Expert Systems with Applications
Publication status: Published - 1-Apr-2024


  • Artificial intelligence
  • Computer-assisted surgery
  • Oral and maxillofacial surgery
  • Segmentation
  • Virtual reality

