Abstract
Background and purpose: Head and neck (HN) radiotherapy can benefit from automatic delineation of tumor and
surrounding organs because of the complex anatomy and the regular need for adaptation. The aim of this study
was to assess the performance of a commercially available deep learning contouring (DLC) model on an external
validation set.
Materials and methods: The CT-based DLC model, trained at the University Medical Center Groningen (UMCG),
was applied to an independent set of 58 patients from the Radboud University Medical Center (RUMC). DLC
results were compared to the RUMC manual reference using the Dice similarity coefficient (DSC) and 95th
percentile of Hausdorff distance (HD95). Craniocaudal spatial information was added by calculating binned
measures. In addition, a qualitative evaluation compared the acceptance of manual and DLC contours by observers
from both institutions (RUMC and UMCG).
Results: Good correspondence was shown for the mandible (DSC 0.90, HD95 3.6 mm). Performance was reasonable
for the glandular OARs, brainstem and oral cavity (DSC 0.78–0.85, HD95 3.7–7.3 mm). The other
aerodigestive tract OARs showed only moderate agreement (DSC 0.53–0.65, HD95 around 9 mm). The binned
measures displayed the largest deviations caudally and/or cranially.
Conclusions: This study demonstrates that the DLC model can provide a reasonable starting point for delineation
when applied to an independent patient cohort. The qualitative evaluation did not reveal large differences in the
interpretation of contouring guidelines between RUMC and UMCG observers.
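
For readers unfamiliar with the quantitative measures named in the abstract, the following is a minimal sketch of how DSC, HD95 and a craniocaudally binned DSC could be computed on binary 3D masks. It is not the authors' implementation. The function names, the voxel-based surface extraction, the choice of HD95 as the larger of the two directed 95th-percentile distances, and the equal-width binning along the first (slice) axis are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient of two binary masks (NaN if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return float("nan")
    return 2.0 * np.logical_and(a, b).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch the background.

    Assumption: a voxel is on the surface if any of its face neighbours
    is background (or lies outside the array).
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    surface = mask & ~interior[core]
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(a, b, spacing):
    """95th-percentile Hausdorff distance in the units of `spacing`.

    Assumption: the larger of the two directed 95th percentiles is reported.
    Brute-force pairwise distances; suitable for small structures only.
    """
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        return float("nan")
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))


def binned_dice(a, b, n_bins=3):
    """DSC per craniocaudal bin, assuming axis 0 is the slice direction.

    Assumption: the slice range covered by either 3D mask is split into
    `n_bins` bins of (nearly) equal width.
    """
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    slices = np.flatnonzero((a | b).any(axis=(1, 2)))
    if slices.size == 0:
        return []
    edges = np.linspace(slices[0], slices[-1] + 1, n_bins + 1).round().astype(int)
    return [dice(a[lo:hi], b[lo:hi]) for lo, hi in zip(edges[:-1], edges[1:])]
```

Calling `dice(reference, dlc)`, `hd95(reference, dlc, spacing=(3.0, 1.0, 1.0))` and `binned_dice(reference, dlc)` on a pair of masks gives one value per structure, plus a per-bin profile that shows whether disagreement concentrates at the cranial or caudal end. The spacing tuple here is a hypothetical example, not a value from the study.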
Original language | English |
---|---|
Pages (from-to) | 8-15 |
Number of pages | 8 |
Journal | Physics and Imaging in Radiation Oncology |
Volume | 15 |
Early online date | 10-Jul-2020 |
DOIs | |
Publication status | Published - 2020 |