Improving automatic delineation for head and neck organs at risk by Deep Learning Contouring

Lisanne V van Dijk*, Lisa Van den Bosch, Paul Aljabar, Devis Peressutti, Stefan Both, Roel J H M Steenbakkers, Johannes A Langendijk, Mark J Gooding, Charlotte L Brouwer

*Corresponding author for this work

Research output: Academic, peer-reviewed

102 Citations (Scopus)
220 Downloads (Pure)


INTRODUCTION: Adequate head and neck (HN) organ-at-risk (OAR) delineation is crucial for HN radiotherapy and for investigating the relationships between radiation dose to OARs and radiation-induced side effects. The automatic contouring algorithms that are currently in clinical use, such as atlas-based contouring (ABAS), leave room for improvement. The aim of this study was to use a comprehensive evaluation methodology to investigate the performance of HN OAR auto-contouring when using deep learning contouring (DLC), compared to ABAS.

METHODS: The DLC neural network was trained on 589 HN cancer patients. DLC was compared to ABAS on an independent validation cohort of 104 patients who had also been manually contoured. For each of the 22 OAR contours (glandular, upper digestive tract and central nervous system (CNS)-related structures), the Dice similarity coefficient (DICE) and the absolute mean and maximum dose differences (|Δmean-dose| and |Δmax-dose|) were obtained as performance measures. For a subset of 7 OARs, contouring time, inter-observer variation and subjective judgement were also evaluated.
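
The three quantitative measures named above can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not describe. It assumes each contour is available as a flattened binary voxel mask and the dose as a matching flattened list of values in Gy; the function names `dice`, `dose_stats` and `abs_dose_differences` and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def dice(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) of two binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2.0 * overlap / (size_a + size_b)


def dose_stats(dose: Sequence[float], mask: Sequence[int]) -> Tuple[float, float]:
    """Mean and maximum dose over the voxels inside a mask."""
    inside = [d for d, m in zip(dose, mask) if m]
    if not inside:
        raise ValueError("mask contains no voxels")
    return sum(inside) / len(inside), max(inside)


def abs_dose_differences(
    dose: Sequence[float], auto_mask: Sequence[int], manual_mask: Sequence[int]
) -> Tuple[float, float]:
    """Absolute mean and max dose differences between two contours of one OAR."""
    mean_auto, max_auto = dose_stats(dose, auto_mask)
    mean_manual, max_manual = dose_stats(dose, manual_mask)
    return abs(mean_auto - mean_manual), abs(max_auto - max_manual)


# Toy example: six voxels, made-up dose values in Gy (not data from the study).
manual = [0, 1, 1, 1, 0, 0]
auto = [0, 0, 1, 1, 1, 0]
dose = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

print(dice(auto, manual))
print(abs_dose_differences(dose, auto, manual))
```

In this toy case the two masks share two of their three voxels each, so the overlap measure is 2/3, and the one-voxel shift towards higher dose moves both the mean and the maximum dose by 10 Gy. A real pipeline would work on 3D arrays resampled to the dose grid, which this sketch leaves out.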

RESULTS: DLC resulted in equal or significantly improved quantitative performance measures in 19 out of 22 OARs, compared to ABAS (DICE/|Δmean-dose|/|Δmax-dose|: 0.59/4.2/4.1 Gy (ABAS); 0.74/1.1/0.8 Gy (DLC)). The improvements were mainly for the glandular and upper digestive tract OARs. DLC significantly reduced the delineation time for the inexperienced observer. The subjective evaluation showed that DLC contours were preferred over ABAS contours more often overall, were considered more precise, and were more often mistaken for manual contours. Manual contours still outperformed both DLC and ABAS; however, DLC results were within or bordering the inter-observer variability for the manually edited contours in this cohort.

CONCLUSION: DLC, trained on a large HN cancer patient cohort, outperformed ABAS for the majority of HN OARs.

Original language: English
Pages (from-to): 115-123
Number of pages: 9
Journal: Radiotherapy and Oncology
Early online date: 22 Oct 2019
Status: Published - Jan 2020