TY - JOUR
T1 - Three-Dimensional Deep Learning Normal Tissue Complication Probability Model to Predict Late Xerostomia in Patients With Head and Neck Cancer
AU - Chu, Hung
AU - de Vette, Suzanne P M
AU - Neh, Hendrike
AU - Sijtsema, Nanna M
AU - Steenbakkers, Roel J H M
AU - Moreno, Amy
AU - Langendijk, Johannes A
AU - van Ooijen, Peter M A
AU - Fuller, Clifton D
AU - van Dijk, Lisanne V
N1 - Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - PURPOSE: Conventional normal tissue complication probability (NTCP) models for patients with head and neck cancer are typically based on single-value variables, which, for radiation-induced xerostomia, are baseline xerostomia and mean salivary gland doses. This study aimed to improve the prediction of late xerostomia by using 3-dimensional information from radiation dose distributions, computed tomography imaging, organ-at-risk segmentations, and clinical variables with deep learning (DL). METHODS AND MATERIALS: An international cohort of 1208 patients with head and neck cancer from 2 institutes was used to train and twice validate DL models (deep convolutional neural network, EfficientNet-v2, and ResNet) with 3-dimensional dose distribution, computed tomography scan, organ-at-risk segmentations, baseline xerostomia score, sex, and age as input. The NTCP endpoint was moderate-to-severe xerostomia 12 months postradiation therapy. The DL models' prediction performance was compared with a reference model: a recently published xerostomia NTCP model that used baseline xerostomia score and mean salivary gland doses as input. Attention maps were created to visualize the focus regions of the DL predictions. Transfer learning was conducted to improve the DL model performance on the external validation set. RESULTS: All DL-based NTCP models showed better performance (area under the receiver operating characteristic curve [AUC] test, 0.78-0.79) than the reference NTCP model (AUC test, 0.74) in the independent test. Attention maps showed that the DL model focused on the major salivary glands, particularly the stem cell-rich region of the parotid glands. DL models obtained lower external validation performance (AUC external, 0.63) than the reference model (AUC external, 0.66). After transfer learning on a small external subset, the DL model (AUC tl, external, 0.66) performed better than the reference model (AUC tl, external, 0.64). CONCLUSION: DL-based NTCP models performed better than the reference model when validated in data from the same institute. Improved performance in the external data set was achieved with transfer learning, demonstrating the need for multicenter training data to realize generalizable DL-based NTCP models.
AB - PURPOSE: Conventional normal tissue complication probability (NTCP) models for patients with head and neck cancer are typically based on single-value variables, which, for radiation-induced xerostomia, are baseline xerostomia and mean salivary gland doses. This study aimed to improve the prediction of late xerostomia by using 3-dimensional information from radiation dose distributions, computed tomography imaging, organ-at-risk segmentations, and clinical variables with deep learning (DL). METHODS AND MATERIALS: An international cohort of 1208 patients with head and neck cancer from 2 institutes was used to train and twice validate DL models (deep convolutional neural network, EfficientNet-v2, and ResNet) with 3-dimensional dose distribution, computed tomography scan, organ-at-risk segmentations, baseline xerostomia score, sex, and age as input. The NTCP endpoint was moderate-to-severe xerostomia 12 months postradiation therapy. The DL models' prediction performance was compared with a reference model: a recently published xerostomia NTCP model that used baseline xerostomia score and mean salivary gland doses as input. Attention maps were created to visualize the focus regions of the DL predictions. Transfer learning was conducted to improve the DL model performance on the external validation set. RESULTS: All DL-based NTCP models showed better performance (area under the receiver operating characteristic curve [AUC] test, 0.78-0.79) than the reference NTCP model (AUC test, 0.74) in the independent test. Attention maps showed that the DL model focused on the major salivary glands, particularly the stem cell-rich region of the parotid glands. DL models obtained lower external validation performance (AUC external, 0.63) than the reference model (AUC external, 0.66). After transfer learning on a small external subset, the DL model (AUC tl, external, 0.66) performed better than the reference model (AUC tl, external, 0.64). CONCLUSION: DL-based NTCP models performed better than the reference model when validated in data from the same institute. Improved performance in the external data set was achieved with transfer learning, demonstrating the need for multicenter training data to realize generalizable DL-based NTCP models.
U2 - 10.1016/j.ijrobp.2024.07.2334
DO - 10.1016/j.ijrobp.2024.07.2334
M3 - Article
C2 - 39147208
SN - 0360-3016
VL - 121
SP - 269
EP - 280
JO - International Journal of Radiation Oncology, Biology, Physics
JF - International Journal of Radiation Oncology, Biology, Physics
IS - 1
ER -