TY - JOUR
T1 - Evaluation of deep learning-based multiparametric MRI oropharyngeal primary tumor auto-segmentation and investigation of input channel effects
T2 - Results from a prospective imaging registry
AU - Wahid, Kareem A
AU - Ahmed, Sara
AU - He, Renjie
AU - van Dijk, Lisanne V
AU - Teuwen, Jonas
AU - McDonald, Brigid A
AU - Salama, Vivian
AU - Mohamed, Abdallah S R
AU - Salzillo, Travis
AU - Dede, Cem
AU - Taku, Nicolette
AU - Lai, Stephen Y
AU - Fuller, Clifton D
AU - Naser, Mohamed A
N1 - © 2021 The Authors.
PY - 2022/1
Y1 - 2022/1
N2 - BACKGROUND/PURPOSE: Oropharyngeal cancer (OPC) primary gross tumor volume (GTVp) segmentation is crucial for radiotherapy. Multiparametric MRI (mpMRI) is increasingly used for OPC adaptive radiotherapy but relies on manual segmentation. Therefore, we constructed mpMRI deep learning (DL) OPC GTVp auto-segmentation models and determined the impact of input channels on segmentation performance. MATERIALS/METHODS: GTVp ground truth segmentations were manually generated for 30 OPC patients from a clinical trial. We evaluated five mpMRI input channels (T2, T1, ADC, Ktrans, Ve). 3D Residual U-net models were developed and assessed using leave-one-out cross-validation. A baseline T2 model was compared to mpMRI models (T2 + T1, T2 + ADC, T2 + Ktrans, T2 + Ve, all five channels [ALL]) primarily using the Dice similarity coefficient (DSC). False-negative DSC (FND), false-positive DSC, sensitivity, positive predictive value, surface DSC, Hausdorff distance (HD), 95% HD, and mean surface distance were also assessed. For the best model, ground truth and DL-generated segmentations were compared through a blinded Turing test using three physician observers. RESULTS: Models yielded mean DSCs from 0.71 ± 0.12 (ALL) to 0.73 ± 0.12 (T2 + T1). Compared to the T2 model, performance was significantly improved for FND, sensitivity, surface DSC, HD, and 95% HD for the T2 + T1 model (p < 0.05) and for FND for the T2 + Ve and ALL models (p < 0.05). No model demonstrated significant correlations between tumor size and DSC (p > 0.05). Most models demonstrated significant correlations between tumor size and HD or surface DSC (p < 0.05), except those that included ADC or Ve as input channels (p > 0.05). On average, there were no significant differences between ground truth and DL-generated segmentations for all observers (p > 0.05). CONCLUSION: DL using mpMRI provides reasonably accurate segmentations of OPC GTVp that may be comparable to ground truth segmentations generated by clinical experts. Incorporating additional mpMRI channels may increase the performance of FND, sensitivity, surface DSC, HD, and 95% HD, and improve model robustness to tumor size.
AB - BACKGROUND/PURPOSE: Oropharyngeal cancer (OPC) primary gross tumor volume (GTVp) segmentation is crucial for radiotherapy. Multiparametric MRI (mpMRI) is increasingly used for OPC adaptive radiotherapy but relies on manual segmentation. Therefore, we constructed mpMRI deep learning (DL) OPC GTVp auto-segmentation models and determined the impact of input channels on segmentation performance. MATERIALS/METHODS: GTVp ground truth segmentations were manually generated for 30 OPC patients from a clinical trial. We evaluated five mpMRI input channels (T2, T1, ADC, Ktrans, Ve). 3D Residual U-net models were developed and assessed using leave-one-out cross-validation. A baseline T2 model was compared to mpMRI models (T2 + T1, T2 + ADC, T2 + Ktrans, T2 + Ve, all five channels [ALL]) primarily using the Dice similarity coefficient (DSC). False-negative DSC (FND), false-positive DSC, sensitivity, positive predictive value, surface DSC, Hausdorff distance (HD), 95% HD, and mean surface distance were also assessed. For the best model, ground truth and DL-generated segmentations were compared through a blinded Turing test using three physician observers. RESULTS: Models yielded mean DSCs from 0.71 ± 0.12 (ALL) to 0.73 ± 0.12 (T2 + T1). Compared to the T2 model, performance was significantly improved for FND, sensitivity, surface DSC, HD, and 95% HD for the T2 + T1 model (p < 0.05) and for FND for the T2 + Ve and ALL models (p < 0.05). No model demonstrated significant correlations between tumor size and DSC (p > 0.05). Most models demonstrated significant correlations between tumor size and HD or surface DSC (p < 0.05), except those that included ADC or Ve as input channels (p > 0.05). On average, there were no significant differences between ground truth and DL-generated segmentations for all observers (p > 0.05). CONCLUSION: DL using mpMRI provides reasonably accurate segmentations of OPC GTVp that may be comparable to ground truth segmentations generated by clinical experts. Incorporating additional mpMRI channels may increase the performance of FND, sensitivity, surface DSC, HD, and 95% HD, and improve model robustness to tumor size.
UR - https://www.scopus.com/pages/publications/85122506275
U2 - 10.1016/j.ctro.2021.10.003
DO - 10.1016/j.ctro.2021.10.003
M3 - Article
C2 - 34765748
SN - 2405-6308
VL - 32
SP - 6
EP - 14
JO - Clinical and Translational Radiation Oncology
JF - Clinical and Translational Radiation Oncology
ER -