Comparative Performance of Deep Learning and Radiologists for the Diagnosis and Localization of Clinically Significant Prostate Cancer at MRI: A Systematic Review

Christian Roest*, Stefan J Fransen, Thomas C Kwee, Derya Yakar

*Corresponding author for this work

Research output: Contribution to journalReview articlepeer-review

7 Downloads (Pure)


BACKGROUND: Deep learning (DL)-based models have demonstrated an ability to automatically diagnose clinically significant prostate cancer (PCa) on MRI scans and are regularly reported to approach expert performance. The aim of this work was to systematically review the literature comparing deep learning (DL) systems to radiologists in order to evaluate the comparative performance of current state-of-the-art deep learning models and radiologists.

METHODS: This systematic review was conducted in accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Studies investigating DL models for diagnosing clinically significant (cs) PCa on MRI were included. The quality and risk of bias of each study were assessed using the checklist for AI in medical imaging (CLAIM) and QUADAS-2, respectively. Patient level and lesion-based diagnostic performance were separately evaluated by comparing the sensitivity achieved by DL and radiologists at an identical specificity and the false positives per patient, respectively.

RESULTS: The final selection consisted of eight studies with a combined 7337 patients. The median study quality with CLAIM was 74.1% (IQR: 70.6-77.6). DL achieved an identical patient-level performance to the radiologists for PI-RADS ≥ 3 (both 97.7%, SD = 2.1%). DL had a lower sensitivity for PI-RADS ≥ 4 (84.2% vs. 88.8%, p = 0.43). The sensitivity of DL for lesion localization was also between 2% and 12.5% lower than that of the radiologists.

CONCLUSIONS: DL models for the diagnosis of csPCa on MRI appear to approach the performance of experts but currently have a lower sensitivity compared to experienced radiologists. There is a need for studies with larger datasets and for validation on external data.

Original languageEnglish
Article number1490
Number of pages16
Issue number10
Publication statusPublished - 26-Sept-2022

Cite this