Evaluation of a novel deep learning-based classifier for perifissural nodules

Daiwei Han, Marjolein Heuvelmans*, Mieneke Rook, Monique Dorrius, Luutsen van Houten, Noah Waterfield Price, Lyndsey C Pickup, Petr Novotny, Matthijs Oudkerk, Jerome Declerck, Fergus Gleeson, Peter van Ooijen, Rozemarijn Vliegenthart

*Bijbehorende auteur voor dit werk

Onderzoeksoutput: ArticleAcademicpeer review

53 Downloads (Pure)


OBJECTIVES: To evaluate the performance of a novel convolutional neural network (CNN) for the classification of typical perifissural nodules (PFN).

METHODS: Chest CT data from two centers in the UK and The Netherlands (1668 unique nodules, 1260 individuals) were collected. Pulmonary nodules were classified into subtypes, including "typical PFNs" on-site, and were reviewed by a central clinician. The dataset was divided into a training/cross-validation set of 1557 nodules (1103 individuals) and a test set of 196 nodules (158 individuals). For the test set, three radiologically trained readers classified the nodules into three nodule categories: typical PFN, atypical PFN, and non-PFN. The consensus of the three readers was used as reference to evaluate the performance of the PFN-CNN. Typical PFNs were considered as positive results, and atypical PFNs and non-PFNs were grouped as negative results. PFN-CNN performance was evaluated using the ROC curve, confusion matrix, and Cohen's kappa.

RESULTS: Internal validation yielded a mean AUC of 91.9% (95% CI 90.6-92.9) with 78.7% sensitivity and 90.4% specificity. For the test set, the reader consensus rated 45/196 (23%) of nodules as typical PFN. The classifier-reader agreement (k = 0.62-0.75) was similar to the inter-reader agreement (k = 0.64-0.79). Area under the ROC curve was 95.8% (95% CI 93.3-98.4), with a sensitivity of 95.6% (95% CI 84.9-99.5), and specificity of 88.1% (95% CI 81.8-92.8).

CONCLUSION: The PFN-CNN showed excellent performance in classifying typical PFNs. Its agreement with radiologically trained readers is within the range of inter-reader agreement. Thus, the CNN-based system has potential in clinical and screening settings to rule out perifissural nodules and increase reader efficiency.

KEY POINTS: • Agreement between the PFN-CNN and radiologically trained readers is within the range of inter-reader agreement. • The CNN model for the classification of typical PFNs achieved an AUC of 95.8% (95% CI 93.3-98.4) with 95.6% (95% CI 84.9-99.5) sensitivity and 88.1% (95% CI 81.8-92.8) specificity compared to the consensus of three readers.

Originele taal-2English
Pagina's (van-tot)4023-4030
Aantal pagina's8
TijdschriftEuropean Radiology
Vroegere onlinedatum2-dec.-2020
StatusPublished - jun.-2021

Citeer dit