Abstract
Purpose: Artificial intelligence (AI) could reduce the computed tomography (CT)-reading workload in lung cancer screening if used as a first reader to rule out negative CT scans at baseline. Evidence supporting AI performance against gold-standard lung cancer outcomes is lacking. This study validated the performance of commercially available AI software in the UK Lung Cancer Screening (UKLS) trial dataset, comparing it with human reads and histological lung cancer outcomes, and estimated the CT-reading workload reduction.
Methods: 1252 UKLS baseline CT scans were evaluated independently by AI and human readers. AI performance was evaluated on two levels. Firstly, AI classification and individual reads were compared to an EU reference standard (based on the NELSON2.0-European Position Statement) determined by a European expert panel blinded to individual results. A positive misclassification was defined as a nodule-positive read (≥ 100 mm³) when the expert read showed no nodules or only nodules < 100 mm³; a negative misclassification was defined as a nodule-negative read when the expert read showed an indeterminate or positive finding. Secondly, AI nodule classification was compared to gold-standard histological lung cancer outcomes. CT-reading workload reduction was calculated from the AI-negative CT scans, assuming AI was used as first reader.
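The two misclassification definitions above can be written as a small decision rule. The following is a minimal sketch, not the study's actual software: it assumes each read is summarised by the volume of its largest nodule (`None` when no nodule was found) and that a read counts as negative whenever that volume is absent or below the 100 mm³ threshold. The function and variable names are hypothetical.

```python
from typing import Optional

# Volume cut-off quoted in the abstract (mm^3).
THRESHOLD_MM3 = 100.0


def is_negative(largest_volume_mm3: Optional[float]) -> bool:
    """Assumption: a read is negative if it has no nodule or only nodules < 100 mm^3."""
    return largest_volume_mm3 is None or largest_volume_mm3 < THRESHOLD_MM3


def misclassification(read_volume_mm3: Optional[float],
                      expert_volume_mm3: Optional[float]) -> str:
    """Compare one read (AI or human) with the expert-panel read."""
    read_negative = is_negative(read_volume_mm3)
    expert_negative = is_negative(expert_volume_mm3)
    if not read_negative and expert_negative:
        # Read reports a nodule >= 100 mm^3, expert read has none or only smaller ones.
        return "positive misclassification"
    if read_negative and not expert_negative:
        # Read is negative, expert read is indeterminate or positive.
        return "negative misclassification"
    return "agreement"
```

Under these assumptions, a read is flagged only when it and the expert read fall on opposite sides of the 100 mm³ threshold; the distinction between indeterminate and positive expert findings does not affect the outcome.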
Results: The expert panel reference standard reported 815 (65 %) negative and 437 (35 %) indeterminate/positive CT scans in the dataset of 1252 UKLS participants. Compared to the reference standard, AI resulted in fewer misclassifications than human reads, with a negative predictive value (NPV) of 92·0 % (90·2 %-95·3 %). Compared to the gold standard, AI detected all 31 baseline-round lung cancers but classified one as negative because of the 100 mm³ threshold, giving an NPV of 99·8 % (99·0 %-99·9 %). The estimated maximum CT-reading workload reduction was 79 %.
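For readers less familiar with the two summary measures, the sketch below shows how a negative predictive value and a first-reader workload reduction are conventionally computed. It assumes the standard definition NPV = TN / (TN + FN) and takes workload reduction to be the share of scans the AI rules out as negative; the abstract does not give the underlying counts, so the numbers in the usage lines are purely illustrative and are not study data.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN): the share of negative calls that are truly negative."""
    total_negative_calls = true_negatives + false_negatives
    if total_negative_calls == 0:
        raise ValueError("NPV is undefined when there are no negative calls")
    return true_negatives / total_negative_calls


def workload_reduction(ai_negative_scans: int, total_scans: int) -> float:
    """Assumption: scans ruled out by the AI first reader need no human read."""
    if total_scans <= 0:
        raise ValueError("total_scans must be positive")
    return ai_negative_scans / total_scans


# Illustrative counts only (hypothetical, not taken from the study).
example_npv = negative_predictive_value(true_negatives=90, false_negatives=10)
example_reduction = workload_reduction(ai_negative_scans=50, total_scans=200)
```

A high NPV matters most in a rule-out setting, because every false negative is a scan that would never reach a human reader.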
Conclusion: Implementing AI as first reader to rule out negative CT scans shows considerable potential to reduce CT-reading workload and does not lead to missed lung cancers.
Original language | English |
---|---|
Article number | 115324 |
Number of pages | 9 |
Journal | European Journal of Cancer |
Volume | 220 |
Publication status | Published - 2-May-2025 |