Background Radiomics refers to the extraction of a large number of image biomarkers describing the tumor phenotype displayed in a medical image. Extracted from positron emission tomography (PET) images, radiomic features have shown diagnostic and prognostic value for several cancer types. However, a large number of radiomic features are nonreproducible or highly correlated with conventional PET metrics. Moreover, radiomic features used in the clinic should yield relevant information about tumor texture. In this study, we propose a framework to identify technically and clinically meaningful features and exemplify our results using a PET non-small cell lung cancer (NSCLC) dataset.
Materials and methods The proposed selection procedure consists of several steps. A priori, we included only features that were found to be reproducible in a multicenter setting. Next, we applied a voxel randomization step to identify features that reflect actual textural information, that is, features that yield a value significantly different from that of random texture in 90% of the patient scans. Finally, the remaining features were correlated with standard PET metrics to remove those that are redundant with common PET metrics. The selection procedure was performed for different volume ranges, that is, excluding lesions with smaller volumes, in order to assess the effect of tumor size on the results. To exemplify our procedure, the selected features were used to predict 1-yr survival in a dataset of 150 NSCLC patients. A predictive model was built using volume as the predictive factor for smaller lesions and one of the selected features as the predictive factor for larger lesions. The prediction accuracy of both models was compared with the prediction accuracy of volume alone.
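The voxel randomization step can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: the lesion is a small 2D grid of intensities, the texture feature is a toy neighbor contrast (`neighbor_contrast`, a hypothetical stand-in for a real radiomic feature), and "significantly different" is read as lying outside the central 95% of values obtained from shuffled copies of the lesion. All function names and the parameters `n_perm`, `alpha`, and `min_fraction` are illustrative.

```python
import random
from statistics import mean


def neighbor_contrast(img):
    """Toy texture feature: mean squared difference between horizontal neighbors."""
    diffs = [(row[i] - row[i + 1]) ** 2 for row in img for i in range(len(row) - 1)]
    return mean(diffs)


def shuffle_voxels(img, rng):
    """Return a copy of img with the same intensities at random positions."""
    flat = [v for row in img for v in row]
    rng.shuffle(flat)
    width = len(img[0])
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def differs_from_random(img, feature, n_perm=200, alpha=0.05, seed=0):
    """True if the feature value lies outside the central (1 - alpha) range
    of the values computed on voxel-shuffled copies of the same lesion."""
    rng = random.Random(seed)
    null = sorted(feature(shuffle_voxels(img, rng)) for _ in range(n_perm))
    lo = null[int(n_perm * alpha / 2)]
    hi = null[int(n_perm * (1 - alpha / 2)) - 1]
    value = feature(img)
    return value < lo or value > hi


def reflects_texture(lesions, feature, min_fraction=0.9):
    """Keep a feature if it differs from random texture in at least
    min_fraction of the scans (assumed reading of the 90% criterion)."""
    hits = sum(differs_from_random(img, feature) for img in lesions)
    return hits / len(lesions) >= min_fraction


# A smooth gradient has far lower contrast than any shuffled version of itself.
gradient = [list(range(8)) for _ in range(8)]
# A constant image is identical to all of its shuffled copies.
constant = [[5] * 8 for _ in range(8)]
```

Because shuffling preserves the intensity histogram, a feature that passes this check is responding to the spatial arrangement of voxels rather than to the intensity distribution alone.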
Results The number of selected features depended on the lesion sizes included in the analysis. When including the whole dataset, only two of the 19 features reflecting actual texture were not strongly correlated with conventional PET metrics. When excluding lesions smaller than 11.49 and 33.10 mL (25th and 50th percentiles of the dataset), four out of 27 features and 13 out of 29 features, respectively, remained after eliminating features highly correlated with standard PET metrics. When excluding lesions smaller than 103.9 mL (75th percentile), 33 out of 53 features remained. For larger lesions, some of these features outperformed volume in terms of classification accuracy (increase of 4-10%). Using volume as the predictor for smaller lesions and one of the selected features for larger lesions also improved the accuracy compared with volume only (increase from 72% to 76%).
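The elimination of features highly correlated with standard PET metrics can be sketched as a simple filter. The abstract specifies neither the correlation coefficient nor the cutoff, so the Spearman rank correlation and the threshold `max_abs_rho=0.7` used here are assumptions, and the feature names and values are made up for illustration.

```python
from statistics import mean


def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def non_redundant(features, conventional, max_abs_rho=0.7):
    """Names of features whose |rho| with every conventional metric
    stays below the (assumed) cutoff."""
    return [
        name
        for name, vals in features.items()
        if all(abs(spearman(vals, m)) < max_abs_rho for m in conventional.values())
    ]


# Illustrative values for six lesions (not data from the study).
conventional = {"volume": [1, 2, 3, 4, 5, 6]}
features = {
    "feature_a": [1, 4, 9, 16, 25, 36],  # monotone in volume, hence redundant
    "feature_b": [3, 1, 4, 1, 5, 2],     # only weakly related to volume
}
kept = non_redundant(features, conventional)
```

With these illustrative values, `feature_a` is dropped because it is a monotone function of volume, while `feature_b` is retained.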
Conclusion When performing radiomic analysis for smaller lesions, it should first be carefully investigated whether a textural feature reflects actual heterogeneity information. Next, verifying the absence of correlation with all conventional PET metrics is essential in order to assess the additional value of radiomic features. Radiomic analysis of lesions larger than 11.4 mL might provide information additional to conventional metrics while at the same time reflecting actual tumor texture. Using a combination of volume and one of the selected features for prediction shows promise for increasing the accuracy and reliability of a radiomic model.
- clinical value
- feature selection