TY - JOUR
T1 - Playing to distraction
T2 - towards a robust training of CNN classifiers through visual explanation techniques
AU - Morales, David
AU - Talavera, Estefania
AU - Remeseiro, Beatriz
N1 - Funding Information:
Partial financial support was received from HAT.tec GmbH. This work has been financially supported in part by European Union ERDF funds, by the Spanish Ministry of Science and Innovation (research project PID2019-109238GB-C21), and by the Principado de Asturias Regional Government (research project IDI-2018-000176). The funders had no role in the study design, data collection, analysis, and preparation of the manuscript.
Funding Information:
We would like to thank the Center for Information Technology of the University of Groningen for their support and for providing access to the Peregrine high performance computing cluster.
Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
PY - 2021/12
Y1 - 2021/12
N2 - The field of deep learning is evolving in several directions, and there is still a need for more efficient training strategies. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms, which focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during the learning process by forcing it to focus not only on relevant regions but also on those that, a priori, are not so informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model in a real-world scenario for the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The results obtained indicate the suitability of our proposed training scheme for image classification, improving the robustness of the final model.
AB - The field of deep learning is evolving in several directions, and there is still a need for more efficient training strategies. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms, which focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during the learning process by forcing it to focus not only on relevant regions but also on those that, a priori, are not so informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model in a real-world scenario for the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The results obtained indicate the suitability of our proposed training scheme for image classification, improving the robustness of the final model.
KW - Convolutional neural networks
KW - Egocentric vision
KW - Fine-grained recognition
KW - Image classification
KW - Learning process
KW - Visual explanation techniques
UR - http://www.scopus.com/inward/record.url?scp=85110281253&partnerID=8YFLogxK
U2 - 10.1007/s00521-021-06282-2
DO - 10.1007/s00521-021-06282-2
M3 - Article
AN - SCOPUS:85110281253
SN - 0941-0643
VL - 33
SP - 16937
EP - 16949
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 24
ER -