TY - GEN
T1 - Synonym-Based Essay Generation and Augmentation for Robust Automatic Essay Scoring
AU - Tashu, Tsegaye Misikir
AU - Horváth, Tomáš
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Automatic essay scoring (AES) models based on neural networks (NN) have achieved considerable success. However, research has shown that NN-based AES models suffer from robustness issues: small changes in the input can easily change a model's output. We propose a keyword-based lexical substitution approach using BERT that generates new essays (adversarial samples) lexically similar to the original essays, in order to evaluate the robustness of AES models trained on the original data. To evaluate the proposed approach, we implemented three NN-based scoring models and trained them in two stages. First, we trained each model on the original data and evaluated its performance on both the original and the newly generated test sets to assess the impact of the adversarial samples on the model. Second, we trained the models on the original data augmented with the generated adversarial essays to obtain models robust to synonym-based adversarial attacks. Our experimental results showed that extracting the most important words from an essay and replacing them with lexically similar words, and using the generated adversarial samples for augmentation, can significantly improve the generalization of NN-based AES models. Our experiments also demonstrated that the proposed defense not only withstands adversarial attacks but also improves the performance of NN-based AES models.
KW - Adversarial attack
KW - Automatic essay scoring
KW - Data augmentation
UR - http://www.scopus.com/inward/record.url?scp=85144815181&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-21753-1_2
DO - 10.1007/978-3-031-21753-1_2
M3 - Conference contribution
AN - SCOPUS:85144815181
SN - 9783031217524
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 12
EP - 21
BT - Intelligent Data Engineering and Automated Learning – IDEAL 2022 - 23rd International Conference, IDEAL 2022, Proceedings
A2 - Yin, Hujun
A2 - Camacho, David
A2 - Tino, Peter
PB - Springer Science and Business Media Deutschland GmbH
T2 - 23rd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2022
Y2 - 24 November 2022 through 26 November 2022
ER -