Can Model Uncertainty Function as a Proxy for Multiple-Choice Question Item Difficulty?

Research output: Academic, peer-reviewed


Abstract

Estimating the difficulty of multiple-choice questions would be of great help to educators, who must spend substantial time creating and piloting stimuli for their tests, and to learners who want to practice. Supervised approaches to difficulty estimation have so far yielded mixed results. In this contribution we leverage an aspect of generative large models which might be seen as a weakness when answering questions, namely their uncertainty. Specifically, we explore correlations between two different metrics of model uncertainty and the actual student response distribution. While we observe some weak correlations, we also discover that the models’ behaviour differs between correct and wrong answers, and that correlations differ substantially across the question types included in our fine-grained, previously unused dataset of 451 questions from a Biopsychology course. In discussing our findings, we also suggest potential avenues to further leverage model uncertainty as an additional proxy for item difficulty.
Original language: English
Title: Proceedings of the 31st International Conference on Computational Linguistics
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Place of publication: Abu Dhabi, UAE
Publisher: Association for Computational Linguistics (ACL)
Pages: 11304-11316
Number of pages: 13
Electronic ISBN: 9798891761964
Status: Published - Jan 2025
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 to 24 Jan 2025

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
Volume: Part F206484-1
Print ISSN: 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Country/Region: United Arab Emirates
City: Abu Dhabi
Period: 19/01/2025 to 24/01/2025
