The reliability of single task assessment in longitudinal L2 writing research

May Y. Wu*, Rasmus Steinkrauss, Wander Lowie

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

4 Citations (Scopus)
156 Downloads (Pure)

Abstract

Single task writing assessments used in longitudinal studies have raised concerns regarding their reliability. By means of Generalizability Theory (GT), this study investigated the reliability of L2 writing assessments scored on different CAF measures, focusing on a) the reliability of single task writing assessments and on the effects of b) task topics and c) task-taking occasions on assessment reliability. We investigated analytic quantitative scores obtained from five CAF measures through a 1-day dataset and a 21-day dataset, consisting of 90 essays from 18 Chinese learners of English who did not follow any formal language instruction during the investigation. The results show that although some CAF scores (e.g., fluency) of single task assessments have distinctly higher reliability than other scores, the general conclusion is that single task assessments are not reliable from a GT perspective. Task topic introduces some score variance to the assessment result, yet this amount of variance differs profoundly between the CAF measures due to the functional variability, which corresponds with Complex Dynamic Systems Theory assumptions suggesting subsystems of an L2 do not develop synchronously. Finally, occasion, i.e., whether two samples were written on the same day or within 21 days, barely introduces score variance.
Original language: English
Article number: 100950
Number of pages: 13
Journal: Journal of Second Language Writing
Volume: 59
Early online date: 5-Dec-2022
Publication status: Published - Mar-2023
