Abstract
Existing datasets for causality identification in argumentative texts have several limitations, such as the type of input text (e.g., only claims), causality type (e.g., only positive), and the linguistic patterns investigated (e.g., only verb connectives). To resolve these limitations, we build the Webis-Causality-23 dataset, with sophisticated inputs (all units from arguments), a balanced distribution of causality types, and a larger number of linguistic patterns denoting causality. The dataset contains 1485 examples derived by combining the two paradigms of distant supervision and uncertainty sampling to identify diverse, high-quality samples of causality relations, and annotate them in a cost-effective manner.
Original language | English |
---|---|
Title of host publication | Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
Editors | Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani |
Publisher | Association for Computational Linguistics, ACL Anthology |
Pages | 349-354 |
Number of pages | 6 |
Publication status | Published - 1-Sept-2023 |
Event | 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Prague, Czech Republic. Duration: 11-Sept-2023 → 15-Sept-2023 |
Conference
Conference | 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
---|---|
Country/Territory | Czech Republic |
City | Prague |
Period | 11/09/2023 → 15/09/2023 |