TY - JOUR
T1 - Investigating interoperable event corpora
T2 - limitations of reusability of resources and portability of models
AU - Caselli, Tommaso
AU - Bos, Johan
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/9
Y1 - 2023/9
AB - Studies on the applicability of heterogeneous semantically interoperable corpora are rare. We investigate to what extent reusability (both of systems and of annotations) is entailed by corpora whose interoperability is based on compliance with standards. In particular, we look at event detection in English texts, supported by the ISO-TimeML annotation scheme. We run two sets of experiments using a common neural network architecture and extensively evaluate our results in both in-distribution and out-of-distribution settings. In all experimental settings, systems obtain state-of-the-art results on the in-distribution data and underperform on the out-of-distribution data, which sets limits on the benefits of semantically interoperable corpora. By means of a detailed error analysis, we show that while compliance with a standard guarantees semantic interoperability, it is only a necessary condition for reusability, with factors such as differences in the quality of the annotations having a much stronger impact.
KW - Event detection
KW - Portability of systems
KW - Reusability of data
KW - Semantic interoperability
KW - Standards
UR - http://www.scopus.com/inward/record.url?scp=85148944206&partnerID=8YFLogxK
U2 - 10.1007/s10579-023-09643-6
DO - 10.1007/s10579-023-09643-6
M3 - Article
AN - SCOPUS:85148944206
SN - 1574-020X
VL - 57
SP - 1107
EP - 1137
JO - Language Resources and Evaluation
JF - Language Resources and Evaluation
ER -