Action Categorisation in Multimodal Instructions

Ielka van der Sluis, Renate Vergeer, Gisela Redeker




    We present an exploratory study of the (semi-)automatic categorisation of actions in Dutch multimodal first aid instructions, where the actions needed to successfully execute the procedure in question are presented verbally and in pictures. We start with the categorisation of verbalised actions and expect that this will later facilitate the identification of those actions in the pictures, which is known to be hard. Comparisons of, and user-based experimentation with, the verbal and visual representations will allow us to determine the effectiveness of picture-text combinations and will eventually support the automatic generation of multimodal documents. We used Natural Language Processing tools to identify and categorise 2,388 verbs in a corpus of 78 multimodal instructions (MIs). We show that the main action structure of an instruction can be retrieved through verb identification using the Alpino parser followed by a manual selection operation. The selected main action verbs were subsequently generalised and categorised with the use of Cornetto, a lexical resource that combines a Dutch Wordnet and a Dutch Reference Lexicon. Results show that these tools are useful but also have limitations, which make human intervention essential to guide an accurate categorisation of actions in multimodal instructions.
    Original language: English
    Number of pages: 6
    Status: Published - Dec 2018
    Event: AREA - Annotation, Recognition and Evaluation of Actions: in conjunction with the 11th edition of the Language Resources and Evaluation Conference (LREC 2018) - Phoenix Seagaia Resort, Miyazaki, Japan
    Duration: 7 May 2018 → …


    Workshop: AREA - Annotation, Recognition and Evaluation of Actions
    Short title: AREA
    Period: 07/05/2018 → …
