Generating Referring Expressions in a Multimodal Context

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


    Abstract

In this paper we present an algorithm for the generation of referring expressions in a multimodal setting. The algorithm is based on empirical studies of how humans refer to objects in a shared workspace, and it has three main ingredients. First, deictic pointing gestures are added, where the decision to point is determined by two factors: the effort of pointing (measured in terms of the distance to and size of the target object) and the effort required for a full linguistic description (measured in terms of the number of required properties and relations). Second, the algorithm explicitly keeps track of the current focus of attention, such that objects closely related to the most recently mentioned object are more prominent than objects farther away; to decide which objects are ‘closely related’ we use the concept of perceptual grouping. Finally, each object in the domain is assigned a three-dimensional salience weight indicating whether it is linguistically and/or inherently salient and whether it is part of the current focus of attention. The resulting algorithm is capable of generating a variety of referring expressions, where the kind of NP is co-determined by the accessibility of the target object (in terms of salience), the presence or absence of a relatum, and the possible inclusion of a pointing gesture.
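The abstract only sketches how the pointing decision and the salience bookkeeping interact. The Python fragment below is a minimal illustrative sketch of that trade-off, assuming a simple distance-over-size cost for pointing and a property-and-relation count for the linguistic description; all names, weights, and formulas are assumptions made for illustration and are not taken from the paper itself.

from dataclasses import dataclass

# Hypothetical sketch of the pointing decision described in the abstract.
# The cost functions and the weighting of relations are illustrative
# assumptions, not the paper's actual definitions.

@dataclass
class Salience:
    linguistic: bool   # recently mentioned in the discourse
    inherent: bool     # perceptually prominent on its own
    in_focus: bool     # member of the current focus of attention

@dataclass
class TargetObject:
    distance: float    # distance from the speaker/pointer to the object
    size: float        # apparent size of the object
    n_properties: int  # properties a full linguistic description would need
    n_relations: int   # relations (e.g. to a relatum) it would need
    salience: Salience

def pointing_cost(obj: TargetObject) -> float:
    """Effort of a deictic gesture: grows with distance, shrinks with size."""
    return obj.distance / max(obj.size, 1e-6)

def description_cost(obj: TargetObject) -> float:
    """Effort of a purely linguistic description: grows with the number of
    properties and relations needed to single the object out."""
    return obj.n_properties + 2.0 * obj.n_relations  # relations weighted heavier (assumption)

def should_point(obj: TargetObject) -> bool:
    """Point when the gesture is cheaper than a full description, unless the
    object is already highly accessible (in focus and linguistically salient),
    in which case a reduced NP such as a pronoun may suffice."""
    if obj.salience.in_focus and obj.salience.linguistic:
        return False
    return pointing_cost(obj) < description_cost(obj)

# Example: a large nearby object that would need several properties and a
# relation to describe linguistically.
obj = TargetObject(distance=0.3, size=0.5, n_properties=3, n_relations=1,
                   salience=Salience(linguistic=False, inherent=False, in_focus=True))
print(should_point(obj))  # True: the gesture is cheaper than the description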
Original language: English
Title of host publication: Selected Papers of the 11th CLIN Meeting
Place of publication: Utrecht, Netherlands
Publisher: Rodopi
Pages: 158-176
Number of pages: 19
Publication status: Published - 2001
