Abstract
In this paper, we examine a set of object interactions generated with VoxSim (Krishnaswamy and Pustejovsky 2016b), a 3D natural language simulation and visualization platform. These simulations all realize the natural language relations “touching” and “near” over a test set of various objects within a 3-dimensional world that interprets descriptions of motion events and renders their visual instantiations from the perspective of an embodied virtual agent. The object interactions were evaluated by human judges using Amazon Mechanical Turk. We examine some of the qualitative interpretations that humans provided of these computer-generated interpretations of underspecified relations, conditioned on the frame of reference (the agent’s point of view, or POV) and on object position relative to that POV. Through analysis of the human evaluations, we find that average evaluator satisfaction with many specifications for these relations appears to depend strongly on the relationship between the two objects and between the objects and the POV.