A collaboration between the USC Institute for Creative Technologies and the USC Information Sciences Institute
In 1944, Fritz Heider and Marianne Simmel published the results of their study of behavior explanation, now a classic work in the field of social psychology. Subjects were shown a short film depicting the motion of two triangles and a circle, and then asked to describe what happened. The subjects responded with creative narratives that anthropomorphized the moving objects, ascribing to them humanlike goals, plans, beliefs, and emotions. This study informed Heider's later formulation of attribution theory, and highlighted the central role of people's commonsense theories of psychology in explanations of behavior.
Can we build a computer that interprets and narrates the movement of abstract shapes as if they were intentional agents, just as the subjects in Heider and Simmel's experiments did?
In this project, we implement a computational model of behavior explanation using a large-scale logical formalization of commonsense psychology. This formalization, written as axioms in first-order predicate calculus, encodes the inferential relationships between commonsense concepts of goals, plans, beliefs, emotions, predications, explanations, decisions, and memories, among hundreds of others. Rather than using traditional theorem proving methods, we operationalize the behavior explanation process as a search for the most probable set of assumptions that logically entail the observation.
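To make the abductive search concrete, here is a minimal sketch of cost-based abduction over Horn-style rules. All predicates, rule weights, and priors below are illustrative placeholders, not the project's actual axioms: each rule maps an observation to a set of assumable premises, and the best explanation is the premise set with the highest probability (rule weight times the priors of its assumptions).

```python
# Hypothetical knowledge base: each observation maps to candidate
# explanations, given as (set of assumable premises, rule weight).
# Predicates and numbers are illustrative only.
RULES = {
    "moves_toward(T1, door)": [
        ({"goal(T1, exit_room)", "believes(T1, door_is_open)"}, 0.6),
        ({"goal(T1, chase(C))", "at(C, door)"}, 0.3),
    ],
}

# Prior probability of assuming each literal outright (illustrative).
PRIORS = {
    "goal(T1, exit_room)": 0.4,
    "believes(T1, door_is_open)": 0.7,
    "goal(T1, chase(C))": 0.2,
    "at(C, door)": 0.5,
}

def best_explanation(observation):
    """Return the most probable assumption set entailing the observation,
    scoring each candidate by rule weight times the product of the
    priors of its assumed literals."""
    best, best_p = None, 0.0
    for premises, rule_p in RULES.get(observation, []):
        p = rule_p
        for literal in premises:
            p *= PRIORS[literal]
        if p > best_p:
            best, best_p = premises, p
    return best, best_p
```

For the observation `moves_toward(T1, door)`, the first candidate scores 0.6 × 0.4 × 0.7 = 0.168 and the second 0.3 × 0.2 × 0.5 = 0.03, so the goal-plus-belief explanation wins. The full system performs this search over hundreds of interacting axioms rather than a flat lookup table.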
Our approach is demonstrated using videos created with a new application, the Heider-Simmel Interactive Theater. Users of this application are able to create their own short movies in the style of Heider and Simmel's original film, where the behavior of triangles and circles is recorded by dragging them around the screen using a multi-touch interface. Using a combination of sketch understanding and gesture recognition techniques, we have implemented a mid-level visual perception system that interprets the positions and trajectories of these objects as behavioral observations. Probabilistic logical abduction is then used to explain these observations by ascribing mental states to the objects, using a large-scale logical formalization of commonsense psychology. The output of this reasoning, the most probable set of assumptions that entail the observations, is then automatically rewritten as an English-language narrative of the events in the user's movie using data-driven natural language generation techniques. We evaluated our approach by comparing these natural language outputs to the users' own textual narrations of the events in their films, gathered from the public at large through a project website.
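As a toy illustration of mid-level perception over recorded trajectories, the sketch below classifies the relative motion of two objects from their (x, y) position sequences. The behavior labels and the distance threshold are assumptions made for this example and do not reflect the project's actual behavior vocabulary or recognition method.

```python
import math

def interpret(traj_a, traj_b, eps=1.0):
    """Classify the relative motion of two objects from their recorded
    (x, y) trajectories as 'approach', 'flee', or 'hover', based on how
    the distance between them changes from first to last frame.
    Labels and threshold are illustrative only."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    change = dist(traj_a[-1], traj_b[-1]) - dist(traj_a[0], traj_b[0])
    if change < -eps:
        return "approach"
    if change > eps:
        return "flee"
    return "hover"
```

A symbolic observation like `approach(T1, C)` produced by this kind of classifier is what the abductive reasoner then explains by ascribing mental states such as goals or fears.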
This research was conducted by Andrew S. Gordon (PI), Jerry Hobbs, and Fabrizio Morbini. Previous contributors include Melissa Roemmele, Louis-Philippe Morency, Katya Ovchinnikova, Soja-Marie Morgens, Emily Ahn, Nicole Maslan, and Haley Archer-McClellan.
This research was made possible by funding from the Office of Naval Research (ONR) under the Cognitive Science program (Dr. Paul Bello and Dr. Micah Clark, program managers), grant number N00014-13-1-0286, "Heider-Simmel Interactive Theater," 1/1/2013 to 12/31/2015.
Andrew S. Gordon
Institute for Creative Technologies
University of Southern California
12015 Waterfront Drive
Los Angeles, CA 90094-2536
Phone: (310) 574-5700
Fax: (310) 574-5725