¹Stanford University, ²Tsinghua University
Fig 1. We collect human motion data and the corresponding ego-centric videos with eye gaze information in various indoor environments.
Predicting human motion is critical for assistive robots and AR/VR applications, where interaction with humans needs to be safe and comfortable. Accurate prediction, however, depends on understanding both the scene context and human intentions. Although many works study scene-aware human motion prediction, the latter is largely underexplored due to the lack of ego-centric views that disclose human intent and the limited diversity in motion and scenes. To reduce this gap, we propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze, which serves as a surrogate for inferring human intent. Because we employ inertial sensors for motion capture, our data collection is not tied to specific scenes, which further increases the diversity of motion dynamics observed from our subjects. We perform an extensive study of the benefits of leveraging eye gaze for ego-centric human motion prediction with various state-of-the-art architectures. Moreover, to realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches. Our network achieves top performance in human motion prediction on the proposed dataset, thanks to the intent information from the gaze and the denoised gaze feature modulated by the motion. The proposed dataset and our network implementation will be publicly available.
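To make the bidirectional communication between the gaze and motion branches more concrete, below is a minimal sketch of two-way cross-attention fusion, where motion queries attend to gaze tokens (intent cue) and gaze queries attend to motion tokens (motion-modulated gaze denoising). The module and variable names are hypothetical illustrations under assumed feature shapes, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class BidirectionalGazeMotionFusion(nn.Module):
    """Illustrative two-way message passing between a gaze branch and a
    motion branch via cross-attention (a sketch, not the GIMO codebase)."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Motion queries attend to gaze tokens: gaze-informed motion features.
        self.motion_from_gaze = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Gaze queries attend to motion tokens: motion-denoised gaze features.
        self.gaze_from_motion = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_motion = nn.LayerNorm(dim)
        self.norm_gaze = nn.LayerNorm(dim)

    def forward(self, motion_feat: torch.Tensor, gaze_feat: torch.Tensor):
        # motion_feat: (B, T_m, dim) features of the observed pose sequence
        # gaze_feat:   (B, T_g, dim) features of the ego-centric eye gaze
        m2g, _ = self.motion_from_gaze(motion_feat, gaze_feat, gaze_feat)
        g2m, _ = self.gaze_from_motion(gaze_feat, motion_feat, motion_feat)
        motion_out = self.norm_motion(motion_feat + m2g)  # residual update
        gaze_out = self.norm_gaze(gaze_feat + g2m)        # residual update
        return motion_out, gaze_out


if __name__ == "__main__":
    fusion = BidirectionalGazeMotionFusion()
    motion = torch.randn(2, 30, 256)  # e.g., 30 observed pose frames
    gaze = torch.randn(2, 30, 256)    # e.g., 30 gaze samples
    m, g = fusion(motion, gaze)
    print(m.shape, g.shape)           # torch.Size([2, 30, 256]) each
```

The fused motion features would then feed the motion-prediction head, while the cleaned gaze features can be supervised or reused in later fusion stages; the exact wiring in the paper may differ from this sketch.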
Fig 2. Human motions driven by different intents look similar at the beginning. However, the scanning patterns of the eye gaze (red dots) during the starting phase are quite distinctive, suggesting that eye gaze can be leveraged to reduce uncertainty when predicting future body movements.
Fig 3. Pipeline of our method for gaze-informed human motion prediction.
Fig 4. Recruited subjects collecting data in various scenes.
Fig 5. A demo of our dataset.
Fig 6. Results of human motion prediction.
Yang Zheng, Yanchao Yang, Kaichun Mo, Jiaman Li, Tao Yu, Yebin Liu, Karen Liu, Leonidas J. Guibas. "GIMO: Gaze-Informed Human Motion Prediction in Context". ECCV 2022
@inproceedings{zheng2022gimo,
  title={GIMO: Gaze-Informed Human Motion Prediction in Context},
  author={Zheng, Yang and Yang, Yanchao and Mo, Kaichun and Li, Jiaman and Yu, Tao and Liu, Yebin and Liu, Karen and Guibas, Leonidas},
  booktitle={ECCV},
  year={2022},
}