Embodied Visuomotor Representation

Levi Burner, Cornelia Fermüller, Yiannis Aloimonos

Published: 2024/9/30

Abstract

Imagine sitting at your desk, looking at objects on it. You do not know their exact distances from your eye in meters, but you can immediately reach out and touch them. Instead of an externally defined unit, your sense of distance is tied to your action's embodiment. In contrast, conventional robotics relies on precise calibration to external units, with which vision and control processes communicate. We introduce Embodied Visuomotor Representation, a methodology for inferring distance in a unit implied by action. With it, a robot without knowledge of its size, environmental scale, or strength can learn to touch and clear obstacles within seconds of operation. Likewise, in simulation, an agent without knowledge of its mass or strength can successfully jump across a gap of unknown size after a few test oscillations. These behaviors mirror natural strategies observed in bees and gerbils, which also lack calibration in an external unit.
