DayDreamer: A world model for learning robot behaviors in the physical world

Teaching robots to solve complex real-world tasks is a fundamental problem in robotics. A common approach is deep reinforcement learning, but it often requires too much trial and error to be practical in the real world.

iCub Robot. Image credit: European Union 2012 EP / Pietro Naj-Oleari via Flickr, CC BY-NC-ND 2.0


World models are a data-efficient alternative. Learning a model of the world from past experience allows the robot to imagine the outcomes of potential actions, reducing the amount of trial and error needed. A recent paper uses the Dreamer world model to train a variety of robots directly in the real world.

The researchers demonstrate successful learning directly on physical robots across challenges such as different action spaces, sensory modalities, and reward structures. A quadruped robot is taught from scratch to roll off its back, stand up, and walk within 1 hour. Robotic arms learn to pick and place objects from sparse rewards, outperforming model-free agents. The software infrastructure is publicly available, providing a flexible platform for future research on world models for robotics.

To tackle tasks in complex environments, robots need to learn from experience. Deep reinforcement learning is a popular approach to robot learning but requires a large amount of trial and error, limiting its deployment in the physical world. As a result, many advances in robot learning rely on simulators. However, learning in a simulator fails to capture the complexity of the real world, is prone to simulation bias, and the resulting behaviors do not adapt to changes in the world. The Dreamer algorithm has recently shown great promise for learning from small amounts of interaction by planning within a learned world model, outperforming pure reinforcement learning in video games. Learning a world model that predicts the outcomes of potential actions enables planning in imagination, reducing the amount of trial and error needed in the real environment. However, it was previously unknown whether Dreamer could enable faster learning on physical robots. In this paper, we apply Dreamer to 4 robots to learn online and directly in the real world, without simulators. Dreamer trains a quadruped robot to roll off its back, stand up, and walk in just 1 hour, without requiring resets. We then push the robot and find that Dreamer adapts within 10 minutes to withstand perturbations or to quickly roll over and stand back up. On two different robotic arms, Dreamer learns to pick and place multiple objects directly from camera images and sparse rewards, approaching human performance. On a wheeled robot, Dreamer learns to navigate to a goal position purely from camera images, automatically resolving ambiguity about the robot's orientation. Using the same hyperparameters across all experiments, we find that Dreamer is capable of online learning in the real world, establishing a strong baseline. We release our infrastructure for future applications of world models to robot learning.
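The core idea of planning in a learned world model can be illustrated with a toy sketch. This is not the actual Dreamer implementation (which learns a latent neural dynamics model and trains an actor-critic in imagination); it is a minimal, hypothetical example where a stand-in "learned model" predicts the next state and reward, and a random-shooting planner evaluates candidate action sequences entirely in imagination before acting:

```python
import random

def learned_model(state, action):
    """Stand-in for a learned dynamics model: predicts next state and reward.
    Here the 'model' is a fixed toy rule; in Dreamer it is a neural network
    trained on replayed real-world experience."""
    next_state = state + action          # toy transition dynamics
    reward = -abs(next_state - 10.0)     # reward peaks when state == 10
    return next_state, reward

def imagine_return(state, actions):
    """Roll out an action sequence inside the model and sum the rewards.
    No real-world interaction happens here -- this is pure imagination."""
    total = 0.0
    for a in actions:
        state, r = learned_model(state, a)
        total += r
    return total

def plan(state, horizon=5, candidates=200, seed=0):
    """Random-shooting planner: sample candidate action sequences, score
    each one in imagination, and return the first action of the best."""
    rng = random.Random(seed)
    best_seq, best_ret = None, float("-inf")
    for _ in range(candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        ret = imagine_return(state, seq)
        if ret > best_ret:
            best_seq, best_ret = seq, ret
    return best_seq[0]

# Only this single planned action would be executed on the real robot;
# all the trial and error above happened inside the model.
first_action = plan(state=0.0)
```

The payoff is data efficiency: the robot evaluates hundreds of imagined trajectories for the cost of zero real-world steps, which is why world-model agents need far less physical trial and error than model-free reinforcement learning.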

Research article: Wu, P., Escontrela, A., Hafner, D., Goldberg, K., and Abbeel, P., "DayDreamer: World Models for Physical Robot Learning", 2022. Link:



