
Abstract
We study the problem of robotic stacking with objects of complex geometry. We propose a challenging and diverse set of such objects that was carefully designed to require strategies beyond a simple "pick-and-place" solution. Our method is a reinforcement learning (RL) approach combined with vision-based interactive policy distillation and simulation-to-reality transfer. Our learned policies can efficiently handle multiple object combinations in the real world and exhibit a large variety of stacking skills. In a large experimental study, we investigate what choices matter for learning such general vision-based agents in simulation, and what affects optimal transfer to the real robot. We then leverage data collected by such policies and improve upon them with offline RL. A video and a blog post of our work are provided as supplementary material.
Benchmarks
| Benchmark | Methodology | Average | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|---|
| skill-generalization-on-rgb-stacking | BC-IMP | 49 | 23 | 39.3 | 39.3 | 77.5 | 66 |
| skill-mastery-on-rgb-stacking | BC-IMP | 74.6 | 75.6 | 60.8 | 70.8 | 87.8 | 78.3 |
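The "Average" column is consistent, up to rounding, with the unweighted mean of the five per-group success rates. A minimal sketch checking this (the `scores` dictionary simply copies the table values; it is not part of the benchmark's API):

```python
# Per-group success rates (%) copied from the benchmark table above.
scores = {
    "skill-generalization": [23.0, 39.3, 39.3, 77.5, 66.0],
    "skill-mastery": [75.6, 60.8, 70.8, 87.8, 78.3],
}

def average(group_scores):
    """Unweighted mean success rate across the five object groups."""
    return sum(group_scores) / len(group_scores)

for name, groups in scores.items():
    print(f"{name}: {average(groups):.1f}")
```

The computed means (49.0 and 74.7) match the reported 49 and 74.6 to within rounding of the per-group figures.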