Learning to Recover: Dynamic Reward Shaping with Wheel-Leg Coordination for Fallen Robots

Boyuan Deng*, Luca Rossini, Jin Wang, Weijie Wang, Nikolaos Tsagarakis
Istituto Italiano di Tecnologia

Learning to recover with wheel-leg coordination

Abstract

This paper presents a learning-based framework that integrates Episode-based Dynamic Reward Shaping with curriculum learning, dynamically balancing exploration of diverse recovery maneuvers against precise posture refinement. We further demonstrate that synergistic wheel-leg coordination reduces joint torque consumption by 15.85%–26.2% and improves stabilization through energy transfer mechanisms. Extensive evaluations on two distinct platforms achieve recovery success rates of up to 99.1% and 97.8% without platform-specific tuning.
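The episode-based schedule described above can be sketched as a reward-weight scheduler that shifts emphasis from exploration toward posture refinement as training progresses. This is a minimal illustrative sketch, not the paper's actual implementation: the function names, the linear schedule, and the two-term reward split are all assumptions.

```python
def reward_weights(episode, total_episodes):
    """Linearly shift reward emphasis from exploration to posture refinement.

    Assumed linear schedule for illustration; the paper's actual shaping
    dynamics may differ.
    """
    progress = min(max(episode / total_episodes, 0.0), 1.0)
    w_explore = 1.0 - progress   # dominates early episodes
    w_refine = progress          # dominates late episodes
    return w_explore, w_refine


def shaped_reward(r_explore, r_posture, episode, total_episodes):
    """Combine an exploration bonus and a posture-refinement term
    using the episode-dependent weights."""
    w_e, w_r = reward_weights(episode, total_episodes)
    return w_e * r_explore + w_r * r_posture


# Early episodes reward diverse recovery maneuvers; late episodes
# reward precise final posture.
early = shaped_reward(1.0, 1.0, episode=0, total_episodes=1000)     # -> 1.0 (all exploration)
late = shaped_reward(1.0, 1.0, episode=1000, total_episodes=1000)   # -> 1.0 (all refinement)
```

A smooth (e.g. sigmoidal) schedule or a success-rate-triggered switch would slot into `reward_weights` without changing the rest of the sketch.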

Framework and Fall Simulation

The strategy learns effective recovery actions and demonstrates strong robustness.

Massive testing
Wheel-Leg coordination

Skill I
Skill II
Skill III

DS-policy vs Baseline

We show that our DS-policy can recover from the fallen postures in which the baseline fails.

DS-policy
Baseline

Wheel vs Leg-driven Recovery

We show that our policy performs consistently on the same robot across different drive modes.

Wheel
Leg-driven

Cross-Platform Validation

We show that our policy transfers to the Unitree Go2-W robot.

Wheel
Leg-driven
Wheel-assisted
Wheel-assisted

Diverse Terrains Deployment

We show that our policy performs in non-flat environments.

Box Grid
Pyramid Slope
Random Rough
Pyramid Stairs
Inverted Pyramid Stairs

Bibtex



Acknowledgements:
We would like to thank Andrea Patrizi, Despoina Maligianni, Rui Dai, Yifang Zhang, Maolin Lei, Jingcheng Jiang, Kuanqi Cai, and Carlo Rizzardo for their discussions.

Template for this Website