Mean Velocity Policy (MVP)

Date

12 hours ago

Organization

The University of Hong Kong
Tsinghua University
University of California, Berkeley

The Mean Velocity Policy (MVP) was jointly proposed by research teams from Tsinghua University (School of Vehicle and Mobility and School of Artificial Intelligence), Berkeley Artificial Intelligence Research (BAIR) at the University of California, Berkeley, and the University of Hong Kong. The work was published as a conference paper at the International Conference on Learning Representations (ICLR 2026) under the title "Mean Flow Policy with Instantaneous Velocity Constraint for One-step Action Generation".

MVP is a generative policy for reinforcement learning that produces an action in a single step by modeling an "average velocity field," which removes the computational overhead of multi-step sampling. Because the average velocity model lacks explicit boundary conditions, the research team introduced an "instantaneous velocity constraint (IVC)," which improves learning accuracy and policy expressiveness. In the reported experiments, MVP speeds up both training and inference (the average single-step inference time is 10.93 milliseconds) and achieves a state-of-the-art average success rate of 0.88 on complex robot manipulation tasks in Robomimic and OGBench.
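To make the idea of an average velocity field concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual mean-flow definition, in which the average velocity over an interval [r, t] is the displacement (z_t - z_r) divided by (t - r), so a sample can be carried from t = 1 to t = 0 in one step. The toy velocity field dz/dt = -z and all function names are hypothetical choices made only because this field has a closed-form average. The last function checks the boundary condition that an instantaneous velocity constraint would enforce: as r approaches t, the average velocity should equal the instantaneous velocity.

```python
import math


def instantaneous_velocity(z, t):
    # Toy velocity field (an assumption for illustration only): dz/dt = -z.
    return -z


def average_velocity(z_t, r, t):
    # Closed-form average velocity of the toy field over [r, t] along the
    # trajectory passing through z_t at time t: (z_t - z_r) / (t - r).
    d = t - r
    if abs(d) < 1e-12:
        # Boundary case r == t: the average equals the instantaneous velocity.
        return instantaneous_velocity(z_t, t)
    return z_t * (1.0 - math.exp(d)) / d


def one_step_sample(z1):
    # One-step generation: jump from t = 1 to t = 0 with the average velocity.
    return z1 - (1.0 - 0.0) * average_velocity(z1, 0.0, 1.0)


def euler_sample(z1, n_steps):
    # Multi-step baseline: integrate the instantaneous velocity backward
    # from t = 1 to t = 0 with explicit Euler steps.
    z, t = z1, 1.0
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        z = z - dt * instantaneous_velocity(z, t)
        t -= dt
    return z


def boundary_gap(z, t, eps=1e-6):
    # Gap between average and instantaneous velocity on a tiny interval.
    return abs(average_velocity(z, t - eps, t) - instantaneous_velocity(z, t))


if __name__ == "__main__":
    z1 = 2.0
    print("exact endpoint    :", z1 * math.e)
    print("one-step (average):", one_step_sample(z1))
    print("Euler, 1 step     :", euler_sample(z1, 1))
    print("Euler, 1000 steps :", euler_sample(z1, 1000))
    print("boundary gap      :", boundary_gap(z1, 0.5))
```

In this toy setting the single average-velocity step lands on the exact endpoint, while Euler integration of the instantaneous velocity only approaches it as the number of steps grows. In a learned policy the average velocity would be a neural network conditioned on the observation, and the closed form used here would not be available.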
