EasyInsert is a data-efficient and generalizable robotic insertion framework. It uses just one hour of human teleoperation data to bootstrap a large-scale automated data collection process across five categories of insertion pairs, and it demonstrates strong zero-shot generalization to diverse, unseen objects. It remains robust even in densely cluttered environments and under significant initial pose offsets.
Overview of our method: (1) Left: The data collection module scales data through 80% automated spatial exploration and 20% manual fine-grained interaction. (2) Middle: A generalist policy predicts the relative pose directly from vision and supports fast adaptation to novel test objects when high precision is required. (3) Right: Coarse-to-fine execution, motivated by human insertion behavior.
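One way to read the coarse-to-fine execution idea is as a closed loop: the policy predicts the relative offset between the plug and the socket, the robot takes a larger step while far away, and it switches to smaller corrective steps once close. The sketch below is only an illustration under stated assumptions, not the EasyInsert implementation. It handles translation only and ignores rotation. The names `step_toward` and `run_insertion`, the predictor interface, and all thresholds and gains are hypothetical.

```python
import math


def step_toward(relative_offset, coarse_threshold=0.02,
                coarse_gain=0.5, fine_gain=0.3):
    """Return a translation command for one control step.

    relative_offset: predicted (dx, dy, dz) from plug to socket in meters.
    All thresholds and gains are illustrative assumptions.
    """
    distance = math.sqrt(sum(c * c for c in relative_offset))
    # Far from the socket: take a larger (coarse) step.
    # Near the socket: take a smaller (fine) corrective step.
    gain = coarse_gain if distance > coarse_threshold else fine_gain
    return tuple(gain * c for c in relative_offset)


def run_insertion(predict_offset, plug_position, tolerance=0.001,
                  max_steps=50):
    """Re-predict the offset and move until it falls within tolerance.

    predict_offset: hypothetical stand-in for a vision policy; it maps the
    current plug position to a predicted plug-to-socket offset.
    Returns the final position and the number of steps taken.
    """
    position = tuple(plug_position)
    for step in range(max_steps):
        offset = predict_offset(position)
        if math.sqrt(sum(c * c for c in offset)) <= tolerance:
            return position, step
        command = step_toward(offset)
        position = tuple(p + c for p, c in zip(position, command))
    return position, max_steps


if __name__ == "__main__":
    # Toy predictor that knows the socket position exactly (assumption).
    socket = (0.10, -0.05, 0.20)

    def perfect_predictor(position):
        return tuple(s - p for s, p in zip(socket, position))

    final_position, steps = run_insertion(perfect_predictor, (0.0, 0.0, 0.0))
    print(final_position, steps)
```

Re-predicting the offset at every step, rather than executing a single open-loop motion, is what would let such a loop absorb prediction error and external perturbations. A real system would also need to handle orientation and contact.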
We trained EasyInsert on five categories of objects and evaluated it on 15 unseen insertion tasks. The robotic system consists of a 7-DoF Franka Emika Panda arm with dual wrist-mounted Intel RealSense D405 RGB cameras for visual perception. Experimental results demonstrate that EasyInsert generalizes well across diverse objects, spatial configurations, and environmental conditions, while remaining robust to perturbations.
Video panels: AutoMate-01129, AutoMate-00417, AutoMate-00320, AutoMate-01041, AutoMate-00681, Key, HDMI, Type-C, Ethernet, Doughnut, Trapezoid, Rectangle, Rectangle-Thin, Stick, Round-1, Round-2
Video panels: Type-C, Rectangle, Ethernet, AutoMate-00681
Video panels: Type-C, Rectangle, HDMI, AutoMate-00681
Here, we present an unedited video in which EasyInsert successfully performs consecutive object insertions in a zero-shot setting under human perturbation. This demonstrates EasyInsert's strong generalization capability and robustness against external disturbances.
@article{li2025easyinsert,
title={EasyInsert: A Data-Efficient and Generalizable Insertion Policy},
author={Li, Guanghe and Zhao, Junming and Wang, Shengjie and Gao, Yang},
journal={arXiv preprint arXiv:2505.16187},
year={2025}
}