arxiv:2503.10410

RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation

Published on Mar 13
· Submitted by yuwendu on Mar 19

Abstract

Roadside Collaborative Perception refers to a system where multiple roadside units collaborate to pool their perceptual data, assisting vehicles in enhancing their environmental awareness. Existing roadside perception methods concentrate on model design but overlook data issues like calibration errors, sparse information, and multi-view consistency, leading to poor performance on recently published datasets. To significantly enhance roadside collaborative perception and address these critical data issues, we present RoCo-Sim, the first simulation framework for roadside collaborative perception. RoCo-Sim is capable of generating diverse, multi-view consistent simulated roadside data through dynamic foreground editing and full-scene style transfer of a single image. RoCo-Sim consists of four components: (1) Camera Extrinsic Optimization ensures accurate 3D-to-2D projection for roadside cameras; (2) a novel Multi-View Occlusion-Aware Sampler (MOAS) determines the placement of diverse digital assets within 3D space; (3) DepthSAM models foreground-background relationships from single-frame fixed-view images, ensuring multi-view consistency of the foreground; and (4) a Scalable Post-Processing Toolkit generates more realistic and enriched scenes through style transfer and other enhancements. RoCo-Sim significantly improves roadside 3D object detection, outperforming SOTA methods by 83.74 on Rcooper-Intersection and 83.12 on TUMTraf-V2X for AP70. RoCo-Sim fills a critical gap in roadside perception simulation. Code and pre-trained models will be released soon: https://github.com/duyuwen-duen/RoCo-Sim
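
To make the role of the camera extrinsics in component (1) concrete, here is a minimal sketch of the standard pinhole 3D-to-2D projection and of the reprojection error that an extrinsic optimization would typically try to reduce. This is not the RoCo-Sim implementation: the abstract does not describe how the optimization is done, and the function names, parameters, and numbers below are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def project_point(point_world: Vec3, rotation: Mat3, translation: Vec3,
                  fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project one 3D world point to pixel coordinates (pinhole model, no distortion)."""
    # World -> camera frame: X_cam = R @ X_world + t (the extrinsic transform).
    x_cam, y_cam, z_cam = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z_cam <= 0:
        raise ValueError("point is behind the camera")
    # Camera frame -> pixels: perspective divide, then apply the intrinsics.
    return fx * x_cam / z_cam + cx, fy * y_cam / z_cam + cy


def reprojection_error(points_world: Sequence[Vec3],
                       pixels_observed: Sequence[Tuple[float, float]],
                       rotation: Mat3, translation: Vec3,
                       fx: float, fy: float, cx: float, cy: float) -> float:
    """Mean pixel distance between projected 3D points and their 2D annotations."""
    total = 0.0
    for point, (u_obs, v_obs) in zip(points_world, pixels_observed):
        u, v = project_point(point, rotation, translation, fx, fy, cx, cy)
        total += ((u - u_obs) ** 2 + (v - v_obs) ** 2) ** 0.5
    return total / len(points_world)


# Hypothetical example values: identity rotation, made-up intrinsics.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = dict(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)

if __name__ == "__main__":
    point = (1.0, 2.0, 10.0)
    pixel = project_point(point, IDENTITY, (0.0, 0.0, 0.0), **INTRINSICS)
    # A translation that is off by 10 cm along x shifts the projection in the image.
    error = reprojection_error([point], [pixel], IDENTITY, (0.1, 0.0, 0.0), **INTRINSICS)
    print(pixel, error)
```

In this toy setup a small error in the translation moves the projected point by several pixels, which is why inaccurate extrinsics would misplace simulated 3D assets in the roadside image.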

Community

Paper author Paper submitter

This work unleashes the power of simulation data to enhance roadside collaborative perception, demonstrating that simulation data, rather than model architecture, is the true winner in the performance competition. The proposed RoCo-Sim is the first simulation framework specifically designed for roadside collaborative perception. RoCo-Sim generates large-scale, multi-view consistent simulation data from sparse fixed viewpoints, supports both foreground editing and full-scene style transfer, and can be rapidly deployed to any new road environment. We hope this work will significantly accelerate the practical development of roadside collaborative perception, ultimately enhancing driving safety. Code and pre-trained models will be released soon.

Paper: https://arxiv.org/pdf/2503.10410
Github: https://github.com/duyuwen-duen/RoCo-Sim
