Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images

1 The Hong Kong University of Science and Technology,
2 Institute for AI Industry Research, Tsinghua University,
3 The Hong Kong University of Science and Technology (Guangzhou)
Submitted to IEEE Robotics and Automation Letters (RA-L)

*Indicates Corresponding Author

Abstract

Compositional neural scene graph studies have shown that radiance fields can be an efficient tool for building an editable autonomous driving simulator. However, previous studies learned from image sequences within autonomous driving datasets, which leads to unsatisfactory blurring when the car is rotated in the simulator. In this work, we propose a pipeline that learns from unconstrained, in-the-wild images and builds a dataset from the processed images. The simulator requires that a vehicle stay sharp as the viewpoint changes and that its contour remain cleanly separated from the background to avoid artifacts during editing; to meet these requirements, we design a radiance field for the vehicle, a crucial part of the urban scene foreground. Through experiments, we demonstrate that our model achieves competitive performance compared to baselines. Using the dataset built from in-the-wild images, our method additionally enables controllable appearance editing. We will release the dataset and code here to facilitate further research in the field.

Highlights

  • A curated dataset, CarPatch3D, containing hundreds of thousands of 2D car images paired with 3D spatial information. The dataset provides rich supervision for training a category-level NeRF model for cars, and its availability enables the development of more effective urban NeRF foreground models.
  • We develop Car-NeRF, a radiance field that conforms to the characteristics of autonomous driving environments and achieves state-of-the-art performance on image reconstruction and novel view synthesis.
  • We design Car-Studio, a pipeline that learns from single-view in-the-wild car images to generate 3D surrounding views and enables plausible, controllable spatial and appearance editing.

A glance at CarPatch3D

JSON description file segment for the CarPatch3D dataset


      {
        // path to the car patch file
        "patch": "patch/km_0400000002_patch.png",
        // path to the mask file
        "mask": "mask/km_0400000002_mask.png",
        // path to the full scene image file
        "image_file": "../kitti-mot/training/image_02/0004/000000.png",
        // camera intrinsics
        "fl_x": 721.5377, "fl_y": 721.5377, "cx": 609.5593, "cy": 172.854,
        // camera extrinsics
        "cam_tx": 44.85728, "cam_ty": 0.2163791, "cam_tz": 0.002745884,
        // 2D region of interest (pixels)
        "xmin": 805.735819, "xmax": 960.597684, "ymin": 161.72116, "ymax": 251.71257,
        // 3D dimensions of the car (meters)
        "height": 1.649293, "width": 1.669751, "length": 3.639134,
        // car pose (position in meters, yaw in radians)
        "obj_x": 5.751944, "obj_y": 1.457555, "obj_z": 15.096122, "yaw": -0.788125,
        // width and height of the scene image (pixels)
        "w": 1242, "h": 375
      }
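
As a concrete illustration, the sketch below loads one such entry and reassembles the camera intrinsics and the object box. The file name carpatch3d_sample.json is a placeholder, and we assume the stored file is plain JSON (the // annotations above are explanatory only).

      import json
      from PIL import Image

      # Load one CarPatch3D entry; the file name is a placeholder.
      with open("carpatch3d_sample.json") as f:
          entry = json.load(f)

      patch = Image.open(entry["patch"])  # cropped car patch
      mask = Image.open(entry["mask"])    # foreground mask for the patch

      # Pinhole intrinsics of the scene camera as a 3x3 matrix.
      K = [
          [entry["fl_x"], 0.0,           entry["cx"]],
          [0.0,           entry["fl_y"], entry["cy"]],
          [0.0,           0.0,           1.0],
      ]

      # 2D region of interest of the car inside the scene image (pixels).
      roi = (entry["xmin"], entry["ymin"], entry["xmax"], entry["ymax"])

      # 3D box: dimensions in meters, plus position and yaw of the car.
      dims = (entry["length"], entry["width"], entry["height"])
      pose = (entry["obj_x"], entry["obj_y"], entry["obj_z"], entry["yaw"])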
    

Car-NeRF architecture

[Figure: Car-NeRF architecture overview]

Car-Studio manipulation showcases

[Figure: Inputs and outputs of Car-Studio]

Our approach uses zero-shot learning to reconstruct the foreground radiance field of a scene containing multiple cars from a single image (top right). Test-time optimization further improves image quality (bottom left). The final rendered output (bottom right) combines the two (top right + bottom left).
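
To make the test-time optimization step concrete, here is a minimal PyTorch sketch of the general idea: the network weights stay frozen and only a per-instance latent code is refined against the single observed view. The render function and the latent size below are placeholders, not the actual Car-Studio interfaces.

      import torch

      # Placeholder differentiable renderer standing in for a frozen Car-NeRF;
      # the real model maps a latent code and camera rays to an image.
      def render(latent):
          return latent.tanh().reshape(1, 16, 16).expand(3, 16, 16)

      target = torch.rand(3, 16, 16)  # the single observed car patch (RGB)

      # Optimize only the per-instance latent code (hypothetical size 256).
      latent = torch.zeros(256, requires_grad=True)
      optimizer = torch.optim.Adam([latent], lr=1e-2)
      for _ in range(100):
          optimizer.zero_grad()
          loss = torch.nn.functional.mse_loss(render(latent), target)
          loss.backward()
          optimizer.step()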

[Figure: Rotating foreground cars, compared with neural scene graphs]

Foreground car editing in an autonomous driving simulator using zero-shot learning. Our method learns priors, yielding performance superior to neural scene graphs [Julian Ost et al., CVPR 2021], as shown in the comparison at the top.

[Figure: Transformed insertion of a car instance]

Manipulation of a car instance, demonstrating the editing capability of our approach for an autonomous driving simulator.
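
As a rough sketch of what such a spatial edit amounts to, the snippet below (our illustration, not the released code) builds a rigid object-to-camera transform from the pose fields of a CarPatch3D entry and perturbs it before re-rendering, assuming the KITTI convention of yaw about the vertical axis.

      import numpy as np

      def object_to_camera(x, y, z, yaw):
          """4x4 rigid transform placing the car in camera coordinates,
          assuming yaw rotates about the vertical axis as in KITTI."""
          c, s = np.cos(yaw), np.sin(yaw)
          T = np.eye(4)
          T[:3, :3] = [[c, 0.0, s],
                       [0.0, 1.0, 0.0],
                       [-s, 0.0, c]]
          T[:3, 3] = [x, y, z]
          return T

      # Edit the pose from the JSON entry above: turn the car by 30 degrees
      # and move it 2 m further from the camera before re-rendering.
      T_edit = object_to_camera(5.751944, 1.457555, 15.096122 + 2.0,
                                -0.788125 + np.deg2rad(30.0))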

[Figure: Controllable color editing of car instances]

Controllable color editing of car instances using a mask applied to the original image.
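
The sketch below gives a simple 2D analogue of this operation, restricting a hue shift to the pixels selected by an instance mask. The file names are placeholders, and the actual appearance edit in Car-Studio operates on the radiance field rather than on image pixels.

      import numpy as np
      from PIL import Image

      # Shift the hue of the masked car pixels only. `scene.png` and
      # `car_mask.png` are placeholder names (mask same size as image).
      hsv = np.array(Image.open("scene.png").convert("HSV"))
      mask = np.array(Image.open("car_mask.png").convert("L")) > 127

      hsv[..., 0] = np.where(mask, (hsv[..., 0].astype(int) + 64) % 256,
                             hsv[..., 0])
      Image.fromarray(hsv, mode="HSV").convert("RGB").save("color_edited.png")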

BibTeX

@misc{liu2023carstudio,
      title={Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images}, 
      author={Tianyu Liu and Hao Zhao and Yang Yu and Guyue Zhou and Ming Liu},
      year={2023},
      eprint={2307.14009},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}