NAVE
Networked Augmented Virtual Environment (NAVE) Group
Publication: Yuan Xiong, Jingru Wang, Zhong Zhou. VirtualLoc: Large-Scale Visual Localization using Virtual Images[J]. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM). 2023, 20(3): 1-19. (CCF-B Journal) pdf
 
      Robust and accurate camera pose estimation is fundamental in computer vision. Learning-based regression approaches recover 6 degree-of-freedom (DoF) camera parameters directly from the visual cues of an input image. However, most are trained on street-view and landmark datasets and generalize poorly to overhead viewpoints, such as the calibration of surveillance cameras and unmanned aerial vehicles (UAVs). Moreover, reference images captured in the real world are scarce and expensive to collect, and their diversity is not guaranteed. In this paper, we address the problem of using rendered virtual images as an alternative for visual localization training. This work makes the following principal contributions: First, we present a new, challenging localization dataset containing 6 reconstructed large-scale 3D scenes, 10,594 calibrated photographs with condition changes, and 300k virtual images with pixel-wise labels for depth, relative surface normals, and semantic segmentation. Second, we present a flexible multi-feature fusion network trained on the virtual image datasets for robust image retrieval. Third, we propose an end-to-end confidence map prediction network for feature filtering and pose estimation. We demonstrate that large-scale rendered virtual images benefit visual localization: they overcome the diversity limitations of real images and provide labeled multi-feature data for deep learning. Experimental results show that our method achieves remarkable performance, surpassing state-of-the-art approaches. To foster research on visual localization with synthetic images, we release our benchmark at https://github.com/YuanXiong/contributions.
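The abstract above describes a confidence map that filters features before pose estimation. As a rough illustration only (the function name, thresholds, and synthetic data below are hypothetical, not the authors' implementation), the following NumPy sketch keeps the 2D-3D correspondences whose predicted confidence exceeds a threshold, the kind of pre-filtering that would precede a pose solver such as PnP:

```python
import numpy as np

def filter_by_confidence(points2d, points3d, confidence, thresh=0.5):
    """Discard 2D-3D correspondences whose predicted confidence is below thresh.

    points2d   : (N, 2) image keypoints
    points3d   : (N, 3) matched scene points
    confidence : (N,)   per-correspondence confidence in [0, 1]
    """
    mask = confidence >= thresh
    return points2d[mask], points3d[mask]

# Synthetic example: 5 correspondences, 3 of which are confident enough.
pts2d = np.arange(10, dtype=float).reshape(5, 2)
pts3d = np.arange(15, dtype=float).reshape(5, 3)
conf = np.array([0.9, 0.2, 0.7, 0.4, 0.95])

kept2d, kept3d = filter_by_confidence(pts2d, pts3d, conf)
print(kept2d.shape, kept3d.shape)  # (3, 2) (3, 3)
```

The surviving correspondences would then be fed to a 6-DoF pose estimator; in the paper this filtering is learned end-to-end by the confidence prediction network.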
Created by admin at 2023-10-11 22:53:02