Publication Detail

Networked Augmented Virtual Environment (NAVE) Group


Publication: Jingyi Nie, Liangliang Cai, Qichuan Geng, Zhong Zhou. Understanding Matters: Semantic-Structural Guided Visual Relocalization for Large Scenes. Proceedings of the 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025.

Scene Coordinate Regression (SCR) estimates 3D scene coordinates from 2D images and has become an important approach in visual relocalization. Existing methods achieve high localization accuracy in small scenes but still face substantial challenges in large-scale scenes, which usually exhibit significant variations in depth, scale, and occlusion. Structure-guided scene partitioning is commonly adopted to handle such scenes. However, over-partitioned elements and large feature variances within subscenes impede the estimation of 3D coordinates and introduce misleading information for subsequent processing. To address these issues, we propose Semantic-Structural Determined Visual Relocalization, which leverages semantic-structural partition learning and partition-determined pose refinement. First, we partition the scene into small subscenes with label assignments, ensuring semantic consistency and structural continuity within each subscene. A classifier is then trained with sampling-based learning to predict these labels. Second, the partition predictions are encoded into embeddings and integrated with local features to promote intra-class compactness and inter-class separation, producing partition-aware features. To further decrease feature variances, we employ a discriminability metric and suppress ambiguous points, improving subsequent computations. Experimental results on the Cambridge Landmarks dataset demonstrate that the proposed method achieves significant improvements at lower training cost in large-scale scenes, reducing the median error by 38% compared to the state-of-the-art SCR method DSAC*.
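The abstract does not say how the discriminability metric is defined, so the following is only an illustrative sketch of the general idea of suppressing ambiguous points, not the authors' method. It assumes, hypothetically, that each point has a list of partition-classifier scores and that discriminability is the gap between the two highest softmax probabilities. The names `suppress_ambiguous` and `min_margin` and the threshold value are invented for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def suppress_ambiguous(point_logits, min_margin=0.2):
    """Return (labels, keep) for per-point partition scores.

    ASSUMPTION: discriminability is taken to be the top-1 minus top-2
    softmax probability. The paper's actual metric may differ.
    Each point needs scores for at least two partitions.
    """
    labels, keep = [], []
    for logits in point_logits:
        probs = softmax(logits)
        ranked = sorted(probs, reverse=True)
        margin = ranked[0] - ranked[1]
        # Predicted partition label: first index of the highest probability.
        labels.append(probs.index(ranked[0]))
        # Points whose two best partitions are nearly tied are dropped.
        keep.append(margin >= min_margin)
    return labels, keep


# Hypothetical scores for three points over three partitions.
scores = [[5.0, 0.0, 0.0], [1.0, 1.0, 0.9], [0.0, 3.0, 0.0]]
labels, keep = suppress_ambiguous(scores)
print(labels, keep)
```

In this sketch, only points flagged `True` in `keep` would be passed on to later stages such as pose estimation. The second point is discarded because its scores do not separate the first two partitions.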
Created by admin at 2025-06-26 15:35:46