Scene-unified image translation for visual localization

Abstract

Visual localization is a key technology in 3D robot vision. One of its major difficulties is the appearance change between query images and database images captured over large time spans. Many methods address this by extracting more robust image features. In this paper, we explore the impact of image translation on visual localization in complex scenes. We propose UniGAN, a modified image translation model that fuses semantic label constraints with finer reconstruction losses to unify images captured under different environmental conditions into a standard scene better suited to localization. To estimate the 6-DOF camera pose, we use a two-stage localization framework composed of image retrieval and local matching. Experiments show that our method outperforms the state of the art in both accuracy and robustness in environmentally sensitive scenes.
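
The two-stage framework mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of cosine similarity for retrieval, and mutual nearest-neighbour matching for the local stage are all assumptions chosen for illustration, and descriptors are plain Python lists. In a full pipeline the resulting 2D matches would typically be lifted to 2D-3D correspondences and passed to a pose solver, which is not shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_global, db_globals, k=1):
    """Stage 1 (hypothetical): return indices of the k database images whose
    global descriptors are most similar to the query's global descriptor."""
    ranked = sorted(
        range(len(db_globals)),
        key=lambda i: cosine(query_global, db_globals[i]),
        reverse=True,
    )
    return ranked[:k]


def sq_dist(a, b):
    """Squared Euclidean distance between two local descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mutual_nn_matches(query_locals, db_locals):
    """Stage 2 (hypothetical): keep (query_idx, db_idx) pairs that are each
    other's nearest neighbour in descriptor space."""
    if not query_locals or not db_locals:
        return []
    # Nearest database descriptor for every query descriptor.
    q_to_d = [
        min(range(len(db_locals)), key=lambda j: sq_dist(q, db_locals[j]))
        for q in query_locals
    ]
    # Nearest query descriptor for every database descriptor.
    d_to_q = [
        min(range(len(query_locals)), key=lambda i: sq_dist(d, query_locals[i]))
        for d in db_locals
    ]
    return [(i, j) for i, j in enumerate(q_to_d) if d_to_q[j] == i]
```

Under these assumptions, `retrieve` would be called first on descriptors of the translated query image to shortlist database images, and `mutual_nn_matches` would then be run against each shortlisted image to obtain the correspondences used for pose estimation.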

Reference

Sheng Han, Wei Gao*, Yiming Wan, Yihong Wu. Scene-unified image translation for visual localization. ICIP 2020, Abu Dhabi National Exhibition Center (ADNEC), Abu Dhabi, United Arab Emirates (UAE), October 25-28, 2020, pp. 2266-2270.