Related Work

We propose GOTPR, a robust place recognition method designed for outdoor environments where GPS signals are unavailable. Unlike existing approaches that use point cloud maps, which are large and difficult to store, GOTPR leverages scene graphs generated from text descriptions and maps for place recognition. This method improves scalability by replacing point clouds with compact data structures, allowing robots to efficiently store and utilize extensive map data. Additionally, GOTPR eliminates the need for custom map creation by using publicly available OpenStreetMap data, which provides global spatial information. We evaluated its performance using the KITTI360Pose dataset with corresponding OpenStreetMap data, comparing it to existing point cloud-based place recognition methods. The results show that GOTPR achieves comparable accuracy while significantly reducing storage requirements. In city-scale tests, it completed processing within a few seconds, making it highly practical for real-world robotics applications.
Process of GOTPR. The method consists of three sequential steps: 1) scene graph generation, 2) scene graph candidate extraction, and 3) scene graph retrieval and scene selection. The inputs are a query text description and OSM data; the output is the ID of the matching scene.
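The three steps above can be sketched as follows. This is a minimal, illustrative outline only: the function names, the toy graph encoding, and the label-overlap matching are assumptions made for exposition, not the actual GOTPR implementation (which retrieves graphs with a learned joint embedding model).

```python
# Illustrative sketch of the three-step GOTPR pipeline (not the real implementation).

def generate_scene_graph(text_description):
    """Step 1: turn a text description into a toy scene graph (nodes + edges)."""
    # Toy parser: every alphabetic word becomes a node label;
    # edges fully connect the nodes, standing in for spatial relations.
    nodes = sorted({w for w in text_description.lower().split() if w.isalpha()})
    edges = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
    return {"nodes": nodes, "edges": edges}

def extract_candidates(osm_scene_graphs, query_graph, k=3):
    """Step 2: keep the k OSM scene graphs sharing the most node labels."""
    def overlap(graph):
        return len(set(graph["nodes"]) & set(query_graph["nodes"]))
    ranked = sorted(osm_scene_graphs, key=lambda item: overlap(item[1]), reverse=True)
    return ranked[:k]

def select_scene(candidates, query_graph):
    """Step 3: pick the best-matching candidate and return its scene ID."""
    best_id, _ = max(
        candidates,
        key=lambda item: len(set(item[1]["nodes"]) & set(query_graph["nodes"])),
    )
    return best_id


query = generate_scene_graph("a church beside a road near trees")
osm = [
    ("scene_001", {"nodes": ["church", "road", "trees"], "edges": []}),
    ("scene_002", {"nodes": ["river", "bridge"], "edges": []}),
]
candidates = extract_candidates(osm, query, k=2)
print(select_scene(candidates, query))  # -> scene_001
```

The scene IDs and node labels here are invented for the example; in GOTPR the candidates come from OpenStreetMap data.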
Network for scene graph retrieval. The inputs are the query text scene graph and the extracted OSM scene graph candidates; the output is the list of top-k scene IDs. The joint embedding model consists of multiple GPS convolution layers with self- and cross-modules.
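Once the joint embedding model has mapped the query scene graph and the OSM candidates into a shared vector space, selecting the top-k scenes reduces to a nearest-neighbor lookup. The sketch below assumes fixed-length embedding vectors and cosine similarity; the vectors shown are stand-ins, not model output.

```python
# Illustrative top-k retrieval over a shared embedding space (stand-in vectors).
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_scenes(query_embedding, candidate_embeddings, k=3):
    """Return the IDs of the k candidates most similar to the query."""
    ranked = sorted(
        candidate_embeddings.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [scene_id for scene_id, _ in ranked[:k]]


query = [1.0, 0.0]  # embedding of the query text scene graph (stand-in)
candidates = {  # embeddings of OSM scene graph candidates (stand-ins)
    "s1": [0.9, 0.1],
    "s2": [0.0, 1.0],
    "s3": [0.7, 0.7],
}
print(top_k_scenes(query, candidates, k=2))  # -> ['s1', 's3']
```

In practice such a lookup is done with a vector index rather than a full sort, which is consistent with the city-scale runtimes of a few seconds reported above.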
Examples of the experimental data. The data were generated from GPS coordinates (48.964117, 8.472481). Panels (a) and (b) are provided to facilitate comparison and understanding rather than being actual data used by GOTPR. The segmentations in (b) and (c) are included solely for visualization purposes to aid understanding, and the gray area in (f) represents the region overlapping with (e).
GPS samples collected in Toronto (⬤: included GPS, ⬤: excluded GPS).
(a) Street-view image (360 degree).
(b) OSM image (used for directional reference).
(c) Top-1 OSM (G.T.).
(d) Top-1 OSM scene graph (G.T.).
(e) Top-3 OSM.
(f) Top-3 OSM scene graph.
(g) Top-5 OSM.
(h) Top-5 OSM scene graph.
An example of place recognition results from GOTPR. The OSM scene graph that best matches the query text scene graph serves as the ground truth; in the example above it corresponds to the Top-1 OSM scene graph with the highest similarity. The results were generated from GPS coordinates (43.7594129828112, -79.46708294624517). The suffixes _n1,2 attached to some node labels in (a) are identifiers used to distinguish nodes sharing the same label. These identifiers are removed during the text scene graph generation process, resulting in the text scene graph shown in (b).
Street-view image (360 degree).
OSM image (used for directional reference).
User instruction | User-generated text description | Prompt for LLM (user-generated text to GOTPR format) | LLM-generated text description
@article{jung2025gotloc,
title={GOTLoc: General Outdoor Text-based Localization Using Scene Graph Retrieval with OpenStreetMap},
author={Jung, Donghwi and Kim, Keonwoo and Kim, Seong-Woo},
journal={arXiv preprint arXiv:2501.08575},
year={2025}
}