Symmetric Network with Spatial Relationship Modeling for Natural Language-based Vehicle Retrieval

CVPR Workshop 2022 AI-City Challenge

Posted by Haobo Chen on April 25, 2022

Chuyang Zhao*, Haobo Chen*, Wenyuan Zhang, Junru Chen, Sipeng Zhang, Yadong Li, and Boxun Li.

🏆 The 1st Place Solution for AICity2022 Challenge Track2: Natural Language-Based Vehicle Retrieval.

[paper] [slides] [github]



Natural language (NL) based vehicle retrieval aims to find a specific vehicle given a text description. Unlike image-based vehicle retrieval, NL-based vehicle retrieval must consider not only vehicle appearance but also the surrounding environment and temporal relations. In this paper, we propose a Symmetric Network with Spatial Relationship Modeling (SSM) method for NL-based vehicle retrieval. Specifically, we design a symmetric network to learn unified cross-modal representations between text descriptions and vehicle images, in which both vehicle appearance details and global trajectory information are preserved. In addition, to make better use of location information, we propose a spatial relationship modeling method that takes the surrounding environment and the mutual relationships between vehicles into consideration. Qualitative and quantitative experiments verify the effectiveness of the proposed method. We achieve 43.92% MRR accuracy on the test set of the 6th AI City Challenge natural language-based vehicle retrieval track, taking 1st place among all valid submissions on the public leaderboard. The code is available at
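For readers unfamiliar with the metric quoted above, here is a minimal sketch of how Mean Reciprocal Rank (MRR) is commonly computed for a retrieval task. The function name, the dictionary layout, and the toy query and track IDs are assumptions made for illustration; this is not the challenge's official evaluation script.

```python
from typing import Dict, List


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, str],
) -> float:
    """Average of 1/rank of the correct track over all queries.

    rankings:     query id -> track ids ordered from best to worst match
    ground_truth: query id -> the single correct track id
    A query whose correct track is missing from its ranking contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked_tracks in rankings.items():
        target = ground_truth[query_id]
        if target in ranked_tracks:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked_tracks.index(target) + 1)
    return total / len(rankings)


# Toy example with hypothetical ids: two text queries, three candidate tracks.
rankings = {
    "q1": ["t3", "t1", "t2"],  # correct track t1 is ranked 2nd
    "q2": ["t2", "t1", "t3"],  # correct track t2 is ranked 1st
}
ground_truth = {"q1": "t1", "q2": "t2"}
score = mean_reciprocal_rank(rankings, ground_truth)
print(f"MRR = {score:.4f}")
```

Under this definition, a higher MRR means the correct vehicle track tends to appear closer to the top of the ranked list returned for each text query.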

If you have any questions, please leave a message below~~