Datasets

Localization datasets

Below you will find general information about, and links to, the visual localization datasets. For more detailed documentation on how each dataset is organized, please refer to the dataset's accompanying readme file. The license terms and conditions are also laid out in the readme files.

Aachen Day-Night datasets

The Aachen Day-Night datasets, which are based on the original Aachen dataset, depict the old inner city of Aachen, Germany. The database images used to build the reference scene representations were all taken during daytime with hand-held cameras over a period of about two years. The datasets offer query images taken during the day and at night. All query images were taken with mobile phone cameras, i.e., the Aachen Day-Night datasets consider the scenario of localization using mobile devices, e.g., for Augmented or Mixed Reality. The nighttime query images were taken using software HDR to create (relatively) high-quality, well-illuminated images.

Features of the Aachen Day-Night dataset
Reference images: 4,328
Query images: 922 (824 daytime, 98 nighttime)

The dataset also provides additional daytime images that are not part of the reference scene representation. While no ground truth camera poses are provided for these images, they can be used as part of the benchmark.
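Ground-truth poses for these benchmarks combine a rotation and a translation, and localization accuracy is typically reported as a position error plus an orientation error against those reference poses. The snippet below is a minimal sketch of how such pose errors can be computed; it is not the benchmark's official evaluation code, and the (w, x, y, z) quaternion convention is an assumption.

```python
import math

def rotation_error_deg(q_est, q_gt):
    """Angular difference in degrees between two unit quaternions (w, x, y, z)."""
    dot = abs(sum(a * b for a, b in zip(q_est, q_gt)))  # |cos(theta / 2)|
    dot = min(1.0, dot)  # guard against rounding slightly above 1.0
    return math.degrees(2.0 * math.acos(dot))

def position_error_m(c_est, c_gt):
    """Euclidean distance between two camera centers (same unit as input)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c_est, c_gt)))

# A 90-degree rotation about the z-axis compared against the identity rotation:
q_id = (1.0, 0.0, 0.0, 0.0)
q_90z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
print(rotation_error_deg(q_id, q_90z))        # ~90.0
print(position_error_m((0, 0, 0), (3, 4, 0)))  # 5.0
```

Benchmarks of this kind usually bin results by thresholds on exactly these two quantities (e.g. a pose counts as correct within some meters and degrees).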

Features of the Aachen Day-Night v1.1 dataset
Reference images: 6,697
Query images: 1,015 (824 daytime, 191 nighttime)
Download Aachen Day-Night datasets

CMU-Seasons dataset

The CMU Seasons dataset, which is based on a subset of the CMU Visual Localization dataset by Badino et al., depicts urban, suburban, and park scenes in the area of Pittsburgh, USA. The reference and query images were captured by two front-facing cameras mounted on a car, pointing to the left/right of the vehicle at approximately 45 degrees. The images were recorded over a period of one year. One such traversal is used to define a reference condition and the reference scene representation. Other traversals, capturing different seasonal conditions, are used for query. All images were recorded in sequences. The CMU Seasons dataset represents an autonomous driving scenario, where it is necessary to localize images taken under varying seasonal conditions against a (possibly outdated) reference scene representation.

Features of the CMU Seasons dataset:
Reference images: 7,159
Query images: 75,335
Download CMU Seasons dataset

Extended CMU-Seasons dataset

This is an extended version of the CMU Seasons dataset, containing roughly 40% more images. Specifically, unlike in the CMU Seasons dataset, where camera poses were released for only a single condition, we here release roughly half of all camera poses for all conditions. The remaining half constitutes the private test set on which methods are evaluated. This dataset supersedes the old CMU Seasons dataset, and we strongly recommend that everyone use it instead of the now-deprecated CMU Seasons dataset.

Features of the Extended CMU Seasons dataset:
Reference images: 60,937
Query images: 56,613
Download Extended CMU-Seasons dataset

RobotCar Seasons

The RobotCar Seasons dataset, which is based on a subset of the RobotCar dataset, depicts the city of Oxford, UK. The reference and query images were captured by three synchronized cameras mounted on a car, pointing to the rear-left, rear, and rear-right, respectively. The images were recorded by driving the same route over a period of 12 months. One traversal is used to define a reference condition and the reference scene representation. Other traversals, covering different seasonal and illumination conditions, are used for query. All images were recorded in sequences. The RobotCar Seasons dataset represents an autonomous driving scenario, where it is necessary to localize images taken under varying seasonal conditions against a (possibly outdated) reference scene representation. In contrast to the CMU Seasons dataset, it also contains images taken at nighttime. Compared to the Aachen Day-Night dataset, the nighttime images of the RobotCar Seasons dataset exhibit significantly more motion blur and are of lower image quality.

Features of the RobotCar Seasons dataset:
Reference images: 26,121
Query images: 11,934
Download RobotCar Seasons dataset

InLoc Dataset

The InLoc dataset is designed for large-scale indoor localization and contains significant variation in appearance between the queries and the 3D database due to large viewpoint changes, moving furniture, occluders, and changing illumination. The dataset is composed of a database of RGBD images, geometrically registered to floor maps, augmented with a separate set of RGB query images taken by hand-held devices to make it suitable for the task of indoor localization [Taira, Okutomi, Sattler, Cimpoi, Pollefeys, Sivic, Pajdla, Torii. InLoc: Indoor Visual Localization with Dense Matching and View Synthesis. CVPR18]. The database consists of 9,972 perspective images generated from the indoor RGBD dataset of [Wijmans, Furukawa. Exploiting 2D floorplan for building-scale panorama RGBD alignment. CVPR17], which comprises 277 RGBD panoramic images obtained by scanning two buildings at Washington University in St. Louis. The query images consist of 329 photos taken with a smartphone camera (iPhone 7) and annotated with manually verified ground-truth 6DoF camera poses (reference poses) in the global coordinate system of the 3D map.

Features of the InLoc dataset:
Reference images: 9,972
Query images: 329
Download InLoc dataset (alternative link)

SILDa Weather and Time of Day Dataset

The SILDa Weather and Time of Day dataset, which is based on the Scape-Imperial Localisation Dataset (SILDa), represents localization in real-world conditions, using raw images from an entry-level spherical camera. This covers a wide range of high-end applications such as virtual reality, mapping, and robotics. The dataset was captured over a period of 12 months and covers 1.2 km of streets around Imperial College in London. Conditions include changes in weather (clear, snow, rain) and time of day (noon, dusk, night). For training, the clear-day and rainy-day conditions are used. For testing, the snow-day, clear-dusk, and clear-night conditions are used as query images.

Features of the SILDa Weather and Time of Day dataset:
Reference images (spherical): 8,334
Query images (spherical): 6,064
Download SILDa Weather and Time of Day dataset

Symphony Seasons Dataset

The Symphony Seasons dataset, which is based on a subset of the Symphony dataset, depicts the 1.3 km shore of Symphony Lake in Metz, France. The reference and query images were captured by a pan-tilt-zoom (PTZ) camera on an unmanned surface vehicle. The camera faces starboard as the boat moves along the shore while maintaining a constant distance. The boat was deployed on average every 10 days from January 6, 2014 to April 3, 2017. One traversal is used to define a reference condition. Other traversals, capturing different seasonal and illumination conditions, are used for queries. All images were recorded in sequences. The Symphony Seasons dataset represents an autonomous driving/monitoring scenario, where it is necessary to localize images taken under varying seasonal conditions against a (possibly outdated) reference scene representation. In contrast to the CMU Seasons and RobotCar Seasons datasets, it covers a wider range of lighting, weather, and seasonal conditions and contains much less texture and far fewer human-made features, which challenges existing localization methods.

Features of the Symphony Seasons dataset:
Reference images: 1,409
Query images: 135,966
Download Symphony Seasons dataset

Gangnam Station and Hyundai Department Store Datasets

The Gangnam Station and Hyundai Department Store datasets are part of the NAVER LABS localization datasets, a collection of 5 indoor datasets for visual localization in challenging real-world environments. They were captured in a large shopping mall and a large metro station in Seoul, South Korea, using a dedicated mapping platform consisting of 10 cameras and 2 laser scanners. To obtain accurate ground-truth camera poses, robust LiDAR SLAM was used to provide initial poses, which were then refined using a novel structure-from-motion-based optimization. The datasets are provided in the kapture format and contain about 130k images as well as 6DoF camera poses for training and validation. Sparse LiDAR-based depth maps are also provided for the training images.

The datasets can be downloaded from the NAVER LABS website and using the kapture dataset downloader.

Features of the Gangnam Station and Hyundai Department Store datasets:
Reference images: 21,054 (Gangnam Station) and 44,283 (Hyundai Department Store)
Query images: 6,122 (Gangnam Station) and 5,927 (Hyundai Department Store)
Download NAVER LABS datasets

ETH-Microsoft Dataset

The ETH-Microsoft dataset focuses on Augmented Reality scenarios. The data covers day and night illumination changes, large indoor and outdoor environments, and different sensor configurations for hand-held and head-mounted devices. The data was captured at the HG building of the ETH Zurich campus, both in the main halls and on the sidewalk, over the course of several months. This environment is challenging, as it exhibits many self-similarities and symmetric structures. The reference images were acquired by a NavVis M6 mobile scanner, while the query images were captured by phones and HoloLens 2 devices. More information at ETH-Microsoft Dataset.

Features of the ETH-Microsoft Dataset:
Reference images: 4,914 (6 × 819)
Query images: 300 single-image queries and 300 4-camera rigs
Download ETH-Microsoft Dataset
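Multi-camera rig queries like the ones above are usually handled by estimating a single pose for the rig and deriving each camera's pose from a fixed camera-to-rig extrinsic. The following is an illustrative sketch of that composition, not code from the dataset's tooling; the 4×4 world-to-camera matrix convention and all numbers are assumptions for the example.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose(rotation3x3, translation3):
    """Build a 4x4 rigid transform [R | t; 0 0 0 1]."""
    return ([rotation3x3[i] + [translation3[i]] for i in range(3)]
            + [[0.0, 0.0, 0.0, 1.0]])

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical setup: world-to-rig transform estimated by the localizer,
# and a fixed rig-to-camera extrinsic from calibration.
T_rig_world = pose(identity, [-2.0, 0.0, 0.0])  # world -> rig
T_cam_rig = pose(identity, [0.1, 0.0, 0.0])     # rig -> camera
# Chaining the transforms gives the per-camera world-to-camera pose.
T_cam_world = matmul4(T_cam_rig, T_rig_world)
print(T_cam_world[0][3])  # -1.9
```

Estimating one pose for the whole rig lets all four cameras contribute 2D-3D matches to a single localization problem, which is more robust than localizing each camera independently.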

Additional datasets

Cross-Seasons Correspondence Dataset

The two available Cross-Season Correspondence datasets are created from the Extended CMU-Seasons dataset and the RobotCar Seasons dataset, respectively. Each sample contains two nearby images taken under different seasonal or weather conditions, as well as a set of 2D-2D point correspondences between the images. The correspondences have been established automatically using geometric 3D consistency. The CMU Seasons Correspondence Dataset contains 28,766 image pairs from different seasonal conditions, while the Oxford RobotCar Correspondence Dataset contains 6,511 image pairs covering different seasonal and illumination conditions. For more details, see the corresponding article here.
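A geometric 3D consistency check of this kind can be thought of as: a 2D-2D match is accepted when both pixels lie close to the projections of a common 3D point in their respective cameras. The snippet below is a simplified, illustrative version of such a check under a pinhole camera model; the function names, camera parameters, and threshold are made up for the example, not taken from the dataset's generation pipeline.

```python
def project(point3d, R, t, f, cx, cy):
    """Project a world point with a pinhole camera: pixel = K (R X + t)."""
    X = [sum(R[i][j] * point3d[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * X[0] / X[2] + cx, f * X[1] / X[2] + cy)

def consistent(p1, p2, point3d, cam1, cam2, thresh_px=2.0):
    """Accept the match (p1, p2) if both pixels reproject the same 3D point."""
    def err(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    return (err(p1, project(point3d, *cam1)) <= thresh_px and
            err(p2, project(point3d, *cam2)) <= thresh_px)

I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cam1 = (I, [0.0, 0.0, 0.0], 500.0, 320.0, 240.0)   # reference camera at the origin
cam2 = (I, [-1.0, 0.0, 0.0], 500.0, 320.0, 240.0)  # second camera, center at (1, 0, 0)
point = [0.0, 0.0, 5.0]                            # 3D point 5 m in front of cam1
p1 = project(point, *cam1)  # (320.0, 240.0)
p2 = project(point, *cam2)  # (220.0, 240.0)
print(consistent(p1, p2, point, cam1, cam2))  # True
```

Match pairs passing such a check can then serve as training signal for condition-invariant local features, which is the intended use of these correspondence datasets.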

Download Correspondence dataset