Localisation Demo for Central Cambridge, United Kingdom

We have localised your input image on the map below! The blue arrow shows where we think you are. The image must have been taken within the blue highlighted region on the map.

This is the closest view in Google Maps Street View to the blue arrow.

Our system, PoseNet, estimates your location and orientation from a single colour image. It takes less than 2 ms to relocalise to within a few metres and a few degrees in large-scale urban scenes. This is more accurate than GPS, and it can also do things that GPS cannot, such as determining your orientation and operating indoors.
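The accuracy above is stated as a positional error in metres and an angular error in degrees. As a rough illustration of how these two errors can be computed for one prediction, the sketch below assumes a pose is given as a 3D position plus an orientation quaternion. The function names are hypothetical and are not part of the PoseNet code.

```python
import math


def position_error_m(p_true, p_pred):
    """Euclidean distance between two 3D positions (same units as the inputs)."""
    return math.dist(p_true, p_pred)


def orientation_error_deg(q_true, q_pred):
    """Smallest rotation angle, in degrees, between two orientation quaternions."""
    # Normalise both quaternions so their dot product is a valid cosine.
    n_true = math.sqrt(sum(c * c for c in q_true))
    n_pred = math.sqrt(sum(c * c for c in q_pred))
    dot = sum(a * b for a, b in zip(q_true, q_pred)) / (n_true * n_pred)
    # q and -q encode the same rotation, so use the absolute value,
    # and clamp to 1.0 to guard against floating-point rounding.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))
```

For example, a prediction that is 3 m east and 4 m north of the true position has a positional error of 5 m, whichever way the camera is facing.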

How does it work?

We have applied deep convolutional neural networks to camera pose regression. The system is simple: we train a single network end-to-end to regress camera pose directly from an image. Unlike other systems, ours does not require a large database or landmarks. Instead, it learns robust high-level features. It can deal with many different camera types, motion blur, weather, pedestrians and other distractions. It is highly scalable, requiring only a few MB of memory and less than 2 ms per image to relocalise within large urban scenes.
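To make "regressing camera pose" concrete, here is a minimal sketch of a training objective of the kind used for pose regression: a position error plus a weighted orientation error. It is written in the spirit of the loss described in the ICCV 2015 paper cited below, not copied from the released code. The quaternion representation, the name `pose_loss` and the default value of `beta` are assumptions made for illustration.

```python
import math


def pose_loss(x_true, q_true, x_pred, q_pred, beta=500.0):
    """Weighted sum of position error and orientation error for one image.

    x_* are 3D positions, q_* are orientation quaternions.
    beta balances the two terms; its default here is an assumed placeholder.
    """
    # Position term: Euclidean distance between true and predicted position.
    position_term = math.dist(x_true, x_pred)
    # A regressed quaternion is not guaranteed to have unit length,
    # so normalise it before comparing it with the ground truth.
    norm = math.sqrt(sum(c * c for c in q_pred))
    q_unit = [c / norm for c in q_pred]
    # Orientation term: Euclidean distance between the two quaternions.
    orientation_term = math.dist(q_true, q_unit)
    return position_term + beta * orientation_term
```

The weighting matters because position and orientation are measured in different units, so a single scalar has to trade one off against the other. A suitable value is likely to depend on the scale of the scene.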

The video to the right shows a technical summary of the system.

Deep Convolutional Neural Networks for Regression

Our system requires training data to learn to localise in an environment. We leverage transfer learning from large-scale classification datasets so that it can learn from relatively small amounts of training data. We show that the network learns a smooth function of camera pose that operates throughout the scene. This image shows training examples, testing examples and our resulting pose prediction in red.

Code and Dataset

PoseNet was trained with the Cambridge Landmarks Dataset. This is a large urban relocalisation dataset with 6 scenes from around Cambridge University containing over 12,000 images labelled with their full 6-DOF camera pose.

You can download the dataset from Cambridge University DSpace using the links below. A selection can be visualised online in your browser. Each scene contains the training and testing images (.png), the original video (.mp4), text files containing the camera poses (.txt) and the scene's reconstruction (.nvm, which can be opened with VisualSFM).

Scene                  Download Link    Visualisation
King's College         Download         Visualise
Street                 Download         Visualise
Old Hospital           Download
Shop Facade            Download
St Mary's Church       Download
Trinity Great Court    Download
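As a starting point for working with the pose text files, the sketch below reads image paths and 6-DOF poses into simple records. The line layout is an assumption, not something stated on this page: each data line is taken to hold an image path followed by seven numbers (position x, y, z, then an orientation quaternion w, x, y, z), and any line that does not fit that pattern is skipped as a header. Check the downloaded files before relying on it. `LabelledImage` and `parse_pose_lines` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class LabelledImage:
    path: str
    position: Tuple[float, ...]     # assumed order: x, y, z
    orientation: Tuple[float, ...]  # assumed order: quaternion w, x, y, z


def parse_pose_lines(lines: Iterable[str]) -> List[LabelledImage]:
    """Parse lines of the assumed form 'path x y z qw qx qy qz'."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 8:
            continue  # blank line or header text
        try:
            values = [float(v) for v in fields[1:]]
        except ValueError:
            continue  # eight tokens, but not seven numbers after the path
        records.append(
            LabelledImage(fields[0], tuple(values[:3]), tuple(values[3:]))
        )
    return records
```

A file could then be read with `parse_pose_lines(open(path))`, where `path` points to one of the .txt files in a downloaded scene.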

The PoseNet code and Cambridge Landmarks dataset are released for non-commercial research only. For commercial use, please contact us. If you find PoseNet useful, please cite our publications in your work.


Alex Kendall and Roberto Cipolla, "Geometric loss functions for camera pose regression with deep learning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

Alex Kendall and Roberto Cipolla, "Modelling Uncertainty in Deep Learning for Camera Relocalization." Proceedings of the International Conference on Robotics and Automation (ICRA), 2016.

Alex Kendall, Matthew Grimes and Roberto Cipolla, "PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization." Proceedings of the International Conference on Computer Vision (ICCV), 2015.

Get in touch: agk34@cam.ac.uk