Visual localization is essential for mobile robotics and augmented reality. However, most existing methods require hundreds of training images to perform well. 3D Gaussian Splatting, a recent technique, enables realistic novel view synthesis and offers a promising foundation for localization. We introduce LoGS, a hierarchical system that adapts Gaussian Splatting for few-shot localization. Our experiments show that LoGS achieves state-of-the-art accuracy with a limited number of training images and, in some cases, even outperforms previous methods trained in full-shot settings.
LoGS introduces an efficient pipeline for few-shot localization leveraging Gaussian Splatting. The system consists of two stages: map construction with limited training images, and robust online localization based on 1) geometric correspondence and 2) differentiable optimization. Details can be found in our paper.
The demo video shows qualitative results. We demonstrate the live localization performance of LoGS in an indoor environment with a few-shot pre-trained map.
7-Scenes Localization Results. Poses in the first table are evaluated against DSLAM ground truth, and those in the second table against SfM ground truth. Each cell reports the median pose error (cm / °). Red: best. Blue: second best.
Scene (DSLAM GT) | #Images (original training) | AS | HLoc | HSCNet | DSAC* | ACE | Ours | #Images (few-shot training) | HLoc (few-shot) | DSAC* (few-shot) | HSCNet (few-shot) | SP+Reg (few-shot) | FSRC (few-shot) | Ours (few-shot) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CHESS | 4000 | 3/0.87 | 2/0.85 | 2/0.7 | 2/1.10 | 2/0.7 | 2.0/0.62 | 20 | 4/1.42 | 3/1.16 | 4/1.42 | 4/1.28 | 4/1.23 | 3/1.00 |
FIRE | 2000 | 2/1.01 | 2/0.94 | 2/0.9 | 2/1.24 | 2/0.9 | 1.8/0.70 | 10 | 4/1.72 | 5/1.86 | 5/1.67 | 5/1.95 | 4/1.53 | 2/0.90 |
HEADS | 1000 | 1/0.82 | 1/0.75 | 1/0.9 | 1/1.82 | 1/0.6 | 1.0/0.64 | 10 | 4/1.59 | 4/2.71 | 3/1.76 | 3/2.05 | 2/1.56 | 2/0.99 |
OFFICE | 6000 | 4/1.15 | 3/0.92 | 3/0.8 | 3/1.15 | 3/0.8 | 2.4/0.69 | 30 | 5/1.47 | 9/2.21 | 9/2.29 | 7/1.96 | 5/1.47 | 4/1.13 |
PUMPKIN | 4000 | 7/1.69 | 5/1.30 | 4/1.0 | 4/1.34 | 4/1.1 | 4.0/1.03 | 20 | 8/1.70 | 7/1.68 | 8/1.96 | 7/1.77 | 7/1.75 | 7/1.85 |
REDKITCHEN | 7000 | 5/1.72 | 4/1.40 | 4/1.2 | 4/1.68 | 4/1.3 | 3.4/1.13 | 35 | 7/1.89 | 7/2.02 | 10/2.63 | 8/2.19 | 6/1.93 | 5/1.64 |
STAIRS | 2000 | 4/1.01 | 5/1.47 | 3/0.8 | 3/1.16 | 4/1.1 | 3.2/0.81 | 20 | 10/2.21 | 18/4.8 | 13/4.24 | 120/27.37 | 5/1.47 | 7/1.85 |
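Method families in the SfM table: Absolute Pose Regression (MS-Transf, Marepo, DFNet), Scene Coordinate Regression (DSAC*, ACE, GLACE), and Analysis-by-Synthesis (MCLoc, NeFeS, NeRFMatch, Ours). The final two columns give the few-shot image count and our few-shot result.
Scene (SfM GT) | #Images | MS-Transf | Marepo | DFNet | DSAC* | ACE | GLACE | MCLoc | NeFeS | NeRFMatch | Ours | #Images (few-shot) | Ours (few-shot) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|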
CHESS | 4000 | 11/6.4 | 1.9/0.83 | 3/1.1 | 0.5/0.17 | 0.5/0.18 | 0.6/0.18 | 2/0.8 | 2/0.8 | 0.9/0.3 | 0.4/0.10 | 20 | 0.5/0.16 |
FIRE | 2000 | 23/11.5 | 2.3/0.92 | 6/2.3 | 0.8/0.28 | 0.8/0.33 | 0.9/0.34 | 3/1.4 | 2/0.8 | 1.1/0.4 | 0.6/0.18 | 10 | 0.8/0.26 |
HEADS | 1000 | 13/13.0 | 2.1/1.24 | 4/2.3 | 0.5/0.34 | 0.5/0.33 | 0.6/0.34 | 3/1.3 | 2/1.4 | 1.5/1.0 | 0.5/0.26 | 10 | 0.7/0.48 |
OFFICE | 6000 | 18/8.1 | 2.9/0.93 | 6/1.5 | 1.2/0.34 | 1/0.29 | 1.1/0.29 | 4/1.3 | 2/0.6 | 3.0/0.8 | 0.7/0.22 | 30 | 1.2/0.34 |
PUMPKIN | 4000 | 17/8.4 | 2.5/0.88 | 7/1.9 | 1.2/0.28 | 1.2/0.28 | 1/0.22 | 5/1.6 | 2/0.6 | 2.2/0.6 | 0.7/0.22 | 20 | 1.1/1.29 |
REDKITCHEN | 7000 | 16/8.9 | 2.9/0.98 | 7/1.7 | 0.7/0.21 | 0.8/0.20 | 0.8/0.20 | 6/1.6 | 2/0.6 | 1.0/0.3 | 0.5/0.14 | 35 | 0.9/0.22 |
STAIRS | 2000 | 29/10.3 | 5.9/1.48 | 12/2.6 | 2.7/0.78 | 2.9/0.81 | 3.2/0.93 | 6/2.0 | 5/1.3 | 10.1/1.7 | 1.6/0.43 | 20 | 4.1/1.10 |
Cambridge Landmarks Localization Results. The cell content is median pose error (cm / °). Red: best. Blue: second best.
Scene (SfM) | #Images (original training) | AS | HLoc | SCRNet | HSCNet | DSAC* | NeRFMatch | Ours | #Images (few-shot training) | HLoc (few-shot) | DSAC* (few-shot) | HSCNet (few-shot) | SP+Reg (few-shot) | FSRC (few-shot) | Ours (few-shot) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GREATCOURT | 1531 | 24/0.13 | 16/0.11 | 125/0.6 | 28/0.2 | 49/0.3 | 17.5/0.1 | 12.7/0.09 | 16 | 72/0.27 | NA | NA | NA | 81/0.47 | 68/0.20 |
KINGS-COLLEGE | 1220 | 13/0.22 | 12/0.20 | 21/0.3 | 18/0.3 | 15/0.3 | 13.0/0.2 | 10.8/0.19 | 13 | 30/0.38 | 156/2.09 | 47/0.74 | 111/1.77 | 39/0.69 | 24/0.33 |
OLDHOSPITAL | 895 | 20/0.36 | 15/0.30 | 21/0.3 | 19/0.3 | 21/0.4 | 19.4/0.4 | 14.6/0.31 | 9 | 28/0.42 | 135/2.21 | 34/0.41 | 116/2.55 | 38/0.54 | 28/0.43 |
SHOPFACADE | 229 | 4/0.21 | 4/0.20 | 6/0.3 | 6/0.3 | 5/0.3 | 8.5/0.4 | 4.1/0.19 | 3 | 27/1.75 | NA | 22/1.27 | NA | 19/0.99 | 39/2.39 |
STMARYSCHURCH | 1487 | 8/0.25 | 7/0.21 | 16/0.5 | 9/0.3 | 13/0.4 | 7.9/0.3 | 6.9/0.20 | 15 | 25/0.76 | NA | 292/8.89 | NA | 31/1.03 | 22/0.67 |
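The median pose error reported in the 7-Scenes and Cambridge Landmarks tables can be illustrated with a minimal sketch. It is not taken from the LoGS code base: it assumes each pose is a pair of a 3x3 rotation matrix and a camera position in metres, and the function names and the `rot_z` demo helper are hypothetical.

```python
import math
from statistics import median


def translation_error_cm(t_est, t_gt):
    """Distance between two camera positions given in metres, returned in cm."""
    return 100.0 * math.dist(t_est, t_gt)


def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against round-off before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def median_pose_error(estimates, ground_truth):
    """Median translation (cm) and rotation (deg) error over (R, t) pose pairs."""
    if len(estimates) != len(ground_truth) or not estimates:
        raise ValueError("need two non-empty pose lists of equal length")
    t_errs = [translation_error_cm(te, tg)
              for (_, te), (_, tg) in zip(estimates, ground_truth)]
    r_errs = [rotation_error_deg(Re, Rg)
              for (Re, _), (Rg, _) in zip(estimates, ground_truth)]
    return median(t_errs), median(r_errs)


def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis, used only for the demo."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # Three made-up query poses compared against an identity ground truth.
    gt = [(rot_z(0.0), (0.0, 0.0, 0.0))] * 3
    est = [(rot_z(1.0), (0.02, 0.0, 0.0)),
           (rot_z(2.0), (0.04, 0.0, 0.0)),
           (rot_z(3.0), (0.10, 0.0, 0.0))]
    t_med, r_med = median_pose_error(est, gt)
    print(f"{t_med:.1f} cm / {r_med:.2f} deg")
```

The median is used rather than the mean so that a few badly localized queries do not dominate the reported number.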
LLFF and Mip-NeRF 360 Localization Results. Each cell reports accuracy in percent: the share of queries with translation error below 0.05 units / rotation error below 5°. Red: best. Blue: second best.
Dataset (SfM) | iNeRF (δs) | iComMa (δs) | iComMa (δm) | Ours | Ours (few-shot) |
---|---|---|---|---|---|
LLFF | 94.8/72.2 | 99.1/99.3 | 75.4/98.2 | 100/100 | 100/100 |
Mip-NeRF 360 | 85.6/79.6 | 86.7/90.6 | 68.8/74.8 | 100/100 | 94.7/99.9 |
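The threshold accuracy used for LLFF and Mip-NeRF 360 can be sketched in the same spirit. This is an illustrative helper, not the authors' evaluation script; it assumes the two numbers in each cell are computed independently (translation accuracy, then rotation accuracy), and the function name and default thresholds are chosen here to match the caption.

```python
def threshold_accuracy(t_errors, r_errors_deg, t_thresh=0.05, r_thresh_deg=5.0):
    """Percentage of queries under the translation and rotation thresholds.

    t_errors: per-query translation errors, in scene units.
    r_errors_deg: per-query rotation errors, in degrees.
    Returns (translation accuracy %, rotation accuracy %).
    """
    if not t_errors or len(t_errors) != len(r_errors_deg):
        raise ValueError("need two non-empty error lists of equal length")
    n = len(t_errors)
    # Strict inequalities, matching the "<0.05 unit / <5°" wording.
    t_acc = 100.0 * sum(e < t_thresh for e in t_errors) / n
    r_acc = 100.0 * sum(e < r_thresh_deg for e in r_errors_deg) / n
    return t_acc, r_acc


if __name__ == "__main__":
    # Made-up errors for four queries.
    t_acc, r_acc = threshold_accuracy([0.01, 0.04, 0.20, 0.06], [1.0, 6.0, 2.0, 3.0])
    print(f"{t_acc:.1f}/{r_acc:.1f}")
```

Unlike the median error, this metric rewards how many queries are localized acceptably, so a method with a low median error can still score poorly if a notable share of queries fails.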
If you found our work/code useful, please consider citing our publication:
@inproceedings{cheng2025logs,
  title = {LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images},
  author = {Cheng, Yuzhou and Jiao, Jianhao and Wang, Yue and Kanoulas, Dimitrios},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2025},
  organization = {IEEE}
}