FISB: Feature-based Image Stitching Benchmark

Image stitching — sewing multiple photographs into a single panorama — sounds like a solved problem. Your phone does it. Google Street View does it. But when I tried to evaluate different stitching algorithms systematically during my time at IIT Jodhpur, I found that there was no standard benchmark to compare them against. So we built one.

This work, done with my friend Abhishek Rajora, addresses that gap: a benchmarking study of classical feature-based stitching pipelines, accompanied by a custom dataset of 49 scenes.

The Pipeline

Feature-based stitching has four sequential stages, and the choice made at each stage affects every stage after it:

  1. Feature Detection — find interesting keypoints in each image (SIFT, AKAZE, BRISK, ORB)
  2. Feature Matching — match keypoints across image pairs (Brute Force, KNN, FLANN)
  3. Geometric Transformation — estimate the homography via RANSAC
  4. Blending — merge the warped images seamlessly (Alpha, Gaussian, MultiBand, Seamless Clone)

What We Found

More features ≠ better stitching

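ORB produced 11,876 matches, roughly three times SIFT’s 3,707. And yet its SSIM was 0.52 against SIFT’s 0.96, and its PSNR was 8.87 against 22.26. Most of ORB’s matches are redundant or noisy, so the larger count does not translate into a better stitch. ORB’s speed comes with a heavy accuracy cost.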

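| Detector | Matches | SSIM | PSNR (dB) |
|----------|---------|------|-----------|
| SIFT     | 3707    | 0.96 | 22.26     |
| AKAZE    | 4210    | 0.89 | 20.62     |
| BRISK    | 3798    | 0.92 | 18.54     |
| ORB      | 11876   | 0.52 | 8.87      |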
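
For readers unfamiliar with the PSNR column, here is a minimal sketch of the standard definition, 10·log10(MAX² / MSE). How the benchmark obtained its reference images, and whether it used a library implementation, is not stated here, so the function name and the 8-bit `max_value` default are assumptions for illustration.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)  # mean squared error per pixel
    if mse == 0:
        return float("inf")                 # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher is better: a stitched result that is off by a small amount at every pixel still scores well, while misaligned structure drives the mean squared error up and the PSNR down.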

Seamless Clone wins decisively

Among blenders, Seamless Clone was the clear winner — and for an interesting reason. Unlike Alpha or Gaussian blending (which operate in pixel intensity space), Seamless Clone works in the gradient domain. It preserves edge transitions even when the images have different lighting conditions.

| Blender        | SSIM | PSNR (dB) |
|----------------|------|-----------|
| Alpha          | 0.55 | 7.98      |
| Gaussian       | 0.49 | 12.23     |
| MultiBand      | 0.92 | 18.49     |
| Seamless Clone | 0.96 | 22.26     |
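
To see why working in the gradient domain helps with lighting differences, consider a one-dimensional toy version. This is not the benchmark's code and not OpenCV's implementation; `poisson_blend_1d` is a hypothetical helper that solves the discrete Poisson equation on a short signal, so that the pasted region keeps the source's gradients while meeting the target's values at both ends.

```python
import numpy as np


def poisson_blend_1d(target, source, start, stop):
    """Paste source[start:stop] into target by matching gradients, not intensities.

    Requires 1 <= start < stop <= len(target) - 1 so both boundary samples exist.
    """
    target = np.asarray(target, dtype=np.float64)
    source = np.asarray(source, dtype=np.float64)
    n = stop - start

    # Discrete Laplacian of the source inside the pasted region.
    lap = (source[start - 1:stop - 1] - 2.0 * source[start:stop]
           + source[start + 1:stop + 1])

    # Tridiagonal system for f[i-1] - 2 f[i] + f[i+1] = lap[i].
    A = np.zeros((n, n))
    np.fill_diagonal(A, -2.0)
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0

    # Known boundary values from the target move to the right-hand side.
    b = lap.copy()
    b[0] -= target[start - 1]
    b[-1] -= target[stop]

    out = target.copy()
    out[start:stop] = np.linalg.solve(A, b)
    return out


# The source is 80 levels brighter than the target and has a small bump.
target = np.full(10, 100.0)
source = np.full(10, 180.0)
source[4] = 190.0

alpha = 0.5 * target[3:7] + 0.5 * source[3:7]       # intensity-space blend
poisson = poisson_blend_1d(target, source, 3, 7)[3:7]
# alpha sits at 140 and 145, a visible step against the surrounding 100;
# poisson comes out as about [100, 110, 100, 100]: bump kept, offset gone.
```

The alpha blend averages the brightness offset into the result, leaving a step at the seam. The gradient-domain solution keeps the bump, which is the actual detail, and discards the constant offset.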

FLANN trades accuracy for speed — unfavorably at small scale

FLANN uses approximate nearest neighbors (KD-tree for SIFT, LSH for ORB). At small dataset scale, the approximation costs more in quality than it saves in time. For real-time applications at scale it remains useful, but for a careful offline pipeline, Brute Force matching with SIFT was optimal.

The Dataset

No public benchmark existed for image stitching at the time. We created a 49-scene dataset covering:

  • Rotation, perspective shifts, zoom, viewpoint change, illumination variation
  • Both real and digitally synthesized images
  • Indoor and outdoor scenes

We also evaluated on Google Landmarks. Dataset available here.

Optimal Configuration (small dataset, offline use)

SIFT + Brute Force Matcher + Seamless Clone Blending

MultiBand is a strong runner-up. Full report here — GitHub repo here.

This post is licensed under CC BY 4.0 by the author.