FISB: Feature-based Image Stitching Benchmark

Posted Apr 1, 2023

[Figure: feature-based image stitching pipeline]

By Abu Shahid

Image stitching — sewing multiple photographs into a single panorama — sounds like a solved problem. Your phone does it. Google Street View does it. But when I tried to evaluate different stitching algorithms systematically during my time at IIT Jodhpur, I found that there was no standard benchmark to compare them against. So we built one.

This work, done with my friend Abhishek Rajora, addresses that gap: a benchmarking study of classical feature-based stitching pipelines, accompanied by a custom dataset of 49 scenes.

The Pipeline

Feature-based stitching has four sequential stages. Each stage consumes the output of the one before it, so a weak choice early on carries through to the final panorama:

Feature Detection — find interesting keypoints in each image (SIFT, AKAZE, BRISK, ORB)
Feature Matching — match keypoints across image pairs (Brute Force, KNN, FLANN)
Geometric Transformation — estimate the homography via RANSAC
Blending — merge the warped images seamlessly (Alpha, Gaussian, MultiBand, Seamless Clone)
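To make stage 3 concrete, here is a minimal sketch of RANSAC homography estimation in plain Python. It is an illustration under stated assumptions, not the code used in the study. The 4-point solver fixes h33 = 1, and the function names, the 3-pixel threshold, the iteration count and the synthetic data are all made up for this example. In practice a library routine such as OpenCV's `cv2.findHomography` with the `cv2.RANSAC` flag would do this job.

```python
import math
import random


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting; None if (near) singular."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-9:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def homography_from_four(pairs):
    """Solve for H (flat list of 9 values, h33 fixed to 1) from four matches."""
    A, b = [], []
    for (x, y), (u, v) in pairs:
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return None if h is None else h + [1.0]


def project(H, point):
    """Apply homography H to a 2D point; None if the point maps to infinity."""
    x, y = point
    w = H[6] * x + H[7] * y + H[8]
    if abs(w) < 1e-12:
        return None
    return ((H[0] * x + H[1] * y + H[2]) / w,
            (H[3] * x + H[4] * y + H[5]) / w)


def reprojection_error(H, match):
    """Distance in pixels between the projected source and its matched point."""
    src, dst = match
    p = project(H, src)
    if p is None:
        return float("inf")
    return math.hypot(p[0] - dst[0], p[1] - dst[1])


def ransac_homography(matches, iterations=500, threshold=3.0, seed=0):
    """Return (H, inliers) for the largest consensus set found."""
    rng = random.Random(seed)
    best_H, best_inliers = None, []
    for _ in range(iterations):
        H = homography_from_four(rng.sample(matches, 4))
        if H is None:
            continue  # degenerate sample, e.g. three collinear points
        inliers = [m for m in matches if reprojection_error(H, m) < threshold]
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    return best_H, best_inliers


# Synthetic check: 20 matches that follow a known homography, plus 8 gross outliers.
TRUE_H = [1.0, 0.02, 30.0, -0.01, 1.0, 5.0, 1e-4, 0.0, 1.0]
grid = [(float(x), float(y)) for x in range(0, 500, 100) for y in range(0, 400, 100)]
good = [(p, project(TRUE_H, p)) for p in grid]
bad = [((50.0 + 40 * i, 30.0 + 35 * i), (400.0 - 45 * i, 20.0 + (17 * i * i) % 300))
       for i in range(8)]
H, inliers = ransac_homography(good + bad)
print(len(inliers), "inliers out of", len(good) + len(bad))
```

The point of the consensus step is that a handful of bad matches cannot drag the estimate: a model fitted to a sample containing an outlier explains few other matches and is simply discarded.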

What We Found

More features ≠ better stitching

ORB produced 11,876 matches, roughly three times SIFT's 3,707. And yet its SSIM was 0.52 against SIFT's 0.96. Most of ORB's matches are redundant or noisy, so its speed comes with a heavy accuracy cost.

Detector	Matches	SSIM	PSNR
SIFT	3707	0.96	22.26
AKAZE	4210	0.89	20.62
BRISK	3798	0.92	18.54
ORB	11876	0.52	8.87
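For readers who want a feel for what the PSNR column means, the usual definition is PSNR = 10 · log10(MAX² / MSE), in decibels. The sketch below is a small stand-alone illustration, not the evaluation code from the study. It assumes 8-bit grayscale images (MAX = 255) stored as nested lists, and the helper `rmse_for_psnr` is a name I made up for inverting the formula. How the reference image was chosen and how color was handled in the benchmark is not covered here.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equal-sized grayscale images.

    Images are nested lists of intensities; max_value is the largest possible
    intensity (255 for 8-bit images, which is an assumption of this sketch).
    """
    ref = [p for row in reference for p in row]
    out = [p for row in test for p in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_for_psnr(psnr_db, max_value=255.0):
    """Invert the formula: the RMS pixel error implied by a PSNR value."""
    return max_value / (10.0 ** (psnr_db / 20.0))


# A uniform error of 10 gray levels on an 8-bit image.
reference = [[100, 100], [100, 100]]
shifted = [[110, 110], [110, 110]]
print(round(psnr(reference, shifted), 2))  # 28.13
print(round(rmse_for_psnr(22.26), 1))      # 19.7
print(round(rmse_for_psnr(8.87), 1))       # 91.8
```

Under the 8-bit assumption, SIFT's 22.26 dB corresponds to an RMS error of about 20 gray levels, while ORB's 8.87 dB corresponds to about 92, which is more than a third of the full intensity range.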

Seamless Clone wins decisively

Among blenders, Seamless Clone was the clear winner — and for an interesting reason. Unlike Alpha or Gaussian blending (which operate in pixel intensity space), Seamless Clone works in the gradient domain. It preserves edge transitions even when the images have different lighting conditions.

Blender	SSIM	PSNR
Alpha	0.55	7.98
Gaussian	0.49	12.23
MultiBand	0.92	18.49
Seamless Clone	0.96	22.26
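The difference between intensity-space and gradient-domain blending is easy to see on a single scanline. The toy below is a 1D sketch, assuming that Seamless Clone refers to Poisson-style blending as in OpenCV's `cv2.seamlessClone`. It is not the 2D solver a real blender uses, and the signal values and function names are invented for the illustration. Two "images" show the same scene, but the source is 40 gray levels brighter.

```python
def alpha_blend(dest, source, start, end, alpha=0.5):
    """Mix intensities directly for the samples strictly between start and end."""
    out = [float(v) for v in dest]
    for i in range(start + 1, end):
        out[i] = alpha * source[i] + (1 - alpha) * dest[i]
    return out


def gradient_domain_blend(dest, source, start, end, sweeps=500):
    """1D Poisson-style blend solved with Gauss-Seidel sweeps.

    Between start and end the result keeps the second differences of `source`;
    at the boundary samples start and end it is pinned to `dest`.
    """
    out = [float(v) for v in dest]
    for i in range(start + 1, end):
        out[i] = float(source[i])  # begin from a naive paste
    for _ in range(sweeps):
        for i in range(start + 1, end):
            laplacian = source[i - 1] - 2 * source[i] + source[i + 1]
            out[i] = (out[i - 1] + out[i + 1] - laplacian) / 2.0
    return out


def max_error(result, truth):
    """Largest absolute deviation from the ground-truth scanline."""
    return max(abs(a - b) for a, b in zip(result, truth))


scene = [50, 52, 55, 60, 70, 90, 120, 130, 133, 135, 136, 137]
dest = scene                        # first exposure
source = [v + 40 for v in scene]    # same scene, 40 levels brighter
start, end = 2, 9

blended_alpha = alpha_blend(dest, source, start, end)
blended_grad = gradient_domain_blend(dest, source, start, end)
print(max_error(blended_alpha, scene))
print(max_error(blended_grad, scene))
```

Alpha blending averages the two exposures, so half of the brightness offset survives as a visible step at the seam. The gradient-domain version only copies how the source changes from sample to sample, so a constant exposure offset drops out.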

FLANN trades accuracy for speed — unfavorably at small scale

FLANN performs approximate nearest-neighbor search (KD-trees for float descriptors such as SIFT, LSH for binary descriptors such as ORB). On a dataset this small, the approximation cost more in match quality than it saved in time. FLANN remains useful for real-time applications at scale, but for a careful offline pipeline, Brute Force matching with SIFT gave the best results in our tests.
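For reference, exhaustive matching is simple to write down, which also shows where its cost comes from. The sketch below is plain Python with toy 2D descriptors. The ratio test with a 0.75 threshold is a common filtering step that I am assuming here, not something taken from the benchmark configuration, and `knn_ratio_match` is a made-up name.

```python
import math


def knn_ratio_match(desc_a, desc_b, ratio=0.75):
    """Brute-force 2-nearest-neighbor matching with a ratio test.

    Every descriptor in desc_a is compared against every descriptor in
    desc_b, so the cost is len(desc_a) * len(desc_b) distance computations.
    A match is kept only if the best distance is clearly smaller than the
    second best.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs two candidates
        (best, j_best), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j_best, best))
    return matches


# Toy descriptors: the third query sits exactly between two candidates.
desc_a = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (10.0, 10.2), (4.0, 4.0), (6.0, 6.0)]
matches = knn_ratio_match(desc_a, desc_b)
print([(i, j) for i, j, _ in matches])  # [(0, 0), (1, 1)]
```

An approximate index avoids the full inner loop over `desc_b`, at the risk of returning a neighbor that is not the true nearest one. With only a few thousand keypoints per image, the exhaustive loop is affordable.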

The Dataset

No public benchmark existed for image stitching at the time. We created a 49-scene dataset covering:

Rotation, perspective shifts, zoom, viewpoint change, illumination variation
Both real and digitally synthesized images
Indoor and outdoor scenes

We also evaluated on Google Landmarks. Dataset available here.

Optimal Configuration (small dataset, offline use)

SIFT + Brute Force Matcher + Seamless Clone Blending

MultiBand is a strong runner-up. Full report here — GitHub repo here.

This post is licensed under CC BY 4.0 by the author.