Project 1: Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection
[View Assignment]
Overview
The goal of this project is to take digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. Each glass plate contains three grayscale exposures (Blue, Green, Red). The task is to extract these channels, align them correctly, and combine them into a single RGB image.
Approach
Using the provided skeleton Python code, I first imported the images, converted them to floats, and separated each color channel (by thirds) before alignment. Then, I implemented two main alignment strategies: a naive exhaustive search for smaller images and a pyramid-based coarse-to-fine search for larger images, with additional preprocessing steps to improve alignment on difficult images (e.g., emir.tif, whose color channels differ strongly in brightness).
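A minimal sketch of that loading/splitting step (assuming skimage for I/O; the filename and variable names here are illustrative, not taken from the skeleton):

```python
import skimage.io as skio
from skimage import img_as_float

# Read the stacked glass-plate scan and convert to floats in [0, 1].
im = img_as_float(skio.imread('cathedral.jpg'))

# Each plate stacks the three exposures vertically: B on top, then G, then R.
height = im.shape[0] // 3
b = im[:height]
g = im[height:2 * height]
r = im[2 * height:3 * height]
```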
Naive Alignment
The naive approach exhaustively searches over a fixed displacement window ([-15, 15] pixels in each direction) and evaluates each shift with a similarity metric. I experimented with two metrics (both sketched below):
- L2 norm (sum of squared differences): works on simple cases but is sensitive to brightness differences between channels. I used this metric initially for the smaller images.
- Normalized Cross-Correlation (NCC): normalizes each channel's intensity before comparison, which produced better results on most images.
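As a rough sketch (the function names are mine, not the skeleton's), the two metrics can be written as:

```python
import numpy as np

def ssd(a, b):
    # L2-style score: sum of squared pixel differences (lower is better).
    return np.sum((a - b) ** 2)

def ncc(a, b):
    # Normalized cross-correlation: zero-mean, unit-norm dot product (higher is better).
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
```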
To improve accuracy, I cropped 20% of the image borders before scoring to avoid artifacts from dark borders and the wrap-around introduced by shifting. This approach immediately worked for the smaller images (e.g., cathedral.jpg, monastery.jpg, tobolsk.jpg).
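A sketch of the exhaustive search under those choices, reusing the ncc helper above and scoring only the cropped interior (the 20% crop and the [-15, 15] window follow the description above; the helper names are illustrative):

```python
import numpy as np

def crop_interior(im, frac=0.2):
    # Keep only the central region so borders and wrap-around don't affect the score.
    h, w = im.shape
    dh, dw = int(h * frac), int(w * frac)
    return im[dh:h - dh, dw:w - dw]

def align_naive(channel, reference, window=15):
    # Try every (dy, dx) shift in [-window, window]^2 and keep the best-scoring one.
    best_score, best_shift = -np.inf, (0, 0)
    ref_crop = crop_interior(reference)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop_interior(shifted), ref_crop)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```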
Pyramid Alignment
For larger .tif images, the naive approach became too slow. I implemented an image pyramid (sketched after the list):
- Recursively downsample the image by a factor of 2 until it is small enough.
- Align at the smallest resolution using the naive method.
- Scale the displacement estimate back up and refine locally (using a smaller refine_window) at each higher level.
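A coarse-to-fine sketch of that recursion (using skimage.transform.rescale for downsampling and the align_naive search above; the cutoff size and refine_window value here are illustrative):

```python
import numpy as np
import skimage.transform as sktr

def align_pyramid(channel, reference, min_size=400, refine_window=3):
    # Base case: the image is small enough for the full exhaustive search.
    if max(channel.shape) <= min_size:
        return align_naive(channel, reference, window=15)

    # Recurse on half-resolution copies to get a coarse displacement estimate.
    coarse = align_pyramid(sktr.rescale(channel, 0.5, anti_aliasing=True),
                           sktr.rescale(reference, 0.5, anti_aliasing=True),
                           min_size, refine_window)

    # Scale the estimate back up, apply it, then refine with a small local search.
    dy, dx = 2 * coarse[0], 2 * coarse[1]
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    rdy, rdx = align_naive(shifted, reference, window=refine_window)
    return dy + rdy, dx + rdx
```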
Preprocessing (Edge Maps)
While pyramid + NCC worked well for almost all images, emir.tif was problematic due to its brightly colored blue robe. The robe is very bright in the blue channel but appears much darker in red, misleading intensity-based metrics.
To address this, I implemented a simple edge detector using gradient magnitude (finite differences in x and y). Aligning on edges rather than raw intensities improved the result, since structural features (e.g., the robe outline) are consistent across channels even when brightness is not. Note that this preprocessing step worked for all images except icon.tif, possibly because of the many tiny edges/details in that image.
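A sketch of that gradient-magnitude preprocessing (np.gradient computes the finite differences; alignment is then run on these edge maps instead of the raw channels):

```python
import numpy as np

def edge_map(im):
    # Gradient magnitude from finite differences along y and x.
    gy, gx = np.gradient(im)
    return np.sqrt(gx ** 2 + gy ** 2)

# Illustrative usage: align the green channel's edge map to the blue channel's.
# g_shift = align_pyramid(edge_map(g), edge_map(b))
```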
Problems
Without cropping, the algorithm often misaligned due to black/white borders and wrap-around effects. I incorporated a 20% border crop so that only the center of the image is used for alignment, avoiding these artifacts.
For emir.tif, I needed edge-based preprocessing to handle the strong color differences between channels: I computed the gradient magnitude to produce an "edge detector" map of each channel and aligned on those maps instead of raw intensities.
Initially, alignment took several minutes for .tif images. Moving the cropping outside the inner loops and using a smaller refine_window at higher resolutions reduced runtime to under a minute.
Results
Green (G) and red (R) channel displacements relative to the blue (B) channel; before and after images are shown for each result.
cathedral.jpg: G aligned to B (-5, -2), R aligned to B (-12, -3)
monastery.jpg: G aligned to B (3, -2), R aligned to B (-3, -2)
tobolsk.jpg: G aligned to B (-3, -3), R aligned to B (-7, -3)
church.tif: G aligned to B (-25, -4), R aligned to B (-58, 4)
harvesters.tif: G aligned to B (-60, -18), R aligned to B (-124, -13)
icon.tif: G aligned to B (-41, -17), R aligned to B (-90, -23)
italil.tif: G aligned to B (-38, -22), R aligned to B (-77, -36)
lastochikino.tif: G aligned to B (3, 1), R aligned to B (-75, 8)
lugano.tif: G aligned to B (-41, 17), R aligned to B (-92, 29)
melons.tif: G aligned to B (-80, -10), R aligned to B (-177, -13)
self_portrait.tif: G aligned to B (-78, -29), R aligned to B (-175, -37)
siren.tif: G aligned to B (-49, 6), R aligned to B (-96, 24)
three_generations.tif: G aligned to B (-53, -13), R aligned to B (-111, -9)
emir.tif: G aligned to B (-49, -24), R aligned to B (-107, -40)
capri.tif: G aligned to B (-28, 17), R aligned to B (-65, 34)
milan.tif: G aligned to B (-57, -13), R aligned to B (-125, -24)
painting.tif: G aligned to B (-30, -1), R aligned to B (-70, -6)