Aligning RGB images and then processing the results

Overview of project

In this project, I developed three algorithms that approach image alignment in similar ways. For the smaller JPEG images, which have fewer pixels and smaller displacements, I computed the Mean Squared Error (MSE) between channels over the middle part of the image (to ignore the edges) and shifted the image one pixel at a time to find the best alignment. For larger images this was not effective: the displacement can be far more than 15 pixels, so the search takes much longer, and there are too many pixels to compare efficiently. I therefore compressed the image using convolutions with 3x3 kernels and a stride of three, that is, average pooling. Matching the stride to the kernel size compresses the image effectively while using every pixel exactly once. This was done twice, compressing the image to 1/81 the size of the original. After that, I found the best alignment using the same algorithm as for the JPEGs.
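The single-scale search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the 60% central window, the default search radius of 15 pixels, and the use of `np.roll` (which wraps pixels around the border) are all assumptions made for the example.

```python
import numpy as np

def center_crop(img, frac=0.6):
    """Return the central `frac` portion of a 2-D array (assumed window size)."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]

def mse(a, b):
    """Mean squared error between two equally sized arrays."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def align_single_scale(moving, reference, max_shift=15):
    """Try every integer shift in [-max_shift, max_shift] and keep the best.

    Returns (dx, dy): the shift to apply to `moving` so it matches `reference`.
    """
    best_shift, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps around the border; the central crop hides that.
            shifted = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            err = mse(center_crop(shifted), center_crop(reference))
            if err < best_err:
                best_shift, best_err = (dx, dy), err
    return best_shift
```

The cost grows with the square of the search radius times the number of pixels, which is why this brute-force search is only practical for the small JPEGs.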

Moving up a level in the pyramid, I shifted the image based on the findings at the lower level and repeated the search. At this level the image was 1/9 the size of the original, which resulted in a better match. The final shift was then done on the original image to get the best possible alignment. For some images, pixel-based alignment was not sufficient, so I developed an algorithm based on edges and features instead, using a Sobel filter followed by the same pyramid alignment procedure. This worked well, and the results are presented below.
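A compact sketch of this coarse-to-fine procedure is given below. It is an illustration under assumptions rather than the original implementation: the helper names, the search radii (15 at the coarsest level, 2 for each refinement), and the central comparison window are hypothetical choices.

```python
import numpy as np

def avg_pool3(img):
    """Average non-overlapping 3x3 blocks (3x3 kernel, stride 3)."""
    h, w = (img.shape[0] // 3) * 3, (img.shape[1] // 3) * 3
    return img[:h, :w].reshape(h // 3, 3, w // 3, 3).mean(axis=(1, 3))

def search(moving, reference, radius):
    """Best integer (dx, dy) by MSE over the central part of the image."""
    h, w = reference.shape
    ys, xs = slice(h // 5, h - h // 5), slice(w // 5, w - w // 5)
    best, best_err = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = np.roll(moving, (dy, dx), axis=(0, 1))[ys, xs] - reference[ys, xs]
            err = np.mean(diff ** 2)
            if err < best_err:
                best, best_err = (dx, dy), err
    return best

def pyramid_align(moving, reference, levels=2, coarse_radius=15, refine_radius=2):
    """Coarse-to-fine alignment; returns the total (dx, dy) at full resolution."""
    pyramid = [(moving.astype(np.float64), reference.astype(np.float64))]
    for _ in range(levels):
        m, r = pyramid[-1]
        pyramid.append((avg_pool3(m), avg_pool3(r)))
    dx, dy = 0, 0
    for i, (m, r) in enumerate(reversed(pyramid)):  # coarsest level first
        if i > 0:
            dx, dy = dx * 3, dy * 3  # one coarse pixel equals three finer pixels
        radius = coarse_radius if i == 0 else refine_radius
        pre_shifted = np.roll(m, (dy, dx), axis=(0, 1))
        ddx, ddy = search(pre_shifted, r, radius)
        dx, dy = dx + ddx, dy + ddy
    return dx, dy
```

With two pooling levels, a 15-pixel search at the coarsest level covers roughly 135 pixels of displacement at full resolution, while the finer levels only need to correct the rounding left over from the level below.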

After the image alignment, I also made some automatic refinements to enhance the final image. First, I cropped the image to remove the edges of the frame, which often contain unwanted color artifacts or white space. This was achieved by removing rows and columns with low variance compared to the rest of the image, so that only the significant portion of the image is retained. Finding a good minimum-variance threshold was difficult, since it varied from picture to picture, so I settled on removing rows and columns whose variance is lower than 25% of the variance of the entire picture. This worked best for edge removal in the general case. After cropping, I boosted the contrast using CLAHE (Contrast Limited Adaptive Histogram Equalization), as recommended in the course literature. Unlike the Gaussian filter I tried at first, which blurs the image, CLAHE enhances it by adjusting the contrast in small sections, making details stand out more in low-contrast areas. It also limits the contrast boost to avoid over-enhancing, using a clipLimit parameter to control how much contrast is added in each part. This keeps the overall contrast balanced while highlighting finer details. Finally, I made sure the pixel values stay within a valid range to preserve the color and look of the image.
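The variance-based crop could look something like the sketch below. The function name is hypothetical, and the sketch assumes that only the outer border is trimmed (low-variance rows or columns in the interior are kept) and that color images are averaged over channels before the variance is measured. Neither detail is stated above.

```python
import numpy as np

def crop_low_variance(img, ratio=0.25):
    """Trim border rows/columns whose variance is below ratio * global variance."""
    # Assumption: measure variance on the channel average for color images.
    gray = img.mean(axis=2) if img.ndim == 3 else img
    threshold = ratio * gray.var()
    rows = np.flatnonzero(gray.var(axis=1) >= threshold)
    cols = np.flatnonzero(gray.var(axis=0) >= threshold)
    if rows.size == 0 or cols.size == 0:
        return img  # nothing passes the threshold; leave the image untouched
    # Keep everything between the first and last row/column that passes.
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

Because the threshold is relative to each picture's own variance, the same `ratio` can be reused across images, at the cost of cropping some pictures more aggressively than others.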

TIF Image Before Processing and Alignment

Image before processing and alignment

The image above shows the original image before processing and alignment without shifts.

New image for comparison

The image above is the original picture that we start with: the three channels for red, green, and blue before alignment.

TIF Image After Alignment and Processing

Image after alignment and processing

The image above shows the processed image after applying the pyramid alignment function.

Image after alignment and processing

The image above shows the processed image after applying the alignment and image enhancement algorithms. Notice how most of the edges are removed and the contrast is increased.

Sobel filter when pixel comparison does not work

When plain pixel comparison does not work, as in the case of Emir where the images differ in intensity, we have to try another approach to align them. Instead of pixel values we compare the edges in the pictures, which first have to be extracted. This is done by passing the image through a Gaussian filter and then a Sobel filter, which produces an image highlighting the edges and depends far less on the intensity of the pictures. Once the edge-focused images are extracted, I run the pyramid alignment algorithm on them. It aligns the pictures well, although not as perfectly as the pyramid alignment on the processed images. The image below illustrates the result of passing an image through the Sobel filter.

TIF Image After Passing Through the Sobel Filter

Image after alignment and processing
Green Emir image

The images above show the preprocessed image after applying the Sobel filter (left) and the green channel of the image (right). Notice how the edges in the left image are lit up, which helps during alignment.
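The edge extraction used in this section can be sketched with plain NumPy as below. This is an assumed, simplified version: the Gaussian width (`sigma=1.0`, radius 2), the edge padding, and all function names are illustrative choices, and a library routine could be used instead.

```python
import numpy as np

def gaussian_kernel1d(sigma=1.0, radius=2):
    """Normalized 1-D Gaussian kernel."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def filter_rows(img, kernel):
    """Correlate every row with a 1-D kernel, using edge padding."""
    r = len(kernel) // 2
    padded = np.pad(img, ((0, 0), (r, r)), mode="edge")
    return sum(k * padded[:, i:i + img.shape[1]] for i, k in enumerate(kernel))

def sobel_edges(img, sigma=1.0):
    """Gaussian blur followed by the Sobel gradient magnitude."""
    img = img.astype(np.float64)
    g = gaussian_kernel1d(sigma)
    smooth = filter_rows(filter_rows(img, g).T, g).T  # blur along x, then y
    # Sobel is separable: [1, 2, 1] smoothing across, [-1, 0, 1] derivative along.
    s, d = np.array([1.0, 2.0, 1.0]), np.array([-1.0, 0.0, 1.0])
    gx = filter_rows(filter_rows(smooth, d).T, s).T  # horizontal gradient
    gy = filter_rows(filter_rows(smooth.T, d).T, s)  # vertical gradient
    return np.hypot(gx, gy)
```

The gradient removes any constant brightness offset between channels, which is one reason edge images can be easier to match than raw pixel values. The resulting edge maps can then be fed to the same pyramid alignment in place of the original channels.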

TIF Image after contrast enhancement

Image after alignment and processing
Aligned church

The images above show the processed image after applying the CLAHE contrast enhancement algorithm and the automatic edge crop (left) and the stacked image without processing (right). Notice how the contrast is increased and some of the edges surrounding the picture have disappeared. Some edges remain because part of the image is still "hiding" under them, so the variance stays high enough for those rows and columns to be kept. The number in the top right corner is also part of the edge, but it gives its columns enough variance to avoid being cropped. I think this is good, since that part is valuable information for the image. The pink edge at the top and the bottom edge are removed, which is a good removal since they contribute no information about the picture. I wanted to keep all parameters constant, so some images show more edges than others after the automatic cropping: the variance differs a lot between images, while the edges are rather constant.

Results (displacements are in pixels, given as (x, y))

Result 1: Train

Result 1

R:(32, 85), G:(7, 42)

Alignment function: Standard Pyramid

Result 2: Church

Result 2

R:(-4, 58), G:(4, 25)

Alignment function: Standard Pyramid

Result 3: Sculpture

Result 3

R:(-27, 140), G:(-11, 33)

Alignment function: Standard Pyramid

Result 4: Harvesters

Result 4

R:(15, 123), G:(18, 51)

Alignment function: Edge Pyramid

Result 5: Emir

Result 5

R:(36, 96), G:(21, 42)

Alignment function: Edge Pyramid

Result 6: Tobolsk

Result 6

R:(3, 7), G:(3, 3)

Alignment function: JPG Align

Result 7: Cathedral

Result 7

R:(3, 12), G:(2, 5)

Alignment function: JPG Align

Result 8: Icon

Result 8

R:(23, 87), G:(18, 41)

Alignment function: Standard Pyramid

Result 9: Onion Church

Result 9

R:(37, 105), G:(26, 50)

Alignment function: Standard Pyramid

Result 10: Lady

Result 10

R:(14, 114), G:(8, 51)

Alignment function: Standard Pyramid

Result 11: Self Portrait

Result 11

R:(37, 175), G:(29, 77)

Alignment function: Standard Pyramid

Result 12: Three Generations

Result 12

R:(12, 108), G:(15, 51)

Alignment function: Standard Pyramid

Final Remarks

I have never made a website before, so I took some help from ChatGPT with the HTML code for presenting my images next to each other. Otherwise it was pretty straightforward, kind of similar to LaTeX. Everything else was done by myself, using the course literature. It was a very interesting first project, and I had a lot of fun doing it, although it was also a little frustrating. I hope you enjoyed the reading and the images.