A few weeks ago, I was having a discussion about the possibility of detecting the colour of user uploaded images. The idea was that if a dominant colour was known, it could be used to create matching titles. I decided to give it a go using Python and Pillow.
Naively I tried averaging all the colours. This didn’t work so well.
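The naive averaging idea can be sketched in a few lines with Pillow and numpy (the function name here is mine, not from the original code):

```python
import numpy as np
from PIL import Image

def average_colour(image):
    """Average every pixel's RGB channels into a single colour."""
    # Flatten the image to an (N, 3) array and take the per-channel mean.
    pixels = np.asarray(image.convert("RGB"), dtype=float)
    return tuple(int(round(c)) for c in pixels.reshape(-1, 3).mean(axis=0))
```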
Here is the result on one of my photos from Paris.
The green/blue/grey colour is the average of all the pixels. While I can see how this colour is the average (green and blue with bits of grey), it doesn’t represent a dominant colour of the image.
I thought a better approach could be to determine the most commonly occurring colour.
This worked surprisingly well. Pillow's image.getcolors() method conveniently returns a list of (count, colour) tuples: every colour in the image, paired with the number of times it occurs.
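A minimal sketch of this approach (the function name is illustrative, not the author's):

```python
from PIL import Image

def most_frequent_colour(image):
    """Return the colour that occurs most often in the image."""
    rgb = image.convert("RGB")
    # getcolors() returns None if the image holds more colours than
    # maxcolors, so allow for one distinct colour per pixel.
    counts = rgb.getcolors(maxcolors=rgb.width * rgb.height)
    # Each entry is a (count, colour) pair; pick the largest count.
    return max(counts, key=lambda pair: pair[0])[1]
```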
This is the result on the same image.
The selected colour nicely matches the image.
The most-frequent-colour method starts to run into problems when it's run against photos that are over- or under-exposed.
This is a photo I took in the Western Australian Wheatbelt while I was on a climbing trip. Although it's not too bad, the distant sun has caused a small area to be overexposed.
Yep, white is the most common colour in this image. The overexposed sun clipped only a small patch of pixels to pure white, but that patch is enough to make white the most frequent colour in the image.
A more robust approach is K-Means clustering. The algorithm works by separating the pixels into K groups (clusters) of similarly coloured pixels.
This post by Charles Leifer explains the process well.
To understand this implementation of the algorithm, you need to grasp that an RGB colour value is really just a point in 3D space. Once you understand how this relates to a clustering algorithm, the rest is fairly simple.
First, pick K random pixels from your data set. Use these as the initial centroids for each cluster. You then repeat the following steps until an exit condition is satisfied.
1. Assign each pixel to the cluster with the closest centroid.
2. Calculate a new centroid for each cluster by averaging all of its assigned pixels.
3. Repeat from step 1, reassigning pixels based on the new centroids.
After a number of iterations the centroids begin to stabilise, which makes a good exit condition. In my experience this often happens within only a few iterations.
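The steps above can be sketched as a generic K-Means loop in numpy. This is not the author's exact implementation; the min_distance name mirrors the parameter mentioned in the results below, and here it acts as the convergence threshold on centroid movement:

```python
import numpy as np

def kmeans(pixels, k=3, min_distance=2.0, rng=None):
    """Cluster an (N, 3) array of RGB pixels and return the k centroids."""
    rng = np.random.default_rng(rng)
    # 1. Pick k random pixels as the initial centroids.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    while True:
        # 2. Assign each pixel to its nearest centroid.
        distances = np.linalg.norm(pixels[:, None] - centroids[None], axis=2)
        labels = distances.argmin(axis=1)
        # 3. Move each centroid to the mean of its assigned pixels
        #    (leaving a centroid in place if its cluster is empty).
        new_centroids = np.array([
            pixels[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        # Exit once every centroid has stabilised.
        if np.linalg.norm(new_centroids - centroids, axis=1).max() < min_distance:
            return new_centroids
        centroids = new_centroids
```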
The algorithm can be pretty slow. I have used numpy arrays where possible, although I'm sure further optimisations are possible. A quick profile shows that most of the time is spent in the
calcDistance() method. Look into k-d trees or SciPy's kmeans function if you really want to speed things up.
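One optimisation worth trying before reaching for k-d trees: compute every pixel-to-centroid distance in a single vectorised numpy call rather than a Python-level loop. This is a sketch of the idea, not the calcDistance() method from the project:

```python
import numpy as np

def nearest_centroids(pixels, centroids):
    """Index of the closest centroid for every pixel, fully vectorised."""
    # (N, 1, 3) - (1, K, 3) broadcasts to an (N, K, 3) difference array.
    diffs = pixels[:, None, :] - centroids[None, :, :]
    # Squared distances are enough for argmin, so skip the sqrt entirely.
    return np.einsum("nkc,nkc->nk", diffs, diffs).argmin(axis=1)
```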
These samples were run with k=3 and min_distance=2.0.
The results are usually quite good, although because the starting centroids are chosen randomly, the algorithm can return slightly different results each time.
The Eiffel Tower test image.
K-Means cluster map
Some cows I found near Bunbury.
The overexposed test image
The results from K-Means seem to be pretty good. For some photos you will need to tweak K a bit to get the right number of clusters.
If you want to take this further, there are algorithms which can determine the ideal number of clusters to use. It's also likely that better results could be achieved in a different colour space, like HSV or Lab.
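For example, Pillow can convert an image to HSV directly, so feeding HSV pixels into the clustering step is a small change. A hedged sketch (helper name is mine):

```python
import numpy as np
from PIL import Image

def pixels_in_hsv(image):
    """Flatten an image into an (N, 3) array of HSV values for clustering."""
    # Pillow scales H, S and V into the 0-255 range in "HSV" mode.
    return np.asarray(image.convert("HSV"), dtype=float).reshape(-1, 3)
```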
This project is also on GitHub: https://github.com/ZeevG/python-dominant-image-colour