Photo/Art classifier

I have created a little program to determine if an image is a photograph or a piece of drawn artwork. It works, to some extent. It can't reliably determine whether a file is a photograph, as some styles of art have statistical properties surprisingly similar to a photograph, but it can accurately classify some artworks as such. Put another way, when classifying files as art it has a very high false negative rate but a very low false positive rate.

A product shot. These need to be handled a little differently.

In terms of usefulness, I can envision it being used by any site which is intended to share photos but not other image content. Its accuracy in that role would not be perfect, but it would be sufficient to flag images up for administrator attention.

The algorithm itself isn't too complicated. It just looks for areas of near-perfectly-flat color or slight, even gradient: a feature characteristic of some art styles (though not all), but almost never found in a photo. It then returns a 'photo-ness' measure: the fraction of the image (a float from 0 to 1) that does not consist of such near-perfect flatness, so a low value suggests an illustration. A simple morphological filter improves accuracy a little, but there's not much to it. There isn't much room to improve the basic concept. If you want more accuracy, the best approach would probably be to come up with several different measures, then construct a composite metric based on them all or feed them into a machine learning algorithm. Also, looking for EXIF information could probably identify a lot of photos without even examining the image.
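The EXIF idea can be sketched without decoding any pixels. This is only an illustration, not part of the program: has_exif() is a hypothetical helper that walks the marker segments of a JPEG and looks for an APP1 segment starting with the "Exif" identifier. It assumes a well-formed JPEG with no padding bytes between segments, and EXIF presence is only a hint, since editing software can write it and upload sites often strip it.

```python
import struct


def has_exif(data: bytes) -> bool:
    """Return True if a JPEG byte string contains an Exif APP1 segment.

    Hypothetical helper for illustration; not part of the original program.
    """
    if data[:2] != b"\xff\xd8":            # no SOI marker: not a JPEG
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:              # lost sync with the segment structure
            return False
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):         # start of scan / end of image
            return False
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length
    return False
```

A positive result could short-circuit the pixel analysis; a negative one says nothing either way, so the image would still go through the flatness test.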

The classification process runs through a few steps:
1. After loading the image, for each 3x3 neighbourhood, replace the center pixel color with the pixel color nearest in RGB space to the mean value, limited by a threshold. This step is optional - it's just a quite potent JPEG artifact remover. It's the VBM_closest_mean_filter() function from VBitmapFunctions.
2. Generate a blurred image using an iterated box blur - its mathematical properties are very similar to a true Gaussian, but it's faster and easier.
3. Check the values of four pixels - the ones four pixels in from each corner. If they almost match, this may be a 'product shot' - a photograph of an object on a pure background for display purposes. This doesn't matter just yet, but product shots need to be processed differently later.
4. Calculate a mask using a simple threshold at a very low value (3, by default) on the sum-of-differences between the blurred and unblurred images.
5. Apply the cellular filter from VBitmapFunctions to clear up the mask.
6a. If the image was not identified as a product shot, simply calculate what proportion of the image is not covered by the mask.
6b. If the image was identified as a product shot, likewise - but only count the proportion that isn't within a few color values of those corner pixels.
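
The steps above can be sketched in plain Python. This is a rough reconstruction under stated assumptions, not the original code: the optional step 1 is left out, the image is a list of rows of (r, g, b) tuples, erode() is a simple stand-in for the cellular filter from VBitmapFunctions (whose exact rule isn't given here), and the blur radius, pass count, inset and tolerance values are guesses. For product shots it assumes background-colored pixels are dropped from both the count and the total.

```python
def box_blur(img, radius=1, passes=3):
    """Iterated box blur; img is a list of rows of (r, g, b) tuples."""
    h, w = len(img), len(img[0])
    for _ in range(passes):
        out = []
        for y in range(h):
            # Clamp window coordinates so border pixels reuse the edge value.
            ys = [min(max(y + d, 0), h - 1) for d in range(-radius, radius + 1)]
            row = []
            for x in range(w):
                xs = [min(max(x + d, 0), w - 1) for d in range(-radius, radius + 1)]
                n = len(ys) * len(xs)
                row.append(tuple(sum(img[j][i][c] for j in ys for i in xs) / n
                                 for c in range(3)))
            out.append(row)
        img = out
    return img


def flat_mask(img, blurred, threshold=3):
    """True where the summed channel difference is at most `threshold`."""
    return [[sum(abs(a - b) for a, b in zip(p, q)) <= threshold
             for p, q in zip(row, brow)]
            for row, brow in zip(img, blurred)]


def erode(mask):
    """Assumed stand-in for the cellular filter: a pixel stays flat only if
    its four neighbours are flat (pixels outside the image count as flat)."""
    h, w = len(mask), len(mask[0])

    def flat(y, x):
        return not (0 <= y < h and 0 <= x < w) or mask[y][x]

    return [[flat(y, x) and flat(y - 1, x) and flat(y + 1, x)
             and flat(y, x - 1) and flat(y, x + 1)
             for x in range(w)] for y in range(h)]


def close(p, q, tolerance):
    """True if every channel of p is within `tolerance` of q."""
    return all(abs(a - b) <= tolerance for a, b in zip(p, q))


def photo_factor(img, threshold=3, inset=4, tolerance=8):
    """Fraction of counted pixels that are NOT in the flat mask."""
    h, w = len(img), len(img[0])
    mask = erode(flat_mask(img, box_blur(img), threshold))
    product_shot = False
    if min(h, w) > 2 * inset:
        corners = [img[y][x] for y in (inset, h - 1 - inset)
                   for x in (inset, w - 1 - inset)]
        product_shot = all(close(c, corners[0], tolerance) for c in corners)
    counted = unflat = 0
    for y in range(h):
        for x in range(w):
            # Assumption: product-shot background is ignored entirely.
            if product_shot and close(img[y][x], corners[0], tolerance):
                continue
            counted += 1
            unflat += not mask[y][x]
    return unflat / counted if counted else 0.0


# Two flat color fields side by side: only the band around the edge is unflat.
two_tone = [[(255, 0, 0)] * 6 + [(0, 0, 255)] * 6 for _ in range(12)]
print(photo_factor(two_tone))
```

The nested loops are slow on real images; they are written this way only to keep each step readable.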

It's not sophisticated enough to worry about licensing, so just play with the code and see what you can do. Fiddling with the blur radius (it's actually an iterated box blur, which is easier to write than a true Gaussian) and the detection threshold might give a slight improvement, but I feel that this approach alone isn't going to achieve good accuracy.

An example illustration. Note that it consists mostly of flat colors, but they aren't perfectly flat. There are quantisation bands from the slight gradients top-to-bottom and within the clouds. These pose no difficulty: the blur-and-difference function was chosen because it is unaffected by steady gradients, still classifying them as if they were flat colors. Also note that Twilight Sparkle remains Best Pony, regardless of how often Rainbow appears as a test image.
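
The gradient claim is easy to check in one dimension. The snippet below is a toy demonstration, not taken from the program: box_blur_1d() is a hypothetical single-pass blur, and the three test signals are made up. A symmetric box average leaves a steady ramp unchanged away from the borders, barely moves a ramp quantised into bands, and only reacts strongly to a hard edge.

```python
def box_blur_1d(values, radius=2):
    """Single-pass box blur over a list of numbers, clamped at the ends."""
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(i + d, 0), n - 1)]
                  for d in range(-radius, radius + 1)]
        out.append(sum(window) / len(window))
    return out


def blur_delta(values):
    """Absolute difference between each value and its blurred version."""
    return [abs(a - b) for a, b in zip(values, box_blur_1d(values))]


ramp = [2 * x for x in range(20)]        # steady gradient
banded = [x // 4 for x in range(40)]     # slow gradient quantised into bands
step = [0] * 10 + [100] * 10             # hard edge between two flat colors

ramp_delta = blur_delta(ramp)
banded_delta = blur_delta(banded)
step_delta = blur_delta(step)

# Skip the two border samples on each side, where clamping distorts the ramp.
print(max(ramp_delta[2:-2]), max(banded_delta), max(step_delta))
```

Even summed over three color channels, the banded ramp stays well under the default threshold of 3, while the hard edge is far above it.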


Using the blur-delta-threshold-erode filtering, flat areas of the image are revealed. Only 0.168 of the image is not classified as 'flat' - a number so low indicates a very high probability that this is an illustration. This is an image well-suited to classification by this means: traditional artistic mediums are more likely to be mistaken for a photo, but very few photos will be mistaken for illustrations by this method.

Testing on a small library of images demonstrates the accuracy of the program. It cannot reliably positively identify an image as a photograph, but it can positively identify an image as an illustration - very few photos will ever score below about 0.25.
./test_ill14.bmp, Photofactor: 0.167934
./test_ill10.bmp, Photofactor: 0.185325
./test_ill9.bmp, Photofactor: 0.201024
./test_ill4.bmp, Photofactor: 0.228443
./test_ill7.bmp, Photofactor: 0.243940
./test_ill6.bmp, Photofactor: 0.252553
./test_ill3.bmp, Photofactor: 0.256953
./test_pho12.bmp, Photofactor: 0.288751
./test_ill12.bmp, Photofactor: 0.316458
./test_pho8.bmp, Photofactor: 0.363802
./test_ill13.bmp, Photofactor: 0.379516
./test_ill.bmp, Photofactor: 0.382185
./test_pho3.bmp, Photofactor: 0.399334
./test_pho13.bmp, Photofactor: 0.442190
./test_pho4.bmp, Photofactor: 0.463278
./test_pho7.bmp, Photofactor: 0.464156
./test_pho5.bmp, Photofactor: 0.487123
./test_ill11.bmp, Photofactor: 0.498605
./test_ill5.bmp, Photofactor: 0.520663
./test_pho2.bmp, Photofactor: 0.611604
./test_pho11.bmp, Photofactor: 0.640047
./test_pho14.bmp, Photofactor: 0.654414
./test_pho1.bmp, Photofactor: 0.725961
./test_ill2.bmp, Photofactor: 0.830203
./test_pho9.bmp, Photofactor: 0.854278
./test_pho6.bmp, Photofactor: 0.867393
./test_ill8.bmp, Photofactor: 0.892881
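
Turning the listing above into a decision only needs a cut-off. A minimal sketch, assuming the output format shown above and the 0.25 cut-off mentioned earlier; parse_result() and looks_like_illustration() are hypothetical helper names, and the sample lines are copied from the listing.

```python
def parse_result(line):
    """Split './name.bmp, Photofactor: 0.123' into (path, score)."""
    path, _, tail = line.partition(", Photofactor: ")
    return path, float(tail)


def looks_like_illustration(score, cutoff=0.25):
    """Flag only low scores; anything above the cut-off stays undecided."""
    return score < cutoff


sample = [
    "./test_ill14.bmp, Photofactor: 0.167934",
    "./test_ill6.bmp, Photofactor: 0.252553",
    "./test_pho12.bmp, Photofactor: 0.288751",
    "./test_ill8.bmp, Photofactor: 0.892881",
]

flagged = [path for path, score in map(parse_result, sample)
           if looks_like_illustration(score)]
print(flagged)
```

Assuming the 'ill' and 'pho' filename prefixes mark illustrations and photos, a 0.25 cut-off over the full listing flags 5 of the 14 illustrations and none of the 13 photos, which matches the high false negative, low false positive behaviour described at the start.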