Style Classification using Transfer Learning

Perform artist style classification on a subset of the Painter by Numbers dataset available at https://www.kaggle.com/c/painter-by-numbers by applying two transfer-learning techniques: feature extraction and fine-tuning.

OVERVIEW

Artist identification is traditionally performed by art historians and curators with expertise in different artists and styles of art. It is a complex and interesting problem for computers because identifying an artist requires more than object or face detection; artists can paint a wide variety of objects and scenes. Moreover, many artists from the same period share similar styles, and some artists painted in multiple styles or changed their style over time. Previous work has attempted to identify artists by explicitly defining their differentiating characteristics as features [1]. Instead of hand-crafting features, we use transfer learning, in which a model trained on one dataset is adapted to our dataset of choice.

OBJECTIVES

    1. Train neural networks using transfer learning to obtain better artist identification performance compared to traditional SVM classification
    2. Explore and visualize the learned feature representation for identifying artists

METHODS

We develop and train five CNN architectures using transfer learning for artist identification. We consider two types of transfer learning: a feature-extraction-based method and a fine-tuning-based method. In both cases we start from networks pre-trained on the ImageNet dataset and adapt them to our data.

Fine-tuning of AlexNet, VGG16, and ResNet18: Fine-tuning aims to adapt the existing filters to our data without moving the parameters far from their pre-trained values. We start with a pre-trained network to test whether a feature representation learned on ImageNet is a valuable starting point for artist identification. Some artists, such as Renaissance painters, depicted lifelike scenes containing the kinds of shapes and objects one would expect to find in ImageNet. Other artists, such as the Cubists, did not paint scenes that directly represent the real world.

Feature extraction + SVM on AlexNet and VGG: Here we use the base network purely as a feature extractor. We run the images through the pre-trained network and take the outputs of intermediate layer(s) as a feature representation of the image. These features are then classified with an SVM. All models are implemented in PyTorch, and all experiments were run on an Amazon Web Services p2.xlarge instance (4 vCPUs, 60 GB of storage).

CONCLUSIONS

The fine-tuned networks outperformed the traditional feature-extraction approach. Extensive hyperparameter tuning maximized performance for the ResNet pre-trained on ImageNet, and extracting features before the last fully connected layer and classifying them separately yielded lower performance than fine-tuning. The high probability mass along the diagonal of the confusion matrix reflects the high classification accuracy. Saliency maps further show that the network does not focus on a single area of an image when classifying it.

 

REFERENCES

[1] J. Li, L. Yao, E. Hendriks, and J. Z. Wang. Rhythmic brushstrokes distinguish van Gogh from his contemporaries: Findings via automated brushstroke extraction. IEEE Trans. Pattern Anal. Mach. Intell., 2012.

 
