CIFAR-10 Classification Using Keras Tutorial


The CIFAR-10 data set consists of 60000 32×32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

Recognizing photos from the CIFAR-10 collection is one of the most common problems in today's world of machine learning. I'm going to show you, step by step, how to build multi-layer artificial neural networks that recognize images from the CIFAR-10 set with an accuracy of about 80%, and how to visualize the results.

If you want to check out the whole project, just open it in your browser using PLON, our data science platform.

Building 4 and 6-layer Convolutional Neural Networks

To build our CNNs (convolutional neural networks) we will use Keras and introduce a few techniques common in newer deep learning models, such as the ReLU activation function and dropout.

Keras is an open-source neural network Python library that can run on top of other machine learning libraries such as TensorFlow, CNTK, or Theano. It allows for easy and fast prototyping, and supports convolutional networks, recurrent neural networks, and combinations of the two.


We will begin with a brief look at Keras, deep learning, and the CIFAR-10 collection. Then, step by step, we will build a 4-layer and a 6-layer neural network along with their visualizations, arriving at the classification accuracy of each with a graphical interpretation.

Finally, we will see the results and compare the two networks in terms of the accuracy and speed of training for each epoch.


The CIFAR-10 Dataset

The dataset is divided into five training batches and one test batch, each with 10000 images.

The test batch contains exactly 1000 randomly-selected images from each class.

The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another.

Between them, the training batches contain exactly 5000 images from each class.

You can download it from here.

Convolutional Neural Networks – The Code

First of all, we will be defining all of the classes and functions we will need:
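The original code listing is not reproduced here, but a minimal set of imports consistent with the rest of the tutorial might look like the sketch below. It assumes the standalone Keras package with a TensorFlow backend; in some setups the same names live under tensorflow.keras, and exact module paths vary slightly between Keras versions.

```python
import numpy as np

# Standalone Keras imports; in some installations these live under tensorflow.keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import SGD
from keras.utils import to_categorical

# Fix the random seed so the results are reproducible
seed = 7
np.random.seed(seed)
```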

As good practice suggests, we start by declaring our variables:

  • batch_size – the number of training examples in one forward/backward pass; the higher the batch size, the more memory you will need
  • num_classes – the number of classes in the CIFAR-10 data set
  • one epoch – one forward pass and one backward pass of all the training examples
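Put together, the declarations might look like this (the batch size of 32 and the 100 epochs are taken from the values used later in the text):

```python
# Hyperparameters used throughout the tutorial
batch_size = 32   # training examples per forward/backward pass
num_classes = 10  # CIFAR-10 has 10 classes
epochs = 100      # passes over the full training set
```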


Next, we can load the CIFAR-10 data set.
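Keras ships with a loader for CIFAR-10, so loading the data set is a single call (the first call downloads roughly 170 MB):

```python
from keras.datasets import cifar10

# Returns (training images, training labels), (test images, test labels)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print(x_train.shape)  # (50000, 32, 32, 3)
print(x_test.shape)   # (10000, 32, 32, 3)
```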

Print a figure with 10 images from the CIFAR-10 dataset.

Running the code creates a 5×2 plot of images showing an example from each class.
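A sketch of such plotting code, using matplotlib. The class_names list is an assumption (the standard CIFAR-10 label order), and for determinism this version shows the first training image of each class rather than random ones:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use plt.show() when running interactively
import matplotlib.pyplot as plt
from keras.datasets import cifar10

# Standard CIFAR-10 label order (assumed here)
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

(x_train, y_train), _ = cifar10.load_data()

# 5x2 grid: the first training image of each of the 10 classes
fig, axes = plt.subplots(5, 2, figsize=(4, 8))
for class_id, ax in enumerate(axes.flat):
    idx = np.argmax(y_train.flatten() == class_id)  # index of the first image of this class
    ax.imshow(x_train[idx])
    ax.set_title(class_names[class_id], fontsize=8)
    ax.axis("off")
fig.tight_layout()
fig.savefig("cifar10_examples.png")
```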

The pixel values are in the range of 0 to 255 for each of the red, green and blue channels.

It’s good practice to work with normalized data.

Because the input values are well understood, we can easily normalize them to the range 0 to 1 by dividing each value by the maximum value, 255.

Note that the data is loaded as integers, so we must cast it to floating-point values in order to perform the division.

The output variable for each image is an integer class label from 0 to 9; we one-hot encode it into a binary vector with a 1 in the position of the class and 0 elsewhere.
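The normalization and one-hot encoding described above can be sketched as:

```python
from keras.datasets import cifar10
from keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Cast the integer pixel values to floats, then normalize to the range 0-1
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# One-hot encode the integer labels 0-9 into 10-element binary vectors
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

print(y_train.shape)  # (50000, 10)
```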


Let’s start by defining a simple CNN model.

We will use a model with four convolutional layers followed by max pooling, with the network then flattened out to fully connected layers to make predictions:

  1. Convolutional input layer, 32 feature maps with a size of 3×3, a rectifier activation function
  2. Convolutional layer, 32 feature maps with a size of 3×3, a rectifier activation function
  3. Max pooling layer with size 2×2
  4. Dropout set to 25%
  5. Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
  6. Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
  7. Max pooling layer with size 2×2
  8. Dropout set to 25%
  9. Flatten layer
  10. Fully connected layer with 512 units and a rectifier activation function
  11. Dropout set to 50%
  12. Fully connected output layer with 10 units and a softmax activation function
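One way to express these 12 steps in Keras is sketched below. The padding choices are an assumption, since the original listing does not specify them:

```python
from keras import Input
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def build_4_layer_cnn(num_classes=10):
    model = Sequential()
    model.add(Input(shape=(32, 32, 3)))  # 32x32 RGB images
    model.add(Conv2D(32, (3, 3), padding="same", activation="relu"))
    model.add(Conv2D(32, (3, 3), activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, (3, 3), padding="same", activation="relu"))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(512, activation="relu"))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation="softmax"))
    return model
```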


A logarithmic loss function (categorical cross-entropy) is used with the stochastic gradient descent optimization algorithm, configured with a large momentum and weight decay, starting with a learning rate of 0.1.

Then we can fit this model with 100 epochs and a batch size of 32.
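A sketch of the compile step for the 4-layer model described above. Note that the optimizer argument is spelled learning_rate in newer Keras versions, while older ones used lr= and configured the weight decay via a decay= argument:

```python
from keras import Input
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import SGD

# The 4-layer model described above
model = Sequential([
    Input(shape=(32, 32, 3)),
    Conv2D(32, (3, 3), padding="same", activation="relu"),
    Conv2D(32, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Conv2D(64, (3, 3), padding="same", activation="relu"),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(512, activation="relu"),
    Dropout(0.5),
    Dense(10, activation="softmax"),
])

# SGD with a large momentum; older Keras also took decay=lr/epochs here
sgd = SGD(learning_rate=0.1, momentum=0.9)
model.compile(loss="categorical_crossentropy",  # logarithmic loss
              optimizer=sgd,
              metrics=["accuracy"])
```

Fitting is then a single call: `history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=100, batch_size=32)` — expect a long run on a CPU.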

The second variant is a 6-layer model:

  1. Convolutional input layer, 32 feature maps with a size of 3×3, a rectifier activation function
  2. Dropout set to 20%
  3. Convolutional layer, 32 feature maps with a size of 3×3, a rectifier activation function
  4. Max pooling layer with size 2×2
  5. Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
  6. Dropout set to 20%
  7. Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
  8. Max pooling layer with size 2×2
  9. Convolutional layer, 128 feature maps with a size of 3×3, a rectifier activation function
  10. Dropout set to 20%
  11. Convolutional layer, 128 feature maps with a size of 3×3, a rectifier activation function
  12. Max pooling layer with size 2×2
  13. Flatten layer
  14. Dropout set to 20%
  15. Fully connected layer with 1024 units and a rectifier activation function and a weight constraint of max norm set to 3
  16. Dropout set to 20%
  17. Fully connected output layer with 10 units and a softmax activation function
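These 17 steps can be sketched as follows. Again the padding choice is an assumption, and the max-norm weight constraint class is MaxNorm in recent Keras (maxnorm in some older versions):

```python
from keras import Input
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.constraints import MaxNorm

def build_6_layer_cnn(num_classes=10):
    model = Sequential()
    model.add(Input(shape=(32, 32, 3)))
    model.add(Conv2D(32, (3, 3), padding="same", activation="relu"))
    model.add(Dropout(0.2))
    model.add(Conv2D(32, (3, 3), padding="same", activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), padding="same", activation="relu"))
    model.add(Dropout(0.2))
    model.add(Conv2D(64, (3, 3), padding="same", activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, (3, 3), padding="same", activation="relu"))
    model.add(Dropout(0.2))
    model.add(Conv2D(128, (3, 3), padding="same", activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dropout(0.2))
    model.add(Dense(1024, activation="relu", kernel_constraint=MaxNorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(num_classes, activation="softmax"))
    return model
```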


In this section, we can visualize the model structure. For this, we can use a library by Piotr Migdał for investigating the architectures and parameters of sequential Keras models.

First variant for 4-layer:

Second variant for 6-layer:

After the training process, we can see the loss and accuracy on plots using the code below:
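A sketch of such a plotting helper, working from the History object that model.fit() returns. The accuracy metric is logged as "acc" in older Keras and "accuracy" in newer versions, so the helper checks for both:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

def plot_history(history):
    """Plot train/test loss and accuracy from a Keras History object."""
    # Older Keras logs "acc"/"val_acc"; newer ones "accuracy"/"val_accuracy"
    acc_key = "accuracy" if "accuracy" in history.history else "acc"
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(history.history["loss"], label="train")
    ax_loss.plot(history.history["val_loss"], label="test")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()
    ax_acc.plot(history.history[acc_key], label="train")
    ax_acc.plot(history.history["val_" + acc_key], label="test")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()
    fig.tight_layout()
    return fig
```

After training, calling plot_history(history) produces the two plots side by side.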

4-layer:

6-layer:

Running this example prints the classification accuracy and loss on the training and test datasets for each epoch.

After that, we can print the confusion matrix for our example with a graphical interpretation.

A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa).
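A small NumPy/matplotlib sketch of both steps, counting the matrix and drawing it. Here rows are actual classes and columns are predicted ones, i.e. the "vice versa" convention; scikit-learn's confusion_matrix could be used instead of the hand-rolled counter:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

def confusion_matrix(y_true, y_pred, num_classes=10):
    """cm[i, j] counts instances of actual class i predicted as class j."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def plot_confusion_matrix(cm, class_names):
    fig, ax = plt.subplots()
    im = ax.imshow(cm, cmap="Blues")
    ax.set_xticks(range(len(class_names)))
    ax.set_xticklabels(class_names, rotation=90)
    ax.set_yticks(range(len(class_names)))
    ax.set_yticklabels(class_names)
    ax.set_xlabel("predicted class")
    ax.set_ylabel("actual class")
    fig.colorbar(im)
    fig.tight_layout()
    return fig
```

With a trained model, `y_pred = model.predict(x_test).argmax(axis=1)` gives the predicted labels to feed in alongside the true ones.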

4-layer confusion matrix and visualization:

6-layer confusion matrix and visualization:


Comparison Accuracy [%] between 4-layer and 6-layer CNN

As we can see in the chart below, the best accuracy for the 4-layer CNN occurs between epochs 20 and 50, and for the 6-layer CNN between epochs 10 and 20.

Comparison time of learning process between 4-layer and 6-layer CNN

As we can see in the chart below, the neural network training time is considerably longer for a 6-layer network.

Summary

After working through this tutorial, you have learned:

  • What the Keras library is and how to use it
  • What deep learning is
  • How to use ready-made datasets
  • What convolutional neural networks (CNNs) are
  • How to build a CNN step by step
  • How the two models' results differ
  • The basics of machine learning
  • An introduction to artificial intelligence (AI)
  • What a confusion matrix is and how to visualize it

If you have any questions about the project or this post, please ask your question in the comments.

You can run the project in your browser or download it from GitHub.

Resources

  1. Official Keras Documentation
  2. About Keras on Wikipedia
  3. About Deep Learning on Wikipedia
  4. Tutorial by Dr. Jason Brownlee
  5. Tutorial by Parneet Kaur
  6. Tutorial by Giuseppe Bonaccorso
