Image Classification Using CNN: Python Implementation

This article was published as part of the Data Science Blogathon.

Hi guys! In this blog, I will discuss everything about image classification.

In recent years, Deep Learning has proven to be a very powerful tool due to its ability to handle large amounts of data. Its use of hidden layers outperforms traditional techniques, especially for pattern recognition. One of the most popular deep neural network architectures is the convolutional neural network (CNN).

A convolutional neural network (CNN) is a kind of artificial neural network (ANN) used in image recognition and processing that is specially designed to process pixel data.

Before moving on, we must understand what a neural network is. Let's go…

Neural Network:

A neural network is built from several interconnected nodes called “neurons”. Neurons are arranged in an input layer, hidden layers, and an output layer. The input layer corresponds to our predictors/features and the output layer to our response variables.

Multilayer Perceptron (MLP):

A neural network with an input layer, one or more hidden layers, and an output layer is called a multilayer perceptron (MLP). The perceptron was invented by Frank Rosenblatt in 1957. For example, an MLP might have five input nodes, five hidden nodes spread across two hidden layers, and one output node.

How does a neural network work?

– The neurons of the input layer receive incoming information from the data, process it, and distribute it to the hidden layers.

– That information is, in turn, processed by the hidden layers and passed on to the output neurons.

– The information in this artificial neural network (ANN) is processed in terms of an activation function. This function actually mimics the neurons in the brain.

– Each neuron has an activation function and a threshold value.

– The threshold value is the minimum value that the input must reach for the neuron to activate.

– The task of the neuron is to perform a weighted sum of all the input signals and apply the activation function to the sum before passing it to the next layer (hidden or output).

Let's understand what the weighted sum is.

Let's say we have values a₁, a₂, a₃, a₄ as inputs and weights w₁, w₂, w₃, w₄ feeding into one of the hidden-layer neurons, say nⱼ. Then the weighted sum is represented as

Sⱼ = Σ (i = 1 to 4) wᵢ · aᵢ + bⱼ

where bⱼ is the bias of node nⱼ.
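
To make the weighted sum concrete, here is a minimal Python sketch of a single neuron's computation; all input, weight, and bias values are hypothetical:

import numpy as np

a = np.array([0.5, 1.0, 0.2, 0.8])   # inputs a1..a4 (hypothetical)
w = np.array([0.4, 0.3, 0.9, 0.1])   # weights w1..w4 (hypothetical)
b_j = 0.5                            # bias of node n_j (hypothetical)

S_j = np.dot(w, a) + b_j             # weighted sum of inputs plus bias
print(S_j)                           # 0.2 + 0.3 + 0.18 + 0.08 + 0.5 = 1.26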

What are activation functions?

These functions are necessary to introduce non-linearity into the network. The activation function is applied and its output is passed to the next layer.

Possible functions:

• Sigmoid: the sigmoid function is differentiable. It produces an output between 0 and 1.

• Hyperbolic tangent: the hyperbolic tangent is also differentiable. It produces an output between -1 and 1.

• ReLU: ReLU is the most popular activation function and is widely used in deep learning.

• Softmax: the softmax function is used for multi-class classification problems. It is a generalization of the sigmoid function. It also produces outputs between 0 and 1.
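
As a quick illustration, here is a minimal NumPy sketch of these four functions; the input vector is just an example:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))      # output in (0, 1)

def tanh(x):
    return np.tanh(x)                # output in (-1, 1)

def relu(x):
    return np.maximum(0, x)          # zeroes out negative values

def softmax(x):
    e = np.exp(x - np.max(x))        # shift for numerical stability
    return e / e.sum()               # outputs in (0, 1) that sum to 1

x = np.array([-2.0, 0.0, 3.0])
print(sigmoid(x), tanh(x), relu(x), softmax(x))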

Now, let's get back to our CNN topic…

CNN:

Now imagine there is a picture of a bird, and you want to identify whether it really is a bird or something else. The first thing to do is feed the image pixels, in the form of arrays, to the input layer of the neural network (MLP networks are used to classify such things). The hidden layers carry out feature extraction by performing various calculations and operations. There are several hidden layers, such as convolution, ReLU, and pooling layers, that perform feature extraction from your image. Finally, there is a fully connected layer that identifies the exact object in the image.

Convolution:

The convolution operation involves matrix arithmetic, and each image is represented as an array of values (pixels).

Let's understand with an example:

a = [2,5,8,4,7,9]

b = [1,2,3]

In the convolution operation, the arrays are multiplied element-wise, and the products are summed to create a new array that represents a * b.

The first three elements of array a are multiplied by the elements of array b. The products are summed to obtain the result, which is stored in the new a * b array.

The window then slides along, and the process continues until the operation is complete.
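
Here is a minimal NumPy sketch of this sliding multiply-and-sum, using the arrays a and b from above:

import numpy as np

a = np.array([2, 5, 8, 4, 7, 9])
b = np.array([1, 2, 3])

# Slide b along a: multiply each window element-wise and sum the products
result = np.array([np.sum(a[i:i + len(b)] * b)
                   for i in range(len(a) - len(b) + 1)])
print(result)   # [36 33 37 45], e.g. 2*1 + 5*2 + 8*3 = 36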

Pooling:

After the convolution, there is another operation called pooling. In the pipeline, convolution and pooling are applied sequentially to the data in order to extract features from it. After the sequential convolutional and pooling layers, the data is flattened and fed into a feed-forward neural network, which is also called a multilayer perceptron.
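
As an illustration, here is a minimal NumPy sketch of 2x2 max pooling on a small, made-up feature map:

import numpy as np

# A hypothetical 4x4 feature map
feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 0],
                        [3, 4, 1, 8]])

# Split into 2x2 blocks and keep the maximum of each block
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 4]
                #  [7 9]]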

So far, we've seen the concepts that are important for building our CNN model.

Now we'll move forward to see a CNN case study.

1) Here we import the necessary libraries required to perform the CNN tasks.

import numpy as np
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import tensorflow as tf
tf.compat.v1.set_random_seed(2019)   # fix the random seed for reproducibility

2) Here we build the CNN model with the following code.

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation="relu", input_shape=(180,180,3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(128, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(550, activation="relu"),      # Adding the hidden layers
    tf.keras.layers.Dropout(0.1, seed=2019),
    tf.keras.layers.Dense(400, activation="relu"),
    tf.keras.layers.Dropout(0.3, seed=2019),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dropout(0.4, seed=2019),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dropout(0.2, seed=2019),
    tf.keras.layers.Dense(5, activation="softmax")      # Adding the output layer
])

A convolved image may be too large, so it is shrunk without losing its features or patterns; this is what pooling does.

Here, we create the neural network by initializing it with the Keras Sequential model.

Flatten (): flattening transforms a two-dimensional array of features into a vector of features.
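
For intuition, here is a tiny sketch of what flattening does to a 2x2 feature map:

import numpy as np

fm = np.array([[1, 2],
               [3, 4]])
print(fm.flatten())   # [1 2 3 4] - the 2-D array becomes a 1-D feature vector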

3) Now let's look at a summary of the CNN model.

model.summary()

It will print the following output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 178, 178, 16)      448       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 89, 89, 16)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 87, 87, 32)        4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 43, 43, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 41, 41, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 20, 20, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 18, 18, 128)       73856     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 9, 9, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 10368)             0         
_________________________________________________________________
dense (Dense)                (None, 550)               5702950   
_________________________________________________________________
dropout (Dropout)            (None, 550)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 400)               220400    
_________________________________________________________________
dropout_1 (Dropout)          (None, 400)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 300)               120300    
_________________________________________________________________
dropout_2 (Dropout)          (None, 300)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 200)               60200     
_________________________________________________________________
dropout_3 (Dropout)          (None, 200)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 5)                 1005      
=================================================================
Total params: 6,202,295
Trainable params: 6,202,295
Non-trainable params: 0
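
If you want to verify the summary yourself, a couple of the parameter counts can be reproduced by hand:

# conv2d: each of the 16 filters has a 3*3*3 kernel plus 1 bias
print((3 * 3 * 3 + 1) * 16)     # 448
# dense: 10368 flattened inputs * 550 units + 550 biases
print(10368 * 550 + 550)        # 5702950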

4) Now we need to specify the optimizer.

from tensorflow.keras.optimizers import RMSprop, SGD, Adam
adam = Adam(learning_rate=0.001)
model.compile(optimizer=adam, loss="categorical_crossentropy", metrics=['acc'])

The optimizer is used to reduce the cost calculated by the cross-entropy loss.

The loss function is used to calculate the error.

The metrics term is used to measure the performance of the model.
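
The imports above also make it easy to experiment with other optimizers; for example, SGD with momentum (the hyperparameters here are only illustrative):

sgd = SGD(learning_rate=0.001, momentum=0.9)
model.compile(optimizer=sgd, loss="categorical_crossentropy", metrics=['acc'])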

5) In this step, we will see how to configure the data directory and generate image data.

bs = 30         # Setting batch size
train_dir = "D:/Data Science/Image Datasets/FastFood/train/"   # Setting training directory
validation_dir = "D:/Data Science/Image Datasets/FastFood/test/"   # Setting testing directory
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255.
train_datagen = ImageDataGenerator(rescale=1.0/255.)
test_datagen = ImageDataGenerator(rescale=1.0/255.)
# Flow training images in batches of 30 using the train_datagen generator.
# flow_from_directory lets the classifier infer the labels directly from the
# names of the directories the images lie in.
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=bs,
                                                    class_mode="categorical",
                                                    target_size=(180,180))
# Flow validation images in batches of 30 using the test_datagen generator
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        batch_size=bs,
                                                        class_mode="categorical",
                                                        target_size=(180,180))

The output will be:

Found 1465 images belonging to 5 classes.
Found 893 images belonging to 5 classes.
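
If you want to check which label index was assigned to each class, the generator exposes a class_indices mapping (the class names below are only placeholders; they depend on your directory names):

print(train_generator.class_indices)
# e.g. {'burger': 0, 'fries': 1, 'pizza': 2, 'sandwich': 3, 'taco': 4}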

6) Final step: fitting the model.

history = model.fit(train_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=150 // bs,   # 5 steps per epoch
                    epochs=30,
                    validation_steps=50 // bs,   # 1 validation step per epoch
                    verbose=2)

The output will be:

Epoch 1/30
5/5 - 4s - loss: 0.8625 - acc: 0.6933 - val_loss: 1.1741 - val_acc: 0.5000
Epoch 2/30
5/5 - 3s - loss: 0.7539 - acc: 0.7467 - val_loss: 1.2036 - val_acc: 0.5333
Epoch 3/30
5/5 - 3s - loss: 0.7829 - acc: 0.7400 - val_loss: 1.2483 - val_acc: 0.5667
Epoch 4/30
5/5 - 3s - loss: 0.6823 - acc: 0.7867 - val_loss: 1.3290 - val_acc: 0.4333
Epoch 5/30
5/5 - 3s - loss: 0.6892 - acc: 0.7800 - val_loss: 1.6482 - val_acc: 0.4333
Epoch 6/30
5/5 - 3s - loss: 0.7903 - acc: 0.7467 - val_loss: 1.0440 - val_acc: 0.6333
Epoch 7/30
5/5 - 3s - loss: 0.5731 - acc: 0.8267 - val_loss: 1.5226 - val_acc: 0.5000
Epoch 8/30
5/5 - 3s - loss: 0.5949 - acc: 0.8333 - val_loss: 0.9984 - val_acc: 0.6667
Epoch 9/30
5/5 - 3s - loss: 0.6162 - acc: 0.8069 - val_loss: 1.1490 - val_acc: 0.5667
Epoch 10/30
5/5 - 3s - loss: 0.7509 - acc: 0.7600 - val_loss: 1.3168 - val_acc: 0.5000
Epoch 11/30
5/5 - 4s - loss: 0.6180 - acc: 0.7862 - val_loss: 1.1918 - val_acc: 0.7000
Epoch 12/30
5/5 - 3s - loss: 0.4936 - acc: 0.8467 - val_loss: 1.0488 - val_acc: 0.6333
Epoch 13/30
5/5 - 3s - loss: 0.4290 - acc: 0.8400 - val_loss: 0.9400 - val_acc: 0.6667
Epoch 14/30
5/5 - 3s - loss: 0.4205 - acc: 0.8533 - val_loss: 1.0716 - val_acc: 0.7000
Epoch 15/30
5/5 - 4s - loss: 0.5750 - acc: 0.8067 - val_loss: 1.2055 - val_acc: 0.6000
Epoch 16/30
5/5 - 4s - loss: 0.4080 - acc: 0.8533 - val_loss: 1.5014 - val_acc: 0.6667
Epoch 17/30
5/5 - 3s - loss: 0.3686 - acc: 0.8467 - val_loss: 1.0441 - val_acc: 0.5667
Epoch 18/30
5/5 - 3s - loss: 0.5474 - acc: 0.8067 - val_loss: 0.9662 - val_acc: 0.7333
Epoch 19/30
5/5 - 3s - loss: 0.5646 - acc: 0.8138 - val_loss: 0.9151 - val_acc: 0.7000
Epoch 20/30
5/5 - 4s - loss: 0.3579 - acc: 0.8800 - val_loss: 1.4184 - val_acc: 0.5667
Epoch 21/30
5/5 - 3s - loss: 0.3714 - acc: 0.8800 - val_loss: 2.0762 - val_acc: 0.6333
Epoch 22/30
5/5 - 3s - loss: 0.3654 - acc: 0.8933 - val_loss: 1.8273 - val_acc: 0.5667
Epoch 23/30
5/5 - 3s - loss: 0.3845 - acc: 0.8933 - val_loss: 1.0199 - val_acc: 0.7333
Epoch 24/30
5/5 - 3s - loss: 0.3356 - acc: 0.9000 - val_loss: 0.5168 - val_acc: 0.8333
Epoch 25/30
5/5 - 3s - loss: 0.3612 - acc: 0.8667 - val_loss: 1.7924 - val_acc: 0.5667
Epoch 26/30
5/5 - 3s - loss: 0.3075 - acc: 0.8867 - val_loss: 1.0720 - val_acc: 0.6667
Epoch 27/30
5/5 - 3s - loss: 0.2820 - acc: 0.9400 - val_loss: 2.2798 - val_acc: 0.5667
Epoch 28/30
5/5 - 3s - loss: 0.3606 - acc: 0.8621 - val_loss: 1.2423 - val_acc: 0.8000
Epoch 29/30
5/5 - 3s - loss: 0.2630 - acc: 0.9000 - val_loss: 1.4235 - val_acc: 0.6333
Epoch 30/30
5/5 - 3s - loss: 0.3790 - acc: 0.9000 - val_loss: 0.6173 - val_acc: 0.8000

The call above trains the neural network on the training set and evaluates its performance on the test set. It returns two metrics for each epoch, ‘acc’ and ‘val_acc’, which are the accuracy of the predictions obtained on the training set and the accuracy achieved on the test set, respectively.
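
To visualize these two curves, you can plot the history object returned by model.fit() with matplotlib, which we imported earlier; a minimal sketch:

plt.plot(history.history['acc'], label='train acc')
plt.plot(history.history['val_acc'], label='val acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()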

Conclusion:

Therefore, we see that the model achieved sufficient accuracy. Nevertheless, anyone can try to improve it by increasing the number of epochs or tuning other parameters.

I hope you liked my article. Share it with your friends and colleagues.

The media shown in this article is not the property of DataPeaker and is used at the author's discretion.
