Image processing with CNN | Beginner's Guide to Image Processing

This article was published as part of the Data Science Blogathon

Deep learning methods use data to train neural network algorithms to perform a variety of machine learning tasks, such as classifying different classes of objects. Convolutional neural networks (CNNs) are very powerful deep learning algorithms for image analysis. This article explains how to build, train and evaluate convolutional neural networks.

You will also learn how to improve the model's ability to learn from data and how to interpret training results. Deep learning has several applications such as image processing, natural language processing, etc. It is also used in medical sciences, media and entertainment, autonomous cars, etc.

What is CNN?

A CNN is a powerful algorithm for image processing. These algorithms are currently among the best we have for automated image processing. Many companies use them to do things like identifying objects in an image.

Images are stored as combinations of RGB values. Matplotlib can be used to import an image from a file into memory. The computer does not see an image; all it sees is an array of numbers. Color images are stored in three-dimensional arrays. The first two dimensions correspond to the height and width of the image (the number of pixels). The last dimension corresponds to the red, green and blue values present in every pixel.
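As a quick illustration, here is a minimal sketch of reading an image into an array with Matplotlib; the file name 'sample.jpg' is only a placeholder for any image on disk.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

img = mpimg.imread('sample.jpg')  # 'sample.jpg' is a placeholder file name
print(img.shape)                  # e.g. (height, width, 3) for an RGB image
print(img[0, 0])                  # red, green and blue values of the top-left pixel
plt.imshow(img)
plt.show()

The printed shape confirms that the first two dimensions are the pixel grid and the last dimension holds the three color channels.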

Three layers of CNN

Convolutional neural networks are specialized for image and video recognition applications. CNNs are mainly used in image analysis tasks such as image recognition, object detection and segmentation.

There are three types of layers in convolutional neural networks:

1) Convolutional layer: in a typical neural network, each input neuron is connected to the next hidden layer. In a CNN, only a small region of the input layer neurons connects to the neurons of the hidden layer.

2) Pooling layer: the pooling layer is used to reduce the dimensionality of the feature map. There will be multiple activation and pooling layers inside the hidden part of a CNN.

3) Fully connected layer: fully connected layers form the last layers in the network. The input to the fully connected layer is the output of the final pooling or convolutional layer, which is flattened and then fed into the fully connected layer. A minimal sketch of these three layer types in Keras is shown below.
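This sketch uses arbitrary layer sizes purely for illustration; it is not the model built later in this article.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))  # convolutional layer
model.add(MaxPooling2D(pool_size=(2, 2)))                                  # pooling layer
model.add(Flatten())
model.add(Dense(10, activation='softmax'))                                 # fully connected layer
model.summary()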

[Figure: a typical CNN architecture. Source: Google Images]

MNIST data set

In this article, we will work on object recognition in image data using the MNIST data set for handwritten digit recognition.

The MNIST dataset consists of digit images from a variety of scanned documents. Each image is a 28 x 28 pixel square. In this dataset, 60,000 images are used to train the model and 10,000 images to test the model. There are 10 digits (0 to 9), or 10 classes, to predict.
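These figures can be checked directly; a minimal sketch using the copy of MNIST that ships with Keras (the same loader used in the next section):

import numpy as np
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape)       # (60000, 28, 28): 60,000 training images of 28 x 28 pixels
print(X_test.shape)        # (10000, 28, 28): 10,000 test images
print(np.unique(y_train))  # [0 1 2 3 4 5 6 7 8 9]: the 10 digit classes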

[Figure: sample handwritten digit images from the MNIST dataset. Source: Google Images]


Loading the MNIST dataset

Install the TensorFlow library and import the dataset as training and test sets. Then plot a sample image from the training data.

!pip install tensorflow
from keras.datasets import mnist
import matplotlib.pyplot as plt

# Load the MNIST training and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Display one training image in grayscale
plt.imshow(X_train[9], cmap=plt.get_cmap('gray'))
plt.show()

Output:

[Plot of the sample digit X_train[9] displayed in grayscale]

Deep learning model with multilayer perceptrons using MNIST

In this section, we will create a simple neural network model with a single hidden layer for handwritten digit recognition on the MNIST dataset.

A perceptron is a single neuron model that is the building block of larger neural networks. The multilayer perceptron consists of three layers, namely the input layer, the hidden layer and the output layer. The hidden layer is not visible to the outside world; only the input layer and the output layer are visible. For all DL models, the data must be numerical in nature.

Step 1: import key libraries

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

Step 2: reshape the data

Each image is 28 x 28 in size, so there are 784 pixels. Thus, the output layer has 10 outputs, the hidden layer has 784 neurons and the input layer has 784 inputs. The dataset is then converted to a float data type.

# Flatten each 28 x 28 image into a 784-element vector of float values
number_pix=X_train.shape[1]*X_train.shape[2]
X_train=X_train.reshape(X_train.shape[0], number_pix).astype('float32')
X_test=X_test.reshape(X_test.shape[0], number_pix).astype('float32')

Step 3: normalize the data

NN models generally require scaled data. In this code snippet, the data is normalized from the range (0-255) to (0-1), and the target variable is one-hot encoded for further analysis. The target variable has a total of 10 classes (0-9).

# Scale pixel values to the range [0, 1]
X_train=X_train/255
X_test=X_test/255
# One-hot encode the target labels
y_train= np_utils.to_categorical(y_train)
y_test= np_utils.to_categorical(y_test)
num_classes=y_train.shape[1]
print(num_classes)

Output:

10

Now, we will define the model in a function nn_model and compile it.

Step 4: define the model function

def nn_model():
    model=Sequential()
    model.add(Dense(number_pix, input_dim=number_pix, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

There are two layers: a hidden layer with the ReLU activation function and an output layer that uses the softmax function.

Step 5: run the model

# Train for 10 epochs and report the classification error on the test set
model=nn_model()
model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=10, batch_size=200, verbose=2)
score= model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%'%(100-score[1]*100))

Output:

Epoch 1/10
300/300 - 11s - loss: 0.2778 - accuracy: 0.9216 - val_loss: 0.1397 - val_accuracy: 0.9604
Epoch 2/10
300/300 - 2s - loss: 0.1121 - accuracy: 0.9675 - val_loss: 0.0977 - val_accuracy: 0.9692
Epoch 3/10
300/300 - 2s - loss: 0.0726 - accuracy: 0.9790 - val_loss: 0.0750 - val_accuracy: 0.9778
Epoch 4/10
300/300 - 2s - loss: 0.0513 - accuracy: 0.9851 - val_loss: 0.0656 - val_accuracy: 0.9796
Epoch 5/10
300/300 - 2s - loss: 0.0376 - accuracy: 0.9892 - val_loss: 0.0717 - val_accuracy: 0.9773
Epoch 6/10
300/300 - 2s - loss: 0.0269 - accuracy: 0.9928 - val_loss: 0.0637 - val_accuracy: 0.9797
Epoch 7/10
300/300 - 2s - loss: 0.0208 - accuracy: 0.9948 - val_loss: 0.0600 - val_accuracy: 0.9824
Epoch 8/10
300/300 - 2s - loss: 0.0153 - accuracy: 0.9962 - val_loss: 0.0581 - val_accuracy: 0.9815
Epoch 9/10
300/300 - 2s - loss: 0.0111 - accuracy: 0.9976 - val_loss: 0.0631 - val_accuracy: 0.9807
Epoch 10/10
300/300 - 2s - loss: 0.0082 - accuracy: 0.9985 - val_loss: 0.0609 - val_accuracy: 0.9828
The error is: 1.72%

The model results show that as the number of epochs increases, the accuracy improves. The error is 1.72%; the lower the error, the higher the accuracy of the model.

Convolutional neural network model using MNIST

In this section, we will create a simple CNN model for MNIST that demonstrates convolutional layers, pooling layers and dropout layers.

Step 1: import all necessary libraries

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D

Step 2: set the seed for reproducibility and load the MNIST data

seed=10
np.random.seed(seed)
(X_train,y_train), (X_test, y_test)= mnist.load_data()

Step 3: reshape the data and convert it to float values

# Reshape to (samples, channels, rows, cols): a channels-first layout with one grayscale channel
X_train=X_train.reshape(X_train.shape[0], 1,28,28).astype('float32')
X_test=X_test.reshape(X_test.shape[0], 1,28,28).astype('float32')

Step 4: normalize the data

X_train=X_train/255
X_test=X_test/255
y_train= np_utils.to_categorical(y_train)
y_test= np_utils.to_categorical(y_test)
num_classes=y_train.shape[1]
print(num_classes)

A classic CNN architecture for this problem looks like the following, from the input layer to the output layer:

- Visible (input) layer: 1 x 28 x 28
- Convolutional layer: 32 feature maps, 5 x 5
- Max pooling layer: 2 x 2
- Dropout layer: 20%
- Flatten layer
- Hidden layer: 128 neurons
- Output layer: 10 outputs

The first hidden layer is a convolutional layer, Conv2D. It has 32 feature maps of size 5 x 5 and a ReLU activation function, and it receives the input images. Next comes the pooling layer, MaxPooling2D, which takes the maximum value; in this model it is configured with a pool size of 2 x 2.

Regularization is applied in the dropout layer, which is set to randomly exclude 20% of the neurons in the layer to avoid overfitting. The next layer is the Flatten layer, which converts the 2D matrix data into a vector; this allows the output to be processed by a standard fully connected layer.

Then, a fully connected layer with 128 neurons and a rectifier (ReLU) activation function is used. Finally, the output layer has 10 neurons for the 10 classes and a softmax activation function to generate probability-like predictions for each class.

Step 5: define and run the model

def cnn_model():
    model=Sequential()
    # The (1, 28, 28) input shape is channels-first, matching the reshape in Step 3
    model.add(Conv2D(32, (5,5), padding='same', input_shape=(1,28,28),
                     activation='relu', data_format='channels_first'))
    model.add(MaxPooling2D(pool_size=(2,2), padding='same', data_format='channels_first'))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model=cnn_model()
model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=10, batch_size=200, verbose=2)
score= model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%'%(100-score[1]*100))

Output:

Epoch 1/10
300/300 - 2s - loss: 0.7825 - accuracy: 0.7637 - val_loss: 0.3071 - val_accuracy: 0.9069
Epoch 2/10
300/300 - 1s - loss: 0.3505 - accuracy: 0.8908 - val_loss: 0.2192 - val_accuracy: 0.9336
Epoch 3/10
300/300 - 1s - loss: 0.2768 - accuracy: 0.9126 - val_loss: 0.1771 - val_accuracy: 0.9426
Epoch 4/10
300/300 - 1s - loss: 0.2392 - accuracy: 0.9251 - val_loss: 0.1508 - val_accuracy: 0.9537
Epoch 5/10
300/300 - 1s - loss: 0.2164 - accuracy: 0.9325 - val_loss: 0.1423 - val_accuracy: 0.9546
Epoch 6/10
300/300 - 1s - loss: 0.1997 - accuracy: 0.9380 - val_loss: 0.1279 - val_accuracy: 0.9607
Epoch 7/10
300/300 - 1s - loss: 0.1856 - accuracy: 0.9415 - val_loss: 0.1179 - val_accuracy: 0.9632
Epoch 8/10
300/300 - 1s - loss: 0.1777 - accuracy: 0.9433 - val_loss: 0.1119 - val_accuracy: 0.9642
Epoch 9/10
300/300 - 1s - loss: 0.1689 - accuracy: 0.9469 - val_loss: 0.1093 - val_accuracy: 0.9667
Epoch 10/10
300/300 - 1s - loss: 0.1605 - accuracy: 0.9493 - val_loss: 0.1053 - val_accuracy: 0.9659
The error is: 3.41%

The model results show that as the number of epochs increases, the accuracy improves. The error is 3.41%; the lower the error, the higher the model accuracy.
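Once trained, the model can also be used for inference. As noted above, the softmax output layer produces a probability-like score for each of the 10 classes, and the predicted digit is the class with the highest score. A minimal sketch, assuming the training code above has already been run:

import numpy as np

# Probability-like scores for the first five test images (shape: (5, 10))
probs = model.predict(X_test[:5])
print(np.argmax(probs, axis=1))       # predicted digits
print(np.argmax(y_test[:5], axis=1))  # true digits (y_test was one-hot encoded earlier)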

I hope you enjoyed reading, and feel free to use my code and test it for your own purposes. Also, if you have any comments on the code or the blog post, feel free to contact me at [email protected]

Media shown in this CNN image processing article is not the property of DataPeaker and is used at the author's discretion.
