Deep learning methods use data to train neural network algorithms to perform a variety of machine learning tasks, such as classifying different classes of objects. Convolutional neural networks (CNNs) are very powerful deep learning algorithms for image analysis. This article explains how to build, train, and evaluate convolutional neural networks.
You will also learn how to improve the network's ability to learn from data and how to interpret the training results. Deep learning has several applications, such as image processing and natural language processing, and it is also used in medical science, media and entertainment, autonomous cars, and more.
What is a CNN?
A CNN is a powerful algorithm for image processing. These algorithms are currently the best we have for automated image processing. Many companies use them to do things like identify the objects in an image.
Images contain RGB combination data. Matplotlib can be used to import an image into memory from a file. The computer does not see an image; all it sees is an array of numbers. Color images are stored in three-dimensional arrays. The first two dimensions correspond to the height and width of the image (the number of pixels). The last dimension corresponds to the red, green, and blue colors present in each pixel.
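As a quick illustration, the snippet below loads an image with Matplotlib and inspects its array shape; the file name sample.png is a placeholder, so substitute any image on disk.

import matplotlib.pyplot as plt

# 'sample.png' is a placeholder file name; use any image on disk
img = plt.imread('sample.png')

# For a color image this prints (height, width, channels),
# e.g. (480, 640, 3) for an RGB image
print(img.shape)

# The pixel values themselves are just numbers in the array
print(img[0, 0])  # the color values of the top-left pixel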
The three layers of a CNN
Convolutional neural networks are specialized for image and video recognition applications. CNNs are mainly used in image analysis tasks such as image recognition, object detection, and segmentation.
There are three types of layers in convolutional neural networks:
1) Convolutional layer: in a typical neural network, each input neuron is connected to the next hidden layer. In a CNN, only a small region of the input layer neurons connects to the neurons of the hidden layer.
2) Pooling layer: the pooling layer is used to reduce the dimensionality of the feature map. There will be multiple activation and pooling layers inside the hidden part of a CNN.
3) Fully connected layer: fully connected layers form the last layers in the network. The input to the fully connected layer is the output of the final pooling or convolutional layer, which is flattened and then fed into the fully connected layer. A minimal sketch of these three layer types follows this list.
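The sketch below stacks the three layer types in Keras. It is only an illustration of the structure, not the model built later in this article; the filter counts and shapes are arbitrary choices.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

sketch = Sequential()
# Convolutional layer: each neuron looks at a small region of the input
sketch.add(Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))
# Pooling layer: reduces the dimensionality of the feature maps
sketch.add(MaxPooling2D(pool_size=(2, 2)))
# Fully connected layer: the flattened feature maps feed the classifier
sketch.add(Flatten())
sketch.add(Dense(10, activation='softmax'))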
MNIST data set
In this article, we will work on object recognition in image data using the MNIST data set for handwritten digit recognition.
The MNIST dataset consists of digit images from a variety of scanned documents. Each image is a 28 x 28 pixel square. In this dataset, 60,000 images are used to train the model and 10,000 images to test it. There are 10 digits (0 to 9), or 10 classes to predict.
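You can confirm these numbers for yourself by loading the dataset and printing the array shapes:

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape)  # (60000, 28, 28) -> 60,000 training images of 28 x 28 pixels
print(X_test.shape)   # (10000, 28, 28) -> 10,000 test images
print(set(y_train))   # {0, 1, ..., 9} -> the 10 classes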
Loading the MNIST dataset
Install the TensorFlow library and import the dataset as a training and test dataset.
Then plot a sample image from the dataset:

!pip install tensorflow

from keras.datasets import mnist
import matplotlib.pyplot as plt

(X_train, y_train), (X_test, y_test) = mnist.load_data()
plt.imshow(X_train[9], cmap=plt.get_cmap('gray'))
plt.show()
Output:
Deep learning model with multilayer perceptrons using MNIST
In this model, we will create a simple neural network model with a single hidden layer for the MNIST dataset for handwritten digit recognition.
A perceptron is a single-neuron model that is the building block of larger neural networks. The multilayer perceptron consists of three layers, namely the input layer, the hidden layer, and the output layer. The hidden layer is not visible to the outside world; only the input layer and the output layer are visible. For all deep learning models, the data must be numerical in nature.
Step 1: import key libraries

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
Step 2: reshape the data
Each image is 28 x 28 in size, so there are 784 pixels. Accordingly, the output layer has 10 outputs, the hidden layer has 784 neurons, and the input layer has 784 inputs. The dataset is then converted to the float data type.

number_pix = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], number_pix).astype('float32')
X_test = X_test.reshape(X_test.shape[0], number_pix).astype('float32')
Step 3: normalize the data
Neural network models generally require scaled data. In this code snippet, the data is normalized from (0-255) to (0-1), and the target variable is one-hot encoded for further analysis. The target variable has a total of 10 classes (0-9).

X_train = X_train / 255
X_test = X_test / 255
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_train.shape[1]
print(num_classes)
Output:
10
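For intuition, here is what the one-hot encoding produced by to_categorical looks like for a single label (a small standalone demo, not part of the pipeline above):

from keras.utils import np_utils

# the label 3 becomes a 10-element vector with a 1 in position 3
print(np_utils.to_categorical([3], num_classes=10))
# [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]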
Now we will create the nn_model function and compile the model.
Step 4: define the model function

def nn_model():
    model = Sequential()
    model.add(Dense(number_pix, input_dim=number_pix, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
There are two layers: a hidden layer with the ReLU activation function and an output layer that uses the softmax function.
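To see why softmax is suited to classification, here is a minimal NumPy sketch of the function (the input scores are made-up numbers): it turns arbitrary scores into values that sum to 1, like class probabilities.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))        # approximately [0.659 0.242 0.099]
print(softmax(scores).sum())  # the outputs sum to 1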
Step 5: run the model

model = nn_model()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
score = model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%' % (100 - score[1] * 100))
Output:

Epoch 1/10
300/300 - 11s - loss: 0.2778 - accuracy: 0.9216 - val_loss: 0.1397 - val_accuracy: 0.9604
Epoch 2/10
300/300 - 2s - loss: 0.1121 - accuracy: 0.9675 - val_loss: 0.0977 - val_accuracy: 0.9692
Epoch 3/10
300/300 - 2s - loss: 0.0726 - accuracy: 0.9790 - val_loss: 0.0750 - val_accuracy: 0.9778
Epoch 4/10
300/300 - 2s - loss: 0.0513 - accuracy: 0.9851 - val_loss: 0.0656 - val_accuracy: 0.9796
Epoch 5/10
300/300 - 2s - loss: 0.0376 - accuracy: 0.9892 - val_loss: 0.0717 - val_accuracy: 0.9773
Epoch 6/10
300/300 - 2s - loss: 0.0269 - accuracy: 0.9928 - val_loss: 0.0637 - val_accuracy: 0.9797
Epoch 7/10
300/300 - 2s - loss: 0.0208 - accuracy: 0.9948 - val_loss: 0.0600 - val_accuracy: 0.9824
Epoch 8/10
300/300 - 2s - loss: 0.0153 - accuracy: 0.9962 - val_loss: 0.0581 - val_accuracy: 0.9815
Epoch 9/10
300/300 - 2s - loss: 0.0111 - accuracy: 0.9976 - val_loss: 0.0631 - val_accuracy: 0.9807
Epoch 10/10
300/300 - 2s - loss: 0.0082 - accuracy: 0.9985 - val_loss: 0.0609 - val_accuracy: 0.9828
The error is: 1.72%
The model results show that the accuracy improves as the number of epochs increases. The error is 1.72%; the lower the error, the higher the accuracy of the model.
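To interpret an individual prediction rather than the aggregate error, you can run the trained model on a single test image (a quick sketch using the model and data from the steps above); taking the argmax over the softmax output picks the most probable digit.

import numpy as np

# model.predict returns one probability per class for each input image
probs = model.predict(X_test[:1])           # shape (1, 10)
print('Predicted digit:', np.argmax(probs[0]))
print('True digit:', np.argmax(y_test[0]))  # y_test is one-hot encoded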
Convolutional neural network model using MNIST
In this section, we will create a simple CNN model for MNIST that demonstrates convolutional layers, pooling layers, and dropout layers.
Step 1: import all necessary libraries

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
Step 2: set the seed for reproducibility and load the MNIST data

seed = 10
np.random.seed(seed)
(X_train, y_train), (X_test, y_test) = mnist.load_data()
Step 3: reshape the data and convert it to float values

# reshape to (samples, height, width, channels), the channels-last
# format that the TensorFlow backend expects
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')
Step 4: normalize the data

X_train = X_train / 255
X_test = X_test / 255
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_train.shape[1]
print(num_classes)
A classic CNN architecture looks like the one below:
Output layer (10 outputs)
|
Hidden layer (128 neurons)
|
Flatten layer
|
Dropout layer (20%)
|
Max pooling layer (2 × 2)
|
Convolutional layer (32 maps, 5 × 5)
|
Visible layer (28 x 28 x 1)
The first hidden layer is a convolutional layer, Conv2D. It has 32 feature maps of size 5 × 5 and a rectifier (ReLU) activation function. This is the input layer. Next comes the pooling layer, which takes the maximum value, called MaxPooling2D. In this model, it is configured with a pool size of 2 × 2.
Regularization occurs in the dropout layer. It is set to randomly exclude 20% of the layer's neurons to avoid overfitting. The fifth layer is the flattening layer, called Flatten, which converts the 2D matrix data into a vector. This allows the output to be processed by a standard fully connected layer.
Then a fully connected layer with 128 neurons and a rectifier activation function is used. Finally, the output layer has 10 neurons for the 10 classes and a softmax activation function to generate probability-like predictions for each class.
Step 5: run the model

def cnn_model():
    model = Sequential()
    model.add(Conv2D(32, (5, 5), padding='same', input_shape=(28, 28, 1), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = cnn_model()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
score = model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%' % (100 - score[1] * 100))
Output:

Epoch 1/10
300/300 - 2s - loss: 0.7825 - accuracy: 0.7637 - val_loss: 0.3071 - val_accuracy: 0.9069
Epoch 2/10
300/300 - 1s - loss: 0.3505 - accuracy: 0.8908 - val_loss: 0.2192 - val_accuracy: 0.9336
Epoch 3/10
300/300 - 1s - loss: 0.2768 - accuracy: 0.9126 - val_loss: 0.1771 - val_accuracy: 0.9426
Epoch 4/10
300/300 - 1s - loss: 0.2392 - accuracy: 0.9251 - val_loss: 0.1508 - val_accuracy: 0.9537
Epoch 5/10
300/300 - 1s - loss: 0.2164 - accuracy: 0.9325 - val_loss: 0.1423 - val_accuracy: 0.9546
Epoch 6/10
300/300 - 1s - loss: 0.1997 - accuracy: 0.9380 - val_loss: 0.1279 - val_accuracy: 0.9607
Epoch 7/10
300/300 - 1s - loss: 0.1856 - accuracy: 0.9415 - val_loss: 0.1179 - val_accuracy: 0.9632
Epoch 8/10
300/300 - 1s - loss: 0.1777 - accuracy: 0.9433 - val_loss: 0.1119 - val_accuracy: 0.9642
Epoch 9/10
300/300 - 1s - loss: 0.1689 - accuracy: 0.9469 - val_loss: 0.1093 - val_accuracy: 0.9667
Epoch 10/10
300/300 - 1s - loss: 0.1605 - accuracy: 0.9493 - val_loss: 0.1053 - val_accuracy: 0.9659
The error is: 3.41%
The model results show that the accuracy improves as the number of epochs increases. The error is 3.41%; the lower the error, the higher the accuracy of the model.
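To go beyond the final error figure, you can capture the History object that model.fit returns and plot how training and validation accuracy evolve across epochs. This is a minimal sketch assuming the CNN model above; note that older Keras versions use the keys 'acc' and 'val_acc' instead of 'accuracy' and 'val_accuracy'.

import matplotlib.pyplot as plt

# capture the History object returned by model.fit
history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    epochs=10, batch_size=200, verbose=2)

# plot training vs. validation accuracy across epochs
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()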
I hope you enjoyed reading, and feel free to use my code and test it for your own purposes. Also, if you have any comments on the code or the blog post, feel free to contact me at [email protected]