Confusion matrix for machine learning

Confusion matrix: Not so confusing!

Have you been in a situation where you expected your machine learning model to perform really well, but it returned poor accuracy? You've done all the hard work, so where did the classification model go wrong? How can you correct this?

There are plenty of ways to measure the performance of your classification model, but none has stood the test of time like the confusion matrix. It helps us evaluate how our model performed, where it went wrong, and offers us guidance to correct our path.

In this post, we'll explore how a confusion matrix gives a holistic view of your model's performance. And despite its name, you will see that a confusion matrix is a fairly simple yet powerful concept. So let's unravel the mystery around the confusion matrix!

This is what we will cover:

  • What is a confusion matrix?
    • True positive
    • True negative
    • False positive: Type 1 error
    • False negative: Type 2 error
  • Why do we need a confusion matrix?
  • Precision vs. Recall
  • F1 score
  • Confusion matrix using Scikit-learn in Python
  • Confusion matrix for multi-class classification

What is a confusion matrix?

The million-dollar question: what, after all, is a confusion matrix?

A confusion matrix is an N x N matrix used to evaluate the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.

For a binary classification problem, we would have a 2 x 2 matrix as shown below, with 4 values:

                        Actual Positive    Actual Negative
  Predicted Positive          TP                 FP
  Predicted Negative          FN                 TN

Let's decipher the matrix:

  • The target variable has two values: Positive or Negative
  • The columns represent the actual values of the target variable
  • The rows represent the predicted values of the target variable

But wait, what are TP, FP, FN and TN here? That is the crucial part of a confusion matrix. Let's understand each term below.

Understanding True Positive, True Negative, False Positive and False Negative in a confusion matrix

True positive (TP)

  • Predicted value matches actual value
  • The actual value was positive and the model predicted a positive value.

True negative (TN)

  • Predicted value matches actual value
  • The actual value was negative and the model predicted a negative value.

False positive (FP): Type 1 error

  • The predicted value was falsely predicted
  • The actual value was negative, but the model predicted a positive value
  • Also known as a Type 1 error

False negative (FN): Type 2 error

  • The predicted value was falsely predicted
  • The actual value was positive, but the model predicted a negative value
  • Also known as a Type 2 error

Let me give you an example to understand this better. Suppose we have a classification dataset with 1000 data points. We fit a classifier on it and get the following confusion matrix:

[Figure: example confusion matrix with TP = 560, TN = 330, FP = 60, FN = 50]

The different values of the confusion matrix would be as follows:

  • True positive (TP) = 560; meaning 560 positive-class data points were correctly classified by the model
  • True negative (TN) = 330; meaning 330 negative-class data points were correctly classified by the model
  • False positive (FP) = 60; meaning the model incorrectly classified 60 negative-class data points as belonging to the positive class
  • False negative (FN) = 50; meaning the model incorrectly classified 50 positive-class data points as belonging to the negative class

This turned out to be a pretty decent classifier for our dataset, considering the relatively large number of true positive and true negative values.
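
If you'd like to see these definitions in code, here is a minimal sketch that counts the four values from a pair of hypothetical binary label lists (the labels below are made up purely for illustration):

```python
# Minimal sketch: counting TP, TN, FP, FN from hypothetical binary labels
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # predicted positive, actually positive
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # predicted negative, actually negative
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # predicted positive, actually negative
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # predicted negative, actually positive

print(tp, tn, fp, fn)  # 3 3 1 1
```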

Remember Type 1 and Type 2 errors. Interviewers love to ask the difference between these two! You can prepare for all of this with our Online Machine Learning Course.

Why do we need a confusion matrix?

Before answering this question, let's think about a hypothetical classification problem.

Suppose you want to predict how many people are infected with a contagious virus before they show symptoms, and isolate them from the healthy population (sound familiar? 😷). The two values of our target variable would be: Sick and Not Sick.

Now, you must be wondering: why do we need a confusion matrix when we have our all-weather friend, accuracy? Well, let's see where accuracy falls short.

Our dataset is an example of an imbalanced dataset. There are 947 data points for the negative class and 3 data points for the positive class. This is how we calculate accuracy:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Let's see how our model worked:

[Figure: confusion matrix for our example]

The resulting values are:

TP = 30, TN = 930, FP = 30, FN = 10

Then the accuracy of our model turns out to be:

Accuracy = (30 + 930) / (30 + 930 + 30 + 10) = 960 / 1000 = 96%

96%! Not bad!
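
As a quick sanity check, here is a minimal sketch that reproduces that number from the counts above:

```python
# Accuracy computed from the confusion matrix counts in our example
tp, tn, fp, fn = 30, 930, 30, 10

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.96
```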

But it is giving the wrong idea about the result. Think about it.

Our model is saying, “I can predict sick people 96% of the time”. However, it is doing the opposite. It is predicting the people who will not get sick with 96% accuracy while the sick are spreading the virus!

Do you think this is a correct metric for our model, given the severity of the problem? Shouldn't we be measuring how many positive cases we can predict correctly to stop the spread of the contagious virus? Or, out of the cases predicted as positive, how many are actually positive, to verify the reliability of our model?

This is where we come across the dual concepts of Precision and Recall.

Precision vs. Recall

Precision tells us how many of the cases predicted as positive actually turned out to be positive.

Here is how to calculate Precision:

Precision = TP / (TP + FP)

This would determine whether our model is reliable or not.

Recall tells us how many of the actual positive cases we were able to correctly predict with our model.

And this is how we can calculate Recall:

Recall = TP / (TP + FN)

We can easily calculate Precision and Recall for our model by plugging the values into the equations above:

Precision = 30 / (30 + 30) = 0.5
Recall = 30 / (30 + 10) = 0.75

50% of the cases predicted as positive turned out to be actually positive, while our model successfully identified 75% of the actual positive cases. Awesome!
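
The same numbers, as a minimal code sketch using the counts from our example:

```python
# Precision and Recall computed from the confusion matrix counts in our example
tp, fp, fn = 30, 30, 10

precision = tp / (tp + fp)
recall    = tp / (tp + fn)
print(precision, recall)  # 0.5 0.75
```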

Precision is a useful metric in cases where false positives are a greater concern than false negatives.

Precision is important in music or video recommendation systems, e-commerce websites, etc., where wrong results could lead to customer churn and hurt the business.

Recall is a useful metric in cases where false negatives are a greater concern than false positives.

Recall is important in medical cases, where it doesn't matter whether we raise a false alarm, but actual positive cases should not go undetected!

In our example, Recall would be a better metric because we do not want to accidentally discharge an infected person and let them mix with the healthy population, thereby spreading the contagious virus. Now you can see why accuracy was a bad metric for our model.

But there will be cases where there is no clear distinction between whether Precision is more important or Recall. What should we do in those cases? We combine them!

F1 score

In practice, when we try to increase the precision of our model, the recall goes down, and vice versa. The F1 score captures both trends in a single value:

F1 score = 2 × (Precision × Recall) / (Precision + Recall)

The F1 score is the harmonic mean of Precision and Recall, so it gives a combined idea of these two metrics. It is maximum when Precision is equal to Recall.
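
Plugging in the Precision and Recall values from our example, a minimal sketch of the calculation looks like this:

```python
# F1 score from the Precision and Recall values computed earlier
precision, recall = 0.5, 0.75

f1 = 2 * (precision * recall) / (precision + recall)
print(f1)  # 0.6
```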

But there is a catch here. The interpretability of the F1 score is poor. This means we don't know what our classifier is maximizing: precision or recall? So we use it in combination with other evaluation metrics, which gives us a complete picture of the result.

Confusion matrix using scikit-learn in Python

You already know the theory, so now let's put it into practice. Let's code a confusion matrix with the Scikit-learn (sklearn) library in Python.

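Here is a minimal sketch using hypothetical labels:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical actual and predicted labels for a binary classification problem
actual    = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

# In sklearn's output, rows are actual values and columns are predicted values
print(confusion_matrix(actual, predicted))

# Precision, recall and f1-score for each class, along with averaged metrics
print(classification_report(actual, predicted))
```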

Sklearn has two handy functions: confusion_matrix() and classification_report().

  • Sklearn's confusion_matrix() returns the values of the confusion matrix. However, the output is slightly different from what we have studied so far: it takes the rows as the actual values and the columns as the predicted values. The rest of the concept remains the same.
  • Sklearn's classification_report() outputs the precision, recall and f1-score for each target class. In addition to this, it also provides a few extra values: micro average, macro average, and weighted average.

Micro average is the precision/recall/f1-score calculated over all the classes.

Micro avg precision = (TP1 + TP2 + ... + TPn) / (TP1 + TP2 + ... + TPn + FP1 + FP2 + ... + FPn)

Macro average is the average of the precision/recall/f1-score computed for each class.

Macro avg precision = (Precision of class 1 + Precision of class 2 + ... + Precision of class n) / n

Weighted average is just the weighted average of the precision/recall/f1-score, where each class is weighted by its number of samples.
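
To see these averaging options in practice, here is a minimal sketch using sklearn's precision_score with its average parameter and some hypothetical multi-class labels:

```python
from sklearn.metrics import precision_score

# Hypothetical actual and predicted labels for a 3-class problem
actual    = [0, 1, 2, 2, 1, 0, 1, 2]
predicted = [0, 2, 2, 2, 1, 0, 0, 1]

print(precision_score(actual, predicted, average='micro'))     # global TP / (TP + FP)
print(precision_score(actual, predicted, average='macro'))     # unweighted mean of per-class precision
print(precision_score(actual, predicted, average='weighted'))  # per-class precision weighted by class size
```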

Confusion matrix for multi-class classification

How would a confusion matrix work for a multi-class classification problem? Well, don't scratch your head! We'll take a look at that here.

Let's draw a confusion matrix for a multi-class problem where we have to predict whether a person loves Facebook, Instagram or Snapchat. The confusion matrix would be a 3 x 3 matrix like this:

[Figure: 3 x 3 confusion matrix for the Facebook / Instagram / Snapchat example]

The true positive, true negative, false positive and false negative values for each class would be calculated by summing the appropriate cell values, as follows:

[Figure: TP, TN, FP and FN for each class, computed by summing cells of the 3 x 3 matrix]
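
In code, sklearn handles the multi-class case with the same functions; here is a minimal sketch with hypothetical labels for the three classes:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical actual and predicted labels for the three classes
actual    = ["Facebook", "Facebook", "Instagram", "Snapchat", "Snapchat", "Instagram", "Facebook", "Snapchat"]
predicted = ["Facebook", "Instagram", "Instagram", "Snapchat", "Facebook", "Instagram", "Facebook", "Snapchat"]

labels = ["Facebook", "Instagram", "Snapchat"]

# 3 x 3 matrix: rows are actual classes, columns are predicted classes
print(confusion_matrix(actual, predicted, labels=labels))

# Per-class precision, recall and f1-score, plus the averaged metrics
print(classification_report(actual, predicted, labels=labels))
```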

That's it! You are now ready to decipher any N x N confusion matrix!

Final notes

And suddenly, the confusion matrix is no longer so confusing! This post should give you a solid foundation on how to interpret and use a confusion matrix for classification algorithms in machine learning.

We will soon publish a post on the AUC-ROC curve and continue our discussion there. Until then, don't lose hope in your classification model; you may just be using the wrong evaluation metric!
