Decision tree algorithm for classification: machine learning 101


This article was published as part of the Data Science Blogathon.

Overview

  • Learn about the decision tree algorithm in machine learning for classification problems.
  • Here we cover entropy, information gain, and Gini impurity.

Decision tree algorithm

The decision tree is a supervised machine learning algorithm that can be used for both classification problems and regression problems.

The goal of this algorithm is to create a model that predicts the value of a target variable. To do so, the decision tree uses a tree representation of the problem, in which each leaf node corresponds to a class label and the attributes are represented by the internal nodes of the tree.

Let's take a sample data set to go further.

[Figure: a sample data set of 14 patients with attributes such as Age and Cholesterol and the target Drug]

Suppose we have a sample data set of 14 patients, and we have to predict which drug, A or B, to suggest for a patient.

Let's say we choose cholesterol as the first attribute to split the data on.

[Figure: the data split into High and Normal branches on Cholesterol]

It will divide our data into two branches, High and Normal, according to cholesterol, as you can see in the figure above.

Suppose our new patient has high cholesterol. From the above split of our data, we cannot tell whether Drug A or Drug B will be appropriate for the patient.

What's more, if the patient's cholesterol is normal, we still do not have enough information to determine whether Drug A or Drug B is suitable.

Let's take another attribute, age. As we can see, age has three categories: Young, Middle-aged, and Senior. Let's try to split on it.

[Figure: the data split on Age into Young, Middle-aged, and Senior branches]

From the figure above, we can now easily predict which drug to administer to a patient based on their reports.
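To make this concrete, here is a minimal sketch of fitting a decision tree to a toy drug data set like the one above, using scikit-learn (the library this article links to at the end). The rows below are invented for illustration; only the Age and Cholesterol attributes and the Drug target come from the example.

```python
# A minimal sketch: fitting a decision tree to a hypothetical drug data set.
# The rows below are invented to mirror the article's example.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Age":         ["Young", "Young", "Middle-aged", "Senior", "Senior", "Middle-aged"],
    "Cholesterol": ["High", "Normal", "High", "Normal", "High", "Normal"],
    "Drug":        ["A", "B", "B", "B", "A", "B"],
})

# scikit-learn trees need numeric inputs, so one-hot encode the
# categorical attributes first.
X = pd.get_dummies(data[["Age", "Cholesterol"]])
y = data["Drug"]

# criterion="entropy" makes the tree choose splits by information gain.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# Print the learned splits as indented text rules.
print(export_text(clf, feature_names=list(X.columns)))
```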

Assumptions we make when using the decision tree:

– In the beginning, we consider the entire training set as the root.

– Feature values are preferred to be categorical; if the values are continuous, they are discretized before building the model (see the sketch after this list).

– Records are distributed recursively on the basis of attribute values.

– We use a statistical method to decide which attribute to place as the root node or as an internal node.
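As a quick illustration of the discretization point in the list above, a continuous attribute such as a numeric age can be binned into categories before training. The bin edges and labels below are arbitrary choices for the sketch:

```python
# A minimal sketch of discretizing a continuous attribute before building
# the tree. Bin edges and labels are arbitrary, for illustration only.
import pandas as pd

ages = pd.Series([23, 35, 47, 52, 61, 19, 44])

# pd.cut maps each numeric age into one of three categorical buckets.
age_groups = pd.cut(ages, bins=[0, 30, 50, 120],
                    labels=["Young", "Middle-aged", "Senior"])
print(age_groups.tolist())
# ['Young', 'Middle-aged', 'Middle-aged', 'Senior', 'Senior', 'Young', 'Middle-aged']
```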

Math behind the decision tree algorithm: Before moving on to information gain, we first have to understand entropy.

Entropy: entropy is the measure of impurity, disorder, or uncertainty in a set of examples.

Purpose of entropy:

Entropy controls how a decision tree decides to split the data. It affects how a decision tree draws its boundaries.

"Entropy values ​​range from 0 a 1”, less the entropy value is more reliable.

The entropy formula:

$$E(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$$

where $p_{+}$ is the proportion of Yes (positive) examples and $p_{-}$ the proportion of No (negative) examples in the sample S.

[Figure: a tree with F1 as the root node splitting into child nodes F2 and F3]

Suppose we have characteristics F1, F2, and F3, and we select characteristic F1 as our root node.

F1 contains 9 Yes labels and 5 No labels. After splitting on F1, we get F2, which has 6 Yes / 2 No, and F3, which has 3 Yes / 3 No.

Now, let's calculate the entropy of F2 using the entropy formula.

Putting the values into the formula:

$$E(F2) = -\frac{6}{8} \log_2 \frac{6}{8} - \frac{2}{8} \log_2 \frac{2}{8} \approx 0.811$$

Here, 6 is the number of Yes labels, taken as positive since we are calculating the probability of a positive example, and 8 is the total number of rows present in F2.

In the same way, if we calculate the entropy for F3, we obtain 1 bit, the worst case for an attribute, since F3 contains 50% Yes and 50% No.
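As a sanity check on these numbers, here is a small sketch of the entropy calculation in code (the function name is my own):

```python
# A minimal sketch of the entropy formula used above.
from math import log2

def entropy(yes: int, no: int) -> float:
    """Entropy of a node containing `yes` positive and `no` negative examples."""
    total = yes + no
    result = 0.0
    for count in (yes, no):
        p = count / total
        if p > 0:  # treat 0 * log2(0) as 0
            result -= p * log2(p)
    return result

print(entropy(6, 2))  # F2: ~0.811 bits
print(entropy(3, 3))  # F3: exactly 1 bit, the 50/50 worst case
```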

This splitting will continue until we obtain a pure subset.

What is a pure subset?

A pure subset is a situation in which we get all Yes labels or all No labels in a node.

We have done this with respect to one node. But what if, after splitting on F2, we still require some other attribute to reach the leaf node? We then also have to take the entropy of those nodes and aggregate all those entropy values. For that, we have the concept of information gain.

Information gain: information gain is used to decide which feature to split on at each step of building the tree. Simplicity is best, so we want to keep our tree small. To do so, at each step we should choose the split that results in the purest child nodes. A commonly used measure of purity is called information.

For each node of the tree, the information value measures how much information a feature gives us about the class. The split with the highest information gain will be taken first, and the process will continue until all child nodes are pure or until the information gain is 0.

$$Gain(S, A) = E(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, E(S_v)$$

The algorithm calculates the information gain for each split, and the split that gives the highest information gain value is selected.

We can say that in information gain we calculate the weighted average of the entropies of the subsets produced by a specific split.

Sv = total sample in a subset after the split, as in F2 = 6 + 2 = 8

S = total sample before the split, as in F1 = 9 + 5 = 14

Now calculating the information gain:

$$E(S) = -\frac{9}{14} \log_2 \frac{9}{14} - \frac{5}{14} \log_2 \frac{5}{14} \approx 0.940$$

$$Gain(S, F1) = 0.940 - \frac{8}{14}(0.811) - \frac{6}{14}(1.0) \approx 0.048$$

In this way, the algorithm computes the information gain for n possible splits, and the split with the highest information gain is taken to build the decision tree.

The higher the information gain value of a split, the greater the probability that it will be selected.
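Putting the pieces together, here is a small sketch of the same gain calculation in code (function names are my own):

```python
# A minimal sketch of information gain for the F1 -> (F2, F3) split above.
from math import log2

def entropy(yes: int, no: int) -> float:
    total = yes + no
    return -sum(c / total * log2(c / total) for c in (yes, no) if c > 0)

def information_gain(parent, children):
    """parent: (yes, no) counts of the node; children: list of (yes, no) counts."""
    total = sum(parent)
    weighted = sum((y + n) / total * entropy(y, n) for y, n in children)
    return entropy(*parent) - weighted

# F1 holds 9 Yes / 5 No; its children F2 and F3 hold 6/2 and 3/3.
print(information_gain((9, 5), [(6, 2), (3, 3)]))  # ~0.048
```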

Gini impurity:

Gini impurity is a measure used when building decision trees to determine how the features of a data set should split the nodes that form the tree. More precisely, the Gini impurity of a data set is a number between 0 and 0.5 that indicates the probability that new, random data would be misclassified if it were given a random class label according to the class distribution in the data set.

Entropy vs. Gini impurity

The maximum entropy value is 1, while the maximum Gini impurity value is 0.5.
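To see the difference between the two ranges, here is a quick sketch comparing both measures on a binary node (function names are my own):

```python
# A minimal sketch comparing entropy and Gini impurity on a binary node.
from math import log2

def entropy(p: float) -> float:
    """Entropy of a node where a fraction p of the examples is positive."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def gini(p: float) -> float:
    """Gini impurity of the same node: 1 - p^2 - (1 - p)^2."""
    return 1 - p**2 - (1 - p)**2

for p in (0.0, 0.25, 0.5):
    print(f"p={p}: entropy={entropy(p):.3f}, gini={gini(p):.3f}")
# Both peak at p=0.5: entropy reaches 1.0 while Gini reaches 0.5.
```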

Since the Gini impurity does not involve logarithms, it is also slightly cheaper to compute than entropy, which is why scikit-learn uses it as its default splitting criterion.

In this article, we have covered a lot of detail about the decision tree: how it works, the math behind it, and attribute selection measures such as entropy, information gain, and Gini impurity, along with their formulas and how the machine learning algorithm uses them.

At this stage, I hope you have got an idea of the decision tree, one of the best machine learning algorithms for solving classification problems.

If you are new to this, I advise you to learn these techniques, understand their implementation, and then use them in your models.

For a better understanding, see https://scikit-learn.org/stable/modules/tree.html.

The media shown in this article is not the property of Analytics Vidhya and is used at the author's discretion.
