This article was published as part of the Data Science Blogathon.
Introduction
I will discuss this topic in detail below.
Steps of linear regression
As the name suggests, the idea behind doing linear regression is that we should arrive at a linear equation that describes the relationship between the dependent and independent variables.
Step 1
Suppose we have a dataset where x is the independent variable and Y is a function of x (Y = f(x)). Then, using linear regression, we can form the following equation (the equation for the best fit line):
Y = mx + c
This is an equation of a straight line where m is the slope of the line and c is the intercept.
Step 2
Now, to derive the best fit line, we first assign random values to m and c and calculate the corresponding value of Y for a given x. This Y value is the output value.
Step 3
As linear regression is a supervised machine learning algorithm, we already know the actual value of Y (the dependent variable). Now that we have our calculated output value (let's represent it as ŷ), we can check whether our prediction is accurate or not.
In the case of linear regression, we calculate this error (residual) using the MSE method (mean squared error), and we call it the loss function:
The loss function can be written as:
L = (1/n) ∑ (Y − ŷ)²
Where n is the number of observations.
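The loss function above can be sketched in a few lines of plain Python (an illustrative helper, not a library API):

```python
# Mean squared error: average of the squared residuals (Y - ŷ)²
def mse_loss(y_true, y_pred):
    n = len(y_true)
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n

# Predictions that are off by exactly 1 everywhere give a loss of 1.0
print(mse_loss([1, 2, 3], [2, 3, 4]))  # 1.0
```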
Step 4
To achieve the best fit line, we have to minimize the value of the loss function.
To minimize the loss function, we use a technique called gradient descent.
Let's analyze how gradient descent works (although I will not delve into the details, since this is not the focus of this article).
Gradient descent
If we look at the formula for the loss function, 'mean squared error' means that the error is represented in second-order terms.
If we plot the loss function against the weights (in our equation the weights are m and c), we get a parabolic curve. Since our goal is to minimize the loss function, we have to reach the bottom of the curve.
To achieve this, we take the first-order derivative of the loss function with respect to the weights (m and c). Then we subtract from each weight its derivative multiplied by a learning rate (α). We keep repeating this step until we reach the minimum value (we call it the global minimum). In practice, we set a very small threshold (for example, 0.0001) and stop once the updates fall below it; if we did not set a threshold value, it could take forever to reach exactly zero.
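The update rule described above can be sketched as a toy gradient descent for y = m·x + c. The learning rate, threshold, and iteration cap are illustrative choices, not prescribed values:

```python
# Toy gradient descent minimizing the MSE loss for a straight line y = m*x + c
def fit_line(xs, ys, lr=0.01, threshold=1e-6, max_iter=100000):
    m, c = 0.0, 0.0  # initial weights (could also be random)
    n = len(xs)
    for _ in range(max_iter):
        preds = [m * x + c for x in xs]
        # Partial derivatives of the MSE loss with respect to m and c
        dm = (-2 / n) * sum(x * (y - p) for x, y, p in zip(xs, ys, preds))
        dc = (-2 / n) * sum(y - p for y, p in zip(ys, preds))
        m -= lr * dm  # subtract derivative scaled by the learning rate
        c -= lr * dc
        if abs(dm) < threshold and abs(dc) < threshold:
            break  # gradients are tiny: we are at (or near) the minimum
    return m, c

# Data lying exactly on y = 2x + 1, so we expect m ≈ 2 and c ≈ 1
m, c = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```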
Step 5
Once the loss function is minimized, we obtain the final equation for the best fit line and can predict the value of Y for any given X.
This is where linear regression ends, and we are just one step away from arriving at logistic regression.
Logistic regression
As I said before, logistic regression is fundamentally used to classify elements of a set into two groups (binary classification) by calculating the probability of each element of the set.
Steps of logistic regression
In logistic regression, we decide on a probability threshold. If the probability of a particular element is greater than the probability threshold, we classify that element into one group; otherwise, into the other.
Step 1
To calculate the binary separation, we first determine the best fit line by following the steps of linear regression.
Step 2
The regression line we get from linear regression is highly susceptible to outliers. Therefore, it will not do a good job of classifying two classes.
Instead, the predicted value is converted to a probability by feeding it to the sigmoid function.
The sigmoid equation:

σ(x) = 1 / (1 + e⁻ˣ)
As we can see in Fig. 2, we can feed any real number to the sigmoid function and it will return a value between 0 and 1.
Fig 2: Sigmoid curve (image taken from Wikipedia)
Therefore, if we feed the output value ŷ to the sigmoid function, it returns a probability value between 0 and 1.
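The sigmoid function from the equation above is a one-liner in Python:

```python
import math

# Logistic (sigmoid) function: maps any real number into the interval (0, 1)
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5, exactly at the midpoint
print(sigmoid(10))   # very close to 1
print(sigmoid(-10))  # very close to 0
```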
Step 3
Finally, the output value of the sigmoid function is converted to 0 or 1 (discrete values) according to the threshold value. As usual, we set the threshold value to 0.5. Thus, we obtain the binary classification.
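The thresholding step is just a comparison (sketched here with the usual 0.5 cutoff; the threshold is a modeling choice, not a fixed rule):

```python
# Convert a probability into a discrete class label via a threshold
def classify(probability, threshold=0.5):
    return 1 if probability >= threshold else 0

print(classify(0.73))  # 1
print(classify(0.21))  # 0
```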
Now that we have the basic idea of how linear regression and logistic regression are related, let's review the process with an example.
Example
Consider a problem in which we are provided with a data set containing the height and weight of a group of people. Our task is to predict the Weight for new entries in the Height column.
We can see that this is a regression problem, in which we will build a linear regression model. We will train the model with the height and weight values provided. Once the model is trained, we can predict the weight for a given unknown height value.
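A minimal sketch of this height-to-weight regression using scikit-learn; the data values are made up purely for illustration:

```python
from sklearn.linear_model import LinearRegression

# Toy data: weight = 0.8 * height - 70 exactly (invented for illustration)
heights = [[150], [160], [170], [180], [190]]  # cm
weights = [50, 58, 66, 74, 82]                 # kg

model = LinearRegression()
model.fit(heights, weights)

# Predict the weight for an unseen height value
predicted = model.predict([[175]])  # ≈ 70.0 for this toy data
```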
Fig 3: Linear regression
Now suppose we have an additional field Obesity and we have to classify whether a person is obese or not based on their provided height and weight. This is clearly a classification problem in which we have to segregate the dataset into two classes (obese and non-obese).
Then, for the new problem, we can again go through the steps of linear regression and construct a regression line. This time, the line will be based on two parameters, Height and Weight, and the regression line will fit between two sets of discrete values. Since this regression line is highly susceptible to outliers, it will not serve to classify the two classes.
To get a better classification, we will feed the output values of the regression line to the sigmoid function. The sigmoid function returns the probability of each output value of the regression line. Then, based on a predefined threshold value, we can easily classify the output into one of two classes: obese or non-obese.
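The whole pipeline is what scikit-learn's LogisticRegression does internally (linear combination, sigmoid, threshold). A sketch with invented height/weight/obesity values:

```python
from sklearn.linear_model import LogisticRegression

# Toy data, invented for illustration: [height_cm, weight_kg]
X = [[150, 80], [160, 90], [170, 95], [180, 60], [175, 65], [190, 80]]
y = [1, 1, 1, 0, 0, 0]   # 1 = obese, 0 = non-obese

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns the sigmoid probability;
# predict applies the 0.5 threshold to give the discrete class
proba = clf.predict_proba([[165, 92]])[0][1]
label = clf.predict([[165, 92]])[0]
```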
Finally, we can summarize the similarities and differences between these two models.
The Similarities Between Linear Regression and Logistic Regression
- Both linear regression and logistic regression are supervised machine learning algorithms.
- Both linear regression and logistic regression are parametric regression models; that is, both models use linear equations for predictions.
Those are all the similarities we have between these two models.
However, in terms of functionality, these two are completely different. Below are the differences.
Differences between linear regression and logistic regression
- Linear regression is used to handle regression problems, while logistic regression is used to handle classification problems.
- Linear regression provides a continuous output, while logistic regression provides a discrete output.
- The purpose of linear regression is to find the best fit line, while logistic regression is one step ahead and fits the values of the line to the sigmoid curve.
- The method for estimating the model in linear regression is mean squared error (least squares), while for logistic regression it is maximum likelihood estimation.
Note: When writing this article, I assumed that the reader is already familiar with the basic concept of linear regression and logistic regression. I hope this article explains the relationship between these two concepts.