Let's understand linear regression: the OLS method


I hope you are doing great. Today, we will discuss everything you need to know about the linear regression algorithm. I can assure you that after going through this blog, you will be fully equipped with linear regression as a tool in your toolkit. So without any further delay, let's get started.

What is linear regression?

Linear regression is a supervised machine learning algorithm used to solve regression problems, which are problems where, for a given data input, the model outputs a continuous predicted value. House price prediction and salary prediction are common examples of regression problems.
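To make this concrete, here is a minimal sketch (not from the original post; the toy experience-vs-salary numbers are made up) showing that a regression model outputs a continuous number rather than a class label:

```python
# Minimal sketch: a regression model predicting a continuous value (salary)
# from made-up toy data; the numbers here are illustrative only.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]   # years of experience
y = [35, 42, 50, 58, 65]        # salary in thousands

model = LinearRegression()
model.fit(X, y)

# The prediction is a continuous number (roughly 72-73 here), not a category.
print(model.predict([[6]]))
```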

Assumptions of linear regression

Before moving on to how the linear regression algorithm works, you should be aware of the assumptions this algorithm makes before solving a regression problem. There are 3 main assumptions made by the linear regression algorithm 👇

  1. Linearity: The relationship between the dependent and independent variables is linear (or close to linear), so a change in the value of an independent variable produces a corresponding change in the output variable. A residual plot can be used to check for linearity.

  2. No multicollinearity: The independent variables are not highly correlated with each other. The Variance Inflation Factor (VIF) can be used to check for multicollinearity. If the VIF for any independent variable is greater than 10, then there may be a problem with multicollinearity.

  3. Normality: The residuals are normally distributed, i.e. they follow a bell-shaped distribution. The Shapiro-Wilk test can be used to check for normality. If the p-value of the Shapiro-Wilk test is less than 0.05, then the residuals are not normally distributed. (All three checks are illustrated in the sketch after this list.)
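Here is a rough sketch of how all three checks can be run in Python. The dataset and column names are made up for illustration, and I am assuming scikit-learn, statsmodels, SciPy, and matplotlib are available.

```python
# Sketch: checking linearity, multicollinearity, and normality of residuals.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import shapiro
from sklearn.linear_model import LinearRegression
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical housing data: two independent variables and one dependent variable.
rng = np.random.default_rng(0)
df = pd.DataFrame({"area": rng.uniform(500, 2000, 100),
                   "rooms": rng.integers(1, 6, 100)})
df["price"] = 50 * df["area"] + 10000 * df["rooms"] + rng.normal(0, 5000, 100)

X, y = df[["area", "rooms"]], df["price"]
model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# 1. Linearity: the residual plot should show no obvious pattern.
plt.scatter(model.predict(X), residuals)
plt.xlabel("Fitted values"); plt.ylabel("Residuals")
plt.show()

# 2. No multicollinearity: a VIF above ~10 flags a problematic variable.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))

# 3. Normality of residuals: p-value < 0.05 suggests non-normal residuals.
stat, p_value = shapiro(residuals)
print("Shapiro-Wilk p-value:", p_value)
```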

💡
All the above 3 assumptions must be true for both simple and multiple linear regression problems.

Difference between simple and multiple linear regression

Since I mentioned simple and multiple linear regression under the assumptions above, let me explain the difference between these two terms in case you are not already aware of it.

| Simple linear regression | Multiple linear regression |
| --- | --- |
| There is only one independent variable. | There is more than one independent variable. |
| Equation: y = mx + b | Equation: y = ax1 + bx2 + cx3 + ... + d |
👀
Note: In practice you will almost always find yourself solving multiple linear regression problems, so in this blog post we will focus on multiple linear regression.

The Components of Linear Regression

To better understand how the linear regression algorithm works, you should be aware of its various components. There are basically 4 things you must know: dependent variables, independent variables, slope, and intercept. Let's discuss each component one by one.

  • Dependent variable: The dependent variable is the feature or variable that our model predicts. Example: in a house price prediction problem, the price of the house is the dependent variable.

  • Independent variables: Independent variables are all the variables or features used to predict the dependent variable. A change in an independent variable affects the value of the dependent variable, but it does not affect the other independent variables, which is why they are called independent.

  • Slope: The slope in linear regression is a model parameter that reflects the importance of an independent variable for predicting the dependent variable. It also defines the rate at which the dependent variable changes with a unit change in that independent variable.

  • Intercept: The intercept in linear regression is a model parameter that defines a baseline value for the algorithm. This is useful in scenarios where every independent variable multiplied by its weight is zero, but the output should still have some value. For example, say we have a dataset of individuals' experience and salary. If the experience is zero years (like an intern), then without the intercept the predicted salary would be zero, meaning the individual would not be earning any money, which is not realistic. The intercept lets us account for this baseline value and make more accurate predictions (see the short sketch after this list).
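A small sketch tying these components together: after fitting a model, the learned slopes and the intercept can be read directly from it. The feature names and numbers below are made up for illustration.

```python
# Sketch: reading the slopes (one per independent variable) and the intercept
# from a fitted model; the data is a made-up experience/salary example.
from sklearn.linear_model import LinearRegression

X = [[0, 1], [1, 1], [2, 2], [3, 2], [4, 3]]   # [years_experience, num_skills]
y = [30, 37, 46, 53, 62]                        # salary in thousands

model = LinearRegression().fit(X, y)
print("Slopes:", model.coef_)          # importance / rate of change per feature
print("Intercept:", model.intercept_)  # baseline salary when all features are 0
```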

How to find the best fit line

There are 2 different techniques we can use to find the best fit line. In the next couple of minutes we will first discuss a simple technique for finding the best fit line, followed by its drawback, and finally the most efficient technique. So let's get started.

💫
The very first technique we can use for finding the best fit line is called the ordinary least squares method.

Ordinary least squares method

The ordinary least squares (OLS) method is a statistical technique used to find the best fit line for a set of data points. The method works by minimizing the sum of the squared residuals, which are simply the differences between the actual values and the predicted values from the regression line.
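Written out for simple linear regression, with slope m and intercept b, the error (sum of squared residuals) that OLS minimizes is:

$$
E(m, b) = \sum_{i=1}^{n} \big( y_i - (m x_i + b) \big)^2
$$

where $y_i$ is the actual value and $m x_i + b$ is the predicted value for the $i$-th data point.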

If we carefully observe the error equation above, we will see that to minimize the error we can only change the slope (m) and the intercept (b). So we need to find the values of the model parameters (slope and intercept) for which the error is minimum.

Now, in order to find the values of these model parameters that minimize the error, we take the partial derivative of the error with respect to each parameter and set the result equal to 0. To better understand why setting the partial derivatives of the error function with respect to the model parameters to 0 gives the minimum, take a look at the visual below.

Say we have the slope on the x-axis, the intercept on the y-axis, and the error on the z-axis. The error is lowest at the global minimum of this surface, and at the global minimum the gradient (the slope of the error surface) is zero; that is why we set the partial derivatives equal to 0.
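Concretely, taking the partial derivatives of the error function with respect to m and b and setting them to zero gives:

$$
\frac{\partial E}{\partial m} = -2 \sum_{i=1}^{n} x_i \big( y_i - (m x_i + b) \big) = 0,
\qquad
\frac{\partial E}{\partial b} = -2 \sum_{i=1}^{n} \big( y_i - (m x_i + b) \big) = 0
$$

Solving this pair of equations yields the familiar closed-form OLS estimates:

$$
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - m \bar{x}
$$

where $\bar{x}$ and $\bar{y}$ are the means of the independent and dependent variables respectively.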

Short Note

I hope you now have a good understanding of what linear regression is, its assumptions, and how the OLS method works for finding the best fit line. If you liked this blog or have any suggestions, kindly like it or leave a comment below; it would mean a lot to me.

💡
Also, I would love to connect with you, so here are my Twitter and LinkedIn.
