Multiple Linear Regression

I am a technology enthusiast and always up for challenges. Recently I have started getting my hands dirty with machine learning in Python, and I aspire to learn everything I can.

By Renuka Joshi

Hello all! In the previous article we saw how to implement simple linear regression in machine learning. In this tutorial we will implement multiple linear regression.

Multiple Linear Regression

Multiple linear regression is similar to simple linear regression. In simple linear regression we had one dependent variable and only one independent variable, whereas in multiple linear regression we teach the machine to predict the value of the dependent variable from two or more independent variables. Let’s get started.

Mathematical equation of multiple linear regression:

y = b0 + b1X1 + b2X2 + b3X3 + … + bnXn

Here,

  • y is the dependent variable
  • b0 is the intercept (constant term)
  • b1, b2, …, bn are the coefficients
  • X1, X2, …, Xn are the independent variables
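To make the equation concrete, here is a minimal sketch that evaluates it for a single observation with NumPy. The intercept b0 and the coefficient values are hypothetical numbers chosen only for illustration; the intercept is included on the assumption that the model has a constant term, which scikit-learn's LinearRegression fits by default.

```python
import numpy as np

# Hypothetical intercept and coefficients, chosen only for illustration.
b0 = 2.0
b = np.array([0.5, -1.0, 3.0])  # b1, b2, b3

# One observation with three independent variables X1, X2, X3.
x = np.array([4.0, 1.0, 2.0])

# y = b0 + b1*X1 + b2*X2 + b3*X3
y = b0 + np.dot(b, x)
print(y)  # 9.0
```

Training a model amounts to finding the values of b0 and b1 … bn that best fit the data; prediction is just this weighted sum.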

Dataset

We will use multiple linear regression to predict the salary of employees from their years of experience, total number of certifications, total number of hours worked, and the department in which they work.

Department   WorkedHours  Certification  YearsExperience  Salary
Development  2300         0              1.1              39343
Testing      2100         1              1.3              46205
Development  2104         2              1.5              37731
UX Designer  1200         1              2                43525
Testing      1254         2              2.2              39891
UX Designer  1236         1              2.9              56642
Development  1452         2              3                60150
Testing      1789         1              3.2              54445
UX Designer  1645         1              3.2              64445
UX Designer  1258         0              3.7              57189
Testing      1478         3              3.9              63218
Development  1257         2              4                55794
Development  1596         1              4                56957
Testing      1256         2              4.1              57081
UX Designer  1489         3              4.5              61111
Development  1236         3              4.9              67938
Testing      2311         2              5.1              66029
UX Designer  2245         3              5.3              83088
Development  2365         1              5.9              81363
Development  1500         3              6                93940
Testing      1456         2              6.8              91738
Testing      1760         1              7.1              98273
UX Designer  2400         4              7.9              101302
Development  2148         3              8.2              113812
UX Designer  1450         2              8.7              109431
UX Designer  1000         4              9                105582
Testing      1540         3              9.5              116969
Development  1500         2              9.6              112635
Testing      3000         4              10.3             122391
UX Designer  2100         3              10.5             121872

Data Preprocessing

We will use the data preprocessing template that we created previously.

Here, we have preprocessed our data.
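The template itself is not reproduced in this article. As a rough idea of what such a template might contain, here is a minimal sketch that loads the data, separates the features from the target, and splits them into training and test sets. The column names, the 80/20 split and random_state=0 are assumptions, and the first ten rows of the dataset are inlined so the sketch runs without the CSV file.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# First ten rows of the dataset, inlined so the sketch runs without the file.
csv_text = """Department,WorkedHours,Certification,YearsExperience,Salary
Development,2300,0,1.1,39343
Testing,2100,1,1.3,46205
Development,2104,2,1.5,37731
UX Designer,1200,1,2,43525
Testing,1254,2,2.2,39891
UX Designer,1236,1,2.9,56642
Development,1452,2,3,60150
Testing,1789,1,3.2,54445
UX Designer,1645,1,3.2,64445
UX Designer,1258,0,3.7,57189
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Independent variables (features) and dependent variable (target).
X = dataset.drop(columns=["Salary"])
y = dataset["Salary"]

# Assumed 80/20 split; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

With the real file, the inline text would be replaced by something like pd.read_csv("Employee_Data.csv"). The Department column is still text at this point and has to be encoded before a model can use it.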

Dummy Variables

When we encode categorical data, the scikit-learn library in Python creates a separate column for each category. For example, our dataset ‘Employee_Data.csv’ contains Department as a categorical variable with the values Development, Testing and UX. So when we encode this data, a separate column is created for each of the three categories, as follows.

Development Testing UX
1 0 0
0 1 0
1 0 0
0 0 1
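The same table can be produced in a few lines. This sketch uses pandas.get_dummies rather than a scikit-learn encoder, purely because it is compact; that substitution, and the short category name "UX", are assumptions made for illustration.

```python
import pandas as pd

# The four example rows from the table above.
departments = pd.Series(
    ["Development", "Testing", "Development", "UX"], name="Department"
)

# One 0/1 column per category; dtype=int keeps the values as integers.
dummies = pd.get_dummies(departments, dtype=int)
print(dummies)
```

Each category becomes its own column, and every row has a 1 in exactly one of them.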

While implementing multiple linear regression we will eliminate one dummy variable, because its value can always be inferred from the others. Each row contains the value 1 in exactly one of the dummy columns. For example, in the first row Development = 1, Testing = 0 and UX = 0. So consider the first row:

  • If we remove the Development column, Testing and UX are both 0, which means Development must be 1.
  • If we remove the Testing column, Development is already 1, so Testing must be 0.
  • If we remove the UX column, Development is already 1, so UX must be 0.
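The redundancy described above can be checked directly. In this sketch, again using pandas.get_dummies as an assumed stand-in for the encoder, drop_first=True removes the first category (Development, in alphabetical order), and the dropped column is then recovered from the two that remain.

```python
import pandas as pd

departments = pd.Series(
    ["Development", "Testing", "Development", "UX"], name="Department"
)

# Full encoding: three columns, one per category.
full = pd.get_dummies(departments, dtype=int)

# Reduced encoding: the first category ("Development") is dropped.
reduced = pd.get_dummies(departments, drop_first=True, dtype=int)

# Every row of the full encoding sums to 1, so one column is redundant.
row_sums = full.sum(axis=1)

# The dropped column equals 1 minus the sum of the remaining columns.
recovered = 1 - reduced.sum(axis=1)

print(reduced)
print((recovered == full["Development"]).all())
```

Which column is removed does not matter for the fitted predictions; the dropped category simply becomes the baseline that the other dummy coefficients are measured against.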

 

Multiple Linear Regression Implementation

Now we will implement multiple linear regression in Python.
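The original listing is not available here, so what follows is a minimal sketch of the steps this article describes, not the author's exact code. It assumes pandas.get_dummies for the encoding, an 80/20 split with random_state=0, and the column names shown; the dataset is inlined so the example runs on its own.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# The full dataset from the table above, inlined as CSV text.
csv_text = """Department,WorkedHours,Certification,YearsExperience,Salary
Development,2300,0,1.1,39343
Testing,2100,1,1.3,46205
Development,2104,2,1.5,37731
UX Designer,1200,1,2,43525
Testing,1254,2,2.2,39891
UX Designer,1236,1,2.9,56642
Development,1452,2,3,60150
Testing,1789,1,3.2,54445
UX Designer,1645,1,3.2,64445
UX Designer,1258,0,3.7,57189
Testing,1478,3,3.9,63218
Development,1257,2,4,55794
Development,1596,1,4,56957
Testing,1256,2,4.1,57081
UX Designer,1489,3,4.5,61111
Development,1236,3,4.9,67938
Testing,2311,2,5.1,66029
UX Designer,2245,3,5.3,83088
Development,2365,1,5.9,81363
Development,1500,3,6,93940
Testing,1456,2,6.8,91738
Testing,1760,1,7.1,98273
UX Designer,2400,4,7.9,101302
Development,2148,3,8.2,113812
UX Designer,1450,2,8.7,109431
UX Designer,1000,4,9,105582
Testing,1540,3,9.5,116969
Development,1500,2,9.6,112635
Testing,3000,4,10.3,122391
UX Designer,2100,3,10.5,121872
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Encode Department as dummy variables and drop one of them.
X = pd.get_dummies(
    dataset.drop(columns=["Salary"]),
    columns=["Department"],
    drop_first=True,
    dtype=int,
)
y = dataset["Salary"]

# Assumed 80/20 split; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the model on the training set and predict salaries for the test set.
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

print(regressor.intercept_, regressor.coef_)
```

After fitting, regressor.intercept_ and regressor.coef_ hold the estimated constant term and one coefficient per feature column.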

 

In the code snippet above we used the data preprocessing template and, after splitting the dataset into training and test sets, used the LinearRegression class from sklearn.linear_model exactly as we did in simple linear regression. After executing the code, y_pred holds the predicted values for X_test; that is, y_pred contains the salaries predicted from the data in X_test.

Figure: comparing y_test and y_pred for multiple linear regression
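One way to make this comparison without a figure is to put the actual and predicted values side by side and summarise the fit with the coefficient of determination (R squared). The helper names below are hypothetical, and the numbers passed to them are placeholders for illustration only, not output from the model in this article.

```python
import numpy as np
import pandas as pd


def compare_predictions(y_test, y_pred):
    """Return a table of actual and predicted values with the per-row error."""
    actual = np.asarray(y_test, dtype=float)
    predicted = np.asarray(y_pred, dtype=float)
    return pd.DataFrame(
        {"Actual": actual, "Predicted": predicted, "Error": predicted - actual}
    )


def r_squared(y_test, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(y_test, dtype=float)
    predicted = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Placeholder values for illustration only; not results from the model.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

table = compare_predictions(actual, predicted)
score = r_squared(actual, predicted)
print(table)
print(score)
```

In practice the two functions would be called with y_test and y_pred. An R squared value close to 1 means the predictions track the actual salaries closely.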

 
