Hello all! We have learnt about three different forms of regression in previous tutorials. Now we will learn yet another interesting form of regression, i.e. Support Vector Regression (SVR).
Support Vector Regression:
We will perform SVR on the same example that we used in polynomial regression. But before that, we need to apply feature scaling to our dataset, because the SVR class we use does not scale features by default. Feature scaling is the process of transforming variables so that their values lie in a comparable range and no variable dominates the others.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2:3].values
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)
- dataset.iloc[:, 1:2].values and dataset.iloc[:, 2:3].values create X and y as matrices (2D arrays) instead of vectors, which is the shape StandardScaler expects.
- In our example, y (salary) takes values far larger than X (level) and grows sharply as X increases. So the values in y dominate the values in X, and hence we need to scale the data before implementing SVR.
- StandardScaler is the class used for feature scaling, from sklearn.preprocessing.
- sc_X.fit_transform(X) fits the scaler to X and transforms X into scaled data.
- sc_y.fit_transform(y) fits the scaler to y and transforms y into scaled data.
Scaled data is as follows.
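To make concrete what StandardScaler does to a column, here is a minimal sketch. The four values in `levels` are hypothetical toy numbers, not the tutorial's dataset. It checks that the scaler's output matches subtracting the column mean and dividing by the column standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy values, not taken from Position_Salaries.csv.
levels = np.array([[1.0], [2.0], [3.0], [4.0]])

scaler = StandardScaler()
scaled = scaler.fit_transform(levels)

# StandardScaler subtracts the column mean and divides by the
# population standard deviation (NumPy's default, ddof=0).
manual = (levels - levels.mean(axis=0)) / levels.std(axis=0)

print(scaled.ravel())
print(np.allclose(scaled, manual))
```

After scaling, the column has mean 0 and standard deviation 1, which is why X and y end up on comparable ranges.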
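# Fitting SVR to the dataset
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
regressor.fit(X, y.ravel())  # SVR expects a 1D target
# Predicting a new result
y_pred = regressor.predict(sc_X.transform(np.array([[8.3]])))
y_pred = sc_y.inverse_transform(y_pred.reshape(-1, 1))  # back to actual salary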
- Import the SVR class from sklearn.svm.
- Create the regressor as an SVR object with kernel 'rbf', the Gaussian radial basis function kernel, which can model the non-linear relationship in this data.
- Fit the scaled X and y to the regressor object.
- y_pred will be the predicted salary for a newly joined employee at level 8.3.
- np.array([[8.3]]) creates a 2D array with a single row and a single column, since transform expects a 2D input.
- sc_y.inverse_transform converts the scaled salary back into the actual salary.
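The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: `X_raw` and `y_raw` are hypothetical toy data standing in for the level and salary columns, so the printed number is not the tutorial's result. The comments track the array shapes at each step, since shape mismatches are the most common stumbling block here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical toy data standing in for the level and salary columns.
X_raw = np.arange(1, 11, dtype=float).reshape(-1, 1)  # shape (10, 1)
y_raw = (X_raw ** 2) * 1000.0                         # shape (10, 1)

sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X_raw)
y_scaled = sc_y.fit_transform(y_raw)

regressor = SVR(kernel="rbf")
regressor.fit(X_scaled, y_scaled.ravel())  # SVR expects a 1D target

new_level = np.array([[8.3]])                                # shape (1, 1)
pred_scaled = regressor.predict(sc_X.transform(new_level))   # shape (1,)
pred = sc_y.inverse_transform(pred_scaled.reshape(-1, 1))    # shape (1, 1)
print(pred[0, 0])
```

The reshape before inverse_transform matters because the scaler works on 2D arrays, while predict returns a 1D array.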
Let us simply plot the graph for Support Vector Regression now.
# Visualising the SVR results (for higher resolution and smoother curve)
X_grid = np.arange(X.min(), X.max(), 0.01)  # scalar bounds of the scaled X
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color = 'red')
plt.plot(X_grid, regressor.predict(X_grid), color = 'blue')
plt.show()
Note the resulting graph. The SVR curve is plotted on the scaled data, so both axes show scaled values rather than actual levels and salaries. Note also that the curve does not pass through the last observation: SVR treats it as an outlier, too far from the other observations to be taken into account.
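If you would rather read the curve in actual levels and salaries, one option is to build the grid in original units, scale it for prediction, and then inverse-transform the predictions. The sketch below assumes hypothetical toy data (`levels`, `salaries`) in place of the tutorial's dataset and only computes the arrays, so that it runs without a display.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical toy data, not the tutorial's dataset.
levels = np.arange(1, 11, dtype=float).reshape(-1, 1)
salaries = (levels ** 2) * 1000.0

sc_X, sc_y = StandardScaler(), StandardScaler()
regressor = SVR(kernel="rbf")
regressor.fit(sc_X.fit_transform(levels),
              sc_y.fit_transform(salaries).ravel())

# Grid in original units; scale it only for the call to predict.
grid = np.arange(levels.min(), levels.max(), 0.1).reshape(-1, 1)
curve_scaled = regressor.predict(sc_X.transform(grid))
curve = sc_y.inverse_transform(curve_scaled.reshape(-1, 1))

# grid and curve are now both in original units and could be drawn with
# plt.scatter(levels, salaries) and plt.plot(grid, curve).
print(grid.shape, curve.shape)
```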
Note that the predicted salary is 203700, which is close to what the employee asked for, i.e. 190000.
I hope this article helped you understand Support Vector Regression.