Naive Bayes Classification

The following two tabs change content below.
Prasad Kharkar is a java enthusiast and always keen to explore and learn java technologies. He is SCJP,OCPWCD, OCEJPAD and aspires to be java architect.

Latest posts by Prasad Kharkar (see all)

Hello all, welcome to another machine learning tutorial. Here, we will learn about Naive Bayes classification model. This article is quite similar to all previous classification articles because we are simply using new python libraries for classifiers and we are not changing the way data preprocessing and graphs are plotted.

Naive Bayes Classification:

Naive bayes classification uses bayes theorem to determine class of new data points.

Consider our example  from logistic regresesion, where we want to know whether a new user will buy the car or not.

Data Preprocessing:

  • We read “Social_Network_Ads.csv” file and stored in dataset.
  • Extracted age and salary information from dataset and stored in X
  • Extracted purchase information from dataset and stored in Y.
  • Split dataset in training and test set so that machine can be trained using X_train and Y_train
  • Used feature scaling for X_train.

Naive Bayes Classification:

  • Imported GaussianNB from sklearn.naive_bayes
  • Created a classifier and fitted training set to it.

Plotting the Graph:

  • aranged_ages variable will have scaled ages of users starting from minimum age to maximum age incremented by 0.01.
  • aranged_salaries variable will have scaled salaries of users starting from minimum salary to maximum salary incremented by 0.01.
  • np.meshgrid() takes aranged_ages and aranged_salaries to form X1 and X2.
  • X1 and X2 are used for creating a graph which classifies all data points using naive bayes classification. It is done using plt.contourf(), method.
Naive Bayes Classification

Naive Bayes Classification

  • Naive bayes classification draws a curve and creates orange and blue sections
  • orange section is of users who will not buy the car and blue section is for users who will buy the car

Plotting Test set:

Above code plots actual data points in classification.

  • Red points denote users who did not buy the car
  • Green points denote users who bought the car.
Naive Bayes Results

Naive Bayes Results

Note that we have plotted 100 observations from our test set and out of them

  • 7 green points are observed on orange area
  • 3 red points are observed in blue area

This means, out of 100 observation points, Naive Bayes  classification predicted 90 results correctly and only 10 are incorrect.

I hope this helped. Happy learning 🙂


Share Button

Leave a Reply

Your email address will not be published. Required fields are marked *