Part 6 - Implementing Support Vector Machines (SVM) in Python
Machine Learning Algorithms Series - Classification with scikit-learn
This article explains how to implement Support Vector Machines (SVM) in Python for classification problems. It covers importing necessary libraries, preparing data, training the model, making predictions, and evaluating performance using accuracy scores and confusion matrices.
Introduction to Support Vector Machines (SVM)
A Support Vector Machine (SVM) is a classification algorithm that finds the hyperplane that best separates the classes in the feature space. Among all separating hyperplanes, it chooses the one that maximizes the margin, that is, the distance between the hyperplane and the nearest training points of each class (the support vectors). This makes SVM a good choice for binary classification, especially when the classes are well separated.
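The margin idea can be written down compactly. The block below is the standard textbook hard-margin formulation, given here only as a sketch: it assumes labels coded as -1 and +1 (the 0/1 labels used in this article would be mapped to that coding), and the symbols w, b, x_i and y_i are notation introduced for this illustration. scikit-learn's SVC solves a soft-margin variant controlled by its C parameter, so this is the idealized case rather than exactly what the library optimizes.

```latex
% Training points x_i with labels y_i \in \{-1, +1\}; hyperplane w^\top x + b = 0.
\begin{aligned}
\min_{w,\,b} \quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to} \quad & y_i \,\bigl(w^\top x_i + b\bigr) \ge 1, \qquad i = 1, \dots, n
\end{aligned}
% The two margin boundaries w^\top x + b = \pm 1 are 2 / \lVert w \rVert apart,
% so minimizing \lVert w \rVert maximizes the margin.
```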
Step-by-Step Implementation
Importing Libraries:
Import the SVC class (Support Vector Classifier) from sklearn.svm.
Import train_test_split from sklearn.model_selection to divide the dataset into training and testing sets.
Import accuracy_score and confusion_matrix from sklearn.metrics to evaluate the model.
Import numpy for numerical operations.
Preparing Data:
Create a NumPy array X representing the hours studied and prior grades of students.
Create a NumPy array y representing the outcomes (0 for fail, 1 for pass).
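As a minimal sketch of what these two arrays might look like, the block below builds ten students with two features each. The numbers are hypothetical values made up for illustration; they are not the article's dataset.

```python
import numpy as np

# Hypothetical values for illustration only; not the article's dataset.
# Each row is one student: [hours studied, prior grade].
X = np.array([
    [1.0, 50.0], [2.0, 55.0], [3.0, 52.0], [4.0, 60.0], [5.0, 58.0],
    [6.0, 70.0], [7.0, 75.0], [8.0, 72.0], [9.0, 85.0], [10.0, 90.0],
])
# One label per row of X: 0 = fail, 1 = pass.
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# scikit-learn expects X as (n_samples, n_features) and y as (n_samples,).
print(X.shape, y.shape)
```

The only structural requirement is that X has one row per student and y has one label per row.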
Splitting the Data:
Use train_test_split to divide the data into training and testing sets.
Specify test_size=0.2 to use 20% of the data for testing and 80% for training.
Set random_state=42 for reproducibility.
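The effect of test_size and random_state can be checked directly. The sketch below uses hypothetical data (ten made-up students, not the article's dataset), confirms the 8/2 split, and shows that repeating the call with the same random_state reproduces the same test set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data for illustration: [hours studied, prior grade] per student.
X = np.array([
    [1.0, 50.0], [2.0, 55.0], [3.0, 52.0], [4.0, 60.0], [5.0, 58.0],
    [6.0, 70.0], [7.0, 75.0], [8.0, 72.0], [9.0, 85.0], [10.0, 90.0],
])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# 20% of 10 samples goes to the test set, the remaining 80% to training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Train:", X_train.shape, "Test:", X_test.shape)

# Calling again with the same random_state yields the identical split.
_, X_test_again, _, y_test_again = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Same test set:", np.array_equal(X_test, X_test_again))
```

With a dataset this small, the test set holds only two samples, so any score computed on it is a rough indication at best.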
Initializing and Training the Model:
Create an instance of the SVC class, specifying the kernel type. For a linear SVM, use kernel='linear'. This fits a linear decision boundary (a hyperplane) between the two classes.
Train the model using the training data (X_train, y_train). The model learns the hyperplane that separates the data points of each class while maximizing the margin between them.
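To see what the model has learned, the fitted attributes can be inspected after training. This is a minimal sketch on hypothetical data (not the article's dataset); the names w, b and margin_width are introduced here for illustration. Note that coef_ is only available when kernel='linear'.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Hypothetical data for illustration: [hours studied, prior grade] per student.
X = np.array([
    [1.0, 50.0], [2.0, 55.0], [3.0, 52.0], [4.0, 60.0], [5.0, 58.0],
    [6.0, 70.0], [7.0, 75.0], [8.0, 72.0], [9.0, 85.0], [10.0, 90.0],
])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a linear SVM on the training portion only.
model = SVC(kernel='linear')
model.fit(X_train, y_train)

# For a linear kernel the hyperplane is w . x + b = 0.
w = model.coef_[0]
b = model.intercept_[0]

# Distance between the two margin boundaries w . x + b = +1 and -1.
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", model.support_vectors_)
print("w:", w, "b:", b)
print("Margin width:", margin_width)
```

Only the support vectors determine the hyperplane; removing any other training point would leave the fitted boundary unchanged.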
Making Predictions:
Use the trained SVM model to make predictions on the test data X_test.
The output y_pred contains the model's predictions (0 or 1).
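The prediction step can be sketched as follows, again on hypothetical data rather than the article's dataset. Alongside predict, the example calls decision_function, which returns a signed score per sample; for a binary SVC, a positive score corresponds to the second class (here 1). The new_student row is a made-up input for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Hypothetical data for illustration: [hours studied, prior grade] per student.
X = np.array([
    [1.0, 50.0], [2.0, 55.0], [3.0, 52.0], [4.0, 60.0], [5.0, 58.0],
    [6.0, 70.0], [7.0, 75.0], [8.0, 72.0], [9.0, 85.0], [10.0, 90.0],
])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = SVC(kernel='linear').fit(X_train, y_train)

# Class labels (0 or 1) for each test sample.
y_pred = model.predict(X_test)

# Signed scores: the sign gives the side of the hyperplane, and a larger
# magnitude means the sample lies further from the decision boundary.
scores = model.decision_function(X_test)

print("Predictions:", y_pred)
print("Decision scores:", scores)

# predict expects a 2D array, even for a single (hypothetical) student.
new_student = np.array([[7.5, 80.0]])
print("New student prediction:", model.predict(new_student))
```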
Evaluating the Model:
Calculate the accuracy of the model by comparing the actual values y_test to the predicted values y_pred using accuracy_score.
Compute the confusion matrix to understand true positives, true negatives, false positives, and false negatives.
Print the accuracy and confusion matrix.
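The evaluation step might look like the sketch below, which runs on hypothetical data (not the article's dataset). It passes labels=[0, 1] to confusion_matrix so that the result is always 2x2, even if the small test set happens to contain only one class, and then unpacks the four cells. The names cm, tn, fp, fn and tp are introduced here for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical data for illustration: [hours studied, prior grade] per student.
X = np.array([
    [1.0, 50.0], [2.0, 55.0], [3.0, 52.0], [4.0, 60.0], [5.0, 58.0],
    [6.0, 70.0], [7.0, 75.0], [8.0, 72.0], [9.0, 85.0], [10.0, 90.0],
])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = SVC(kernel='linear').fit(X_train, y_train)
y_pred = model.predict(X_test)

# Fraction of test samples whose predicted label matches the true label.
accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes, ordered as in labels.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

# For binary labels [0, 1], ravel() yields the cells in this order.
tn, fp, fn, tp = cm.ravel()

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", cm)
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```

Accuracy equals (TN + TP) divided by the total number of test samples, so the confusion matrix shows which kinds of errors sit behind the single accuracy figure.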
Complete Code Example
# Import necessary libraries
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
import numpy as np
# Prepare data
X = np.array([[,], [,], [,], [,], [,], [,], [,], [,], [,], [,]])
y = np.array()
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the SVM classifier
model = SVC(kernel='linear')
# Train the model
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
# Use a separate name so the imported confusion_matrix function is not overwritten
cm = confusion_matrix(y_test, y_pred)
# Print the results
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", cm)
Conclusion
This article demonstrates a complete implementation of the Support Vector Machine classifier, showing how it uses hours studied and prior grades to classify students as passing or failing. The SVM model is trained on one part of the data, tested on the rest, and evaluated using both the accuracy score and the confusion matrix.

