Q&A 12 How do you train a support vector machine (SVM) model?

12.1 Explanation

A Support Vector Machine (SVM) is a powerful classification algorithm that finds the separating hyperplane with the largest margin, i.e., the greatest distance to the nearest training points of each class. With kernel functions such as the radial basis function (RBF), it can also learn non-linear decision boundaries. It’s particularly effective for small to medium datasets with clear class separation.
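
Formally, the soft-margin SVM solves the following optimization problem (this is the standard formulation; the regularization constant C is the same parameter exposed as C in scikit-learn and cost in e1071):

\[
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad y_i \left( w^\top \phi(x_i) + b \right) \ge 1 - \xi_i,\ \ \xi_i \ge 0,
\]

where \(\phi\) is the feature map implied by the chosen kernel and the slack variables \(\xi_i\) allow some points to violate the margin.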

We’ll train an SVM on the Titanic dataset using preprocessed features.


12.2 Python Code

# Train an SVM in Python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load and preprocess
df = pd.read_csv("data/titanic.csv")
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])
df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})
df = pd.get_dummies(df, columns=['Embarked'], drop_first=True)

# Features and target
X = df[['Pclass', 'Sex', 'Age', 'Fare', 'Embarked_Q', 'Embarked_S']]
y = df['Survived']

# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVM
svm_model = SVC(kernel='rbf', C=1.0, gamma='scale')  # written out explicitly, but these are SVC's default settings
svm_model.fit(X_train, y_train)

# Predict and evaluate
y_pred = svm_model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
Accuracy: 0.6536312849162011
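
Note: this accuracy is noticeably lower than the R result in the next section. One likely reason is feature scaling: scikit-learn's SVC does not standardize its inputs, whereas e1071's svm() does so by default (scale = TRUE), and RBF SVMs are sensitive to feature scale. A minimal sketch of adding standardization via a pipeline, reusing the split from above:

# Standardize features before the SVM (continues from the code above)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

svm_scaled = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
svm_scaled.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, svm_scaled.predict(X_test)))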

12.3 R Code

# Train an SVM model in R
library(readr)
library(dplyr)
library(fastDummies)
library(caret)
library(e1071)

# Load and preprocess
df <- read_csv("data/titanic.csv")
df$Age[is.na(df$Age)] <- median(df$Age, na.rm = TRUE)
mode_embarked <- names(sort(table(df$Embarked), decreasing = TRUE))[1]
df$Embarked[is.na(df$Embarked)] <- mode_embarked
df$Sex <- ifelse(df$Sex == "male", 0, 1)
df <- fastDummies::dummy_cols(df, select_columns = "Embarked", remove_first_dummy = TRUE, remove_selected_columns = TRUE)

# Feature and target
features <- df %>% select(Pclass, Sex, Age, Fare, Embarked_Q, Embarked_S)
target <- df$Survived

# Split
set.seed(42)
split_index <- createDataPartition(target, p = 0.8, list = FALSE)
X_train <- features[split_index, ]
X_test <- features[-split_index, ]
y_train <- target[split_index]
y_test <- target[-split_index]

# Train SVM
svm_model <- svm(x = X_train, y = as.factor(y_train), kernel = "radial", cost = 1)  # e1071 scales features by default (scale = TRUE)

# Predict and evaluate
y_pred <- predict(svm_model, X_test)
confusionMatrix(y_pred, as.factor(y_test))
Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0 108  27
         1   6  37
                                          
               Accuracy : 0.8146          
                 95% CI : (0.7496, 0.8688)
    No Information Rate : 0.6404          
    P-Value [Acc > NIR] : 2.805e-07       
                                          
                  Kappa : 0.5662          
                                          
 Mcnemar's Test P-Value : 0.0004985       
                                          
            Sensitivity : 0.9474          
            Specificity : 0.5781          
         Pos Pred Value : 0.8000          
         Neg Pred Value : 0.8605          
             Prevalence : 0.6404          
         Detection Rate : 0.6067          
   Detection Prevalence : 0.7584          
      Balanced Accuracy : 0.7627          
                                          
       'Positive' Class : 0               
                                          

✅ Takeaway: SVMs are effective for non-linear classification and work well with medium-sized feature spaces. Scale your features first (e1071's svm() does this by default; scikit-learn's SVC does not), then tune the kernel, cost (C), and gamma parameters, as sketched below.
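
A minimal sketch of that tuning with scikit-learn's GridSearchCV, continuing from the Python code above (the parameter grid is just an illustrative starting point, not a recommended search space):

# Grid-search kernel, C, and gamma with 5-fold cross-validation
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
param_grid = {
    'svc__kernel': ['linear', 'rbf'],
    'svc__C': [0.1, 1, 10],
    'svc__gamma': ['scale', 0.01, 0.1],
}
grid = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)
print("Best params:", grid.best_params_)
print("Test accuracy:", grid.best_estimator_.score(X_test, y_test))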