Q&A 24 How do you evaluate clustering quality using silhouette score and ARI?

24.1 Explanation

Unlike supervised models, clustering has no ground-truth labels to score against, so evaluation relies on internal or external metrics:

  • Silhouette Score: for each point, compares the mean distance to its own cluster, a, with the mean distance to the nearest other cluster, b, as (b - a) / max(a, b); values range from -1 to 1, with higher values meaning better-separated clusters.
  • Adjusted Rand Index (ARI): compares predicted clusters with true labels (when available), adjusted for chance agreement; 1 means identical partitions, while values near 0 mean the match is no better than random.

Use silhouette when ground truth is unavailable; use ARI when you have labeled data.
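
Before the chapter's code below, the self-contained sketch that follows shows both metrics on synthetic data (the make_blobs setup and the choice of three clusters are illustrative assumptions, not part of this chapter's dataset); because the generating labels are known, silhouette and ARI can be compared side by side.

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

# Synthetic data with known generating labels (illustrative only)
X_demo, y_true_demo = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)
labels_demo = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_demo)

# Internal metric: needs only the data and the cluster assignments
print(f"Silhouette Score: {silhouette_score(X_demo, labels_demo):.2f}")
# External metric: requires the true labels
print(f"Adjusted Rand Index: {adjusted_rand_score(y_true_demo, labels_demo):.2f}")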

24.2 Python Code

from sklearn.metrics import silhouette_score, adjusted_rand_score

# Silhouette Score (internal metric; X_pca and labels come from the earlier PCA and k-means steps)
score = silhouette_score(X_pca, labels)
print(f"Silhouette Score: {score:.2f}")

# ARI (external metric, only if ground truth exists; Survived serves as an illustrative reference labeling)
true_labels = df["Survived"].values
ari = adjusted_rand_score(true_labels, labels)
print(f"Adjusted Rand Index: {ari:.2f}")
Silhouette Score: 0.46
Adjusted Rand Index: 0.01
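
Because the silhouette score needs no labels, it can also guide the choice of the number of clusters. A minimal sketch, assuming X_pca from the earlier PCA step is still in scope:

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Compare candidate cluster counts by their mean silhouette score (higher is better)
for k in range(2, 7):
    candidate_labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_pca)
    print(f"k={k}: silhouette = {silhouette_score(X_pca, candidate_labels):.2f}")

The k with the highest silhouette is a reasonable default, though it should still be sanity-checked against domain knowledge.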

24.3 R Code

library(cluster)
library(mclust)

# Silhouette Score (km and X_pca come from the earlier k-means and PCA steps)
sil <- silhouette(km$cluster, dist(X_pca))
mean_sil <- mean(sil[, 3])
print(paste("Silhouette Score:", round(mean_sil, 2)))
[1] "Silhouette Score: 0.46"
# ARI (if true labels available)
ari_score <- adjustedRandIndex(df$Survived, km$cluster)
print(paste("Adjusted Rand Index:", round(ari_score, 2)))
[1] "Adjusted Rand Index: 0.01"

✅ Takeaway: The silhouette score reveals internal clustering quality without any labels; ARI is useful when comparing clusters to known labels (here, the near-zero ARI indicates the clusters align poorly with Survived).