Scikit Linear SVM plot decision boundary for input array of size n
I am currently performing multi-class SVM classification with a linear kernel using Python's scikit-learn library. The sample training and testing data are given below:
Model data:
x = [[20,32,45,33,32,44,0],[23,32,45,12,32,66,11],[16,32,45,12,32,44,23],[120,2,55,62,82,14,81],[30,222,115,12,42,64,91],[220,12,55,222,82,14,181],[30,222,315,12,222,64,111]]
y = [0,0,0,1,1,2,2]
I want to plot the decision boundary and visualize the datasets. Can someone please help me plot this type of data?
The data given above is just mock data, so feel free to change the values. It would be helpful if you could at least suggest the steps to follow. Thanks in advance.
1 answer

You have to choose only 2 features to do this. The reason is that you cannot plot a 7D plot. After selecting the 2 features use only these for the visualization of the decision surface.
Now, the next question you would ask is: how can I choose these 2 features? Well, there are a lot of ways. You could do a univariate F-value (feature ranking) test and see which features/variables are the most important. Then you could use these for the plot. Alternatively, you could reduce the dimensionality from 7 to 2 using PCA, for example.
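As a rough illustration of the univariate F-value idea, here is a minimal sketch that ranks the 7 features of the mock data from the question and keeps the top 2. It assumes scikit-learn's `SelectKBest` with `f_classif` (a one-way ANOVA F-test) is an acceptable ranking method; the variable names `selector`, `X_selected` and `selected_idx` are only illustrative. With just 7 samples the scores are not statistically meaningful, so treat this as a demonstration of the steps rather than a recommendation of specific features.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Mock data from the question: 7 samples, 7 features, 3 classes.
X = np.array([
    [20, 32, 45, 33, 32, 44, 0],
    [23, 32, 45, 12, 32, 66, 11],
    [16, 32, 45, 12, 32, 44, 23],
    [120, 2, 55, 62, 82, 14, 81],
    [30, 222, 115, 12, 42, 64, 91],
    [220, 12, 55, 222, 82, 14, 181],
    [30, 222, 315, 12, 222, 64, 111],
])
y = np.array([0, 0, 0, 1, 1, 2, 2])

# Score every feature with a one-way ANOVA F-test and keep the 2 best.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Column indices (into the original 7 features) of the kept features.
selected_idx = selector.get_support(indices=True)

print("F-score per feature:", selector.scores_)
print("Selected feature indices:", selected_idx)
print("Reduced shape:", X_selected.shape)
```

`X_selected` then plays the role of the 2-column `X` in the plotting code, and `selected_idx` tells you which original features to put on the axis labels.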
2D plot for 2 features and using the iris dataset
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
# Select 2 features / variables for the 2D plot that we are going to create.
X = iris.data[:, :2]  # we only take the first two features.
y = iris.target

def make_meshgrid(x, y, h=.02):
    # Build a grid that covers the data range with a margin of 1 on each side.
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    # Predict the class of every grid point and draw filled contours.
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(X, y)

fig, ax = plt.subplots()
# title for the plots
title = ('Decision surface of linear SVC ')
# Set up grid for plotting.
X0, X1 = X[:, 0], X[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('y label here')
ax.set_xlabel('x label here')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title(title)
plt.show()
EDIT: Apply PCA to reduce dimensionality.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()
X = iris.data
y = iris.target

# Project all features onto the first 2 principal components.
pca = PCA(n_components=2)
Xreduced = pca.fit_transform(X)

def make_meshgrid(x, y, h=.02):
    # Build a grid that covers the data range with a margin of 1 on each side.
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    # Predict the class of every grid point and draw filled contours.
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(Xreduced, y)

fig, ax = plt.subplots()
# Set up grid for plotting.
X0, X1 = Xreduced[:, 0], Xreduced[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title('Decision surface using the PCA transformed/projected features')
plt.show()
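To connect this back to the 7-feature mock data from the question, here is a minimal sketch of the same PCA step applied to it, without the plotting calls. Two things in it are my own assumptions rather than part of the code above: the grid is built with `np.linspace` and a fixed number of points (the hypothetical `n_points = 200`) instead of a step of 0.02, because the mock values span hundreds of units and a 0.02 step would create an impractically large grid; and `explained_variance_ratio_` is printed only as a sanity check on how much variance the 2 components keep.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Mock data from the question: 7 samples, 7 features, 3 classes.
X = np.array([
    [20, 32, 45, 33, 32, 44, 0],
    [23, 32, 45, 12, 32, 66, 11],
    [16, 32, 45, 12, 32, 44, 23],
    [120, 2, 55, 62, 82, 14, 81],
    [30, 222, 115, 12, 42, 64, 91],
    [220, 12, 55, 222, 82, 14, 181],
    [30, 222, 315, 12, 222, 64, 111],
])
y = np.array([0, 0, 0, 1, 1, 2, 2])

# Reduce from 7 features to 2 principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print("Variance kept per component:", pca.explained_variance_ratio_)

# Fit the linear SVM in the 2D projected space.
clf = SVC(kernel="linear").fit(X_reduced, y)

# Grid with a fixed number of points per axis (assumption: 200 is enough
# resolution for a plot), covering the data range with a margin of 1.
n_points = 200
x_min, x_max = X_reduced[:, 0].min() - 1, X_reduced[:, 0].max() + 1
y_min, y_max = X_reduced[:, 1].min() - 1, X_reduced[:, 1].max() + 1
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, n_points),
    np.linspace(y_min, y_max, n_points),
)

# Predicted class for every grid point, shaped like the grid.
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
print("Grid shape:", Z.shape)
```

`xx`, `yy` and `Z` can then be passed to `ax.contourf(xx, yy, Z, ...)`, and `X_reduced` to `ax.scatter`, exactly as in the iris example. Keep in mind that the boundary drawn this way belongs to an SVM trained on the 2 projected components, not to the model trained on all 7 features.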