Lecture Notes of 21/10/2020 and 22/10/2020 by Himanshu Bimal



Support Vector Machines (SVM) and Kernel trick

What is a Support Vector Machine?

A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for classification tasks. The idea underlying SVMs is to find the hyperplane that separates a given dataset into two classes in the best possible manner.

Problem of Non-linear decision boundary (hyperplane)

In many classification problems the classes cannot be separated by a linear decision boundary (hyperplane). In such cases a support vector machine can still perform the classification by producing non-linear boundaries, which are constructed as linear boundaries in a higher-dimensional, transformed version of the feature space.

Working of a Support Vector Machine

A support vector machine (SVM) constructs a hyperplane, or a set of hyperplanes, in a high-dimensional space. A "good" separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class; that is, we try to find the decision boundary that maximizes the margin. We do this in order to keep the generalization error as small as possible, since, in general, the larger the margin, the lower the generalization error of the classifier.
The problem of a non-linear decision boundary is solved by using the kernel trick.
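The maximum-margin idea above can be sketched with scikit-learn's SVC on toy data (a minimal sketch, assuming scikit-learn is installed; the points below are made up for illustration, not taken from the lecture):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D (illustrative data)
X = np.array([[1, 1], [2, 1], [1, 2],    # class 0
              [5, 5], [6, 5], [5, 6]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# kernel='linear' fits a single separating hyperplane w.x + b = 0;
# a large C keeps the margin "hard" (few violations allowed)
clf = SVC(kernel='linear', C=1e3).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
margin = 2 / np.linalg.norm(w)   # width of the margin between the classes
print(clf.predict([[2, 2], [6, 6]]))  # -> [0 1]
```

Among all hyperplanes that separate the two clusters, the fitted one is the one for which `margin` is largest.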

The Kernel Trick

In the ideal case the data is linearly separable, and we can find a separating hyperplane that divides it into two classes. In many practical situations, however, the data is not linearly separable. In such cases we use the kernel trick, which maps (transforms) the input data non-linearly into a higher-dimensional space in which the transformed data can be separated linearly. In layman's terms, the kernel trick is what allows the SVM to form non-linear boundaries.
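A minimal numerical sketch of this idea (the feature map phi(x) = (x, x^2) and the matching kernel below are illustrative choices of ours, not from the lecture):

```python
import numpy as np

# 1-D data: class 1 sits between two groups of class 0, so no single
# threshold (a linear boundary in 1-D) can separate them.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 1, 1, 1, 0, 0])

# Explicit non-linear map phi(x) = (x, x^2): in 2-D the classes become
# linearly separable (class 1 has x^2 <= 1, class 0 has x^2 >= 4).
phi = np.column_stack([x, x ** 2])

# The kernel trick avoids building phi explicitly: for this particular map,
# k(a, b) = phi(a).phi(b) = a*b + (a*b)^2 can be evaluated directly on the
# original 1-D inputs.
def k(a, b):
    return a * b + (a * b) ** 2

a, b = 2.0, 3.0
print(k(a, b), np.dot([a, a ** 2], [b, b ** 2]))  # both give 42.0
```

An SVM never needs the columns of `phi`; it only ever needs inner products between mapped points, which is exactly what the kernel function supplies.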

Radial Basis Function (Gaussian Kernel)

The radial basis function (RBF) is a popular kernel, also known as the Gaussian kernel. We define this kernel as $$ k(X,Y) = \exp\Big(-\frac{\|X-Y\|^2}{2\sigma^2}\Big) $$ or, equivalently, $$ k(X,Y) = \exp\big(-\gamma \|X-Y\|^2\big) $$ where $$ \gamma = \frac{1}{2\sigma^2} $$ and $$\|\cdot\|$$ denotes the Euclidean norm.
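The formula above can be implemented directly with NumPy (a sketch; the function name and the toy points are our own):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian/RBF kernel k(x, y) = exp(-gamma * ||x - y||^2),
    evaluated for every pair of rows of X and Y."""
    # Squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2 * X @ Y.T)
    # Clip tiny negatives caused by floating-point rounding
    return np.exp(-gamma * np.maximum(sq, 0))

X = np.array([[0.0, 0.0], [1.0, 0.0]])
print(rbf_kernel(X, X, gamma=0.5))
# Diagonal entries are k(x, x) = 1; the off-diagonal entry is
# exp(-0.5 * 1) since the two points are a distance 1 apart.
```

Note that k(X, Y) depends on X and Y only through their distance, which is why the kernel is called "radial".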

Support vectors in SVM

The support vectors are the data points that lie closest to the decision boundary (hyperplane); they are the data points that are most difficult to classify. As the name 'support' suggests, the decision boundary is supported by these vectors: moving or removing them would change the boundary, whereas the remaining points could be moved or removed without affecting it. The entire SVM model is therefore determined by the support vectors.
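As a sketch (assuming scikit-learn; the toy points are our own), the fitted SVC object exposes these points directly:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [0.9, 1.1], [1, 1],   # class 0; the last two lie near the boundary
              [2, 2], [2.1, 1.9], [3, 3]])  # class 1; the first two lie near the boundary
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0).fit(X, y)

print(clf.support_vectors_)  # only the points closest to the hyperplane
print(clf.n_support_)        # number of support vectors per class
```

Only the rows listed in `support_vectors_` enter the decision function; refitting after deleting any other training point leaves the boundary unchanged.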

Advantages and Disadvantages of SVM

SVMs are effective in high-dimensional feature spaces, are memory efficient (the decision function depends only on the support vectors), and are flexible, since different kernel functions can be plugged in. On the other hand, training does not scale well to very large datasets, SVMs do not directly provide probability estimates, and performance is sensitive to the choice of kernel and its parameters (such as the kernel width and the regularization strength).
