See also: slides 13, slides 14, slides 15
Think of continuous loss functions: margin loss, cross-entropy/negative log-likelihood, etc.
linear programming
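Given that the data is linearly separable
So there exists $(w, b)$ such that $y_i (w^\top x_i + b) > 0$ for all $i$
So, after rescaling $(w, b)$, finding a separator is a feasibility problem with linear constraints: $y_i (w^\top x_i + b) \ge 1$ for all $i$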
perceptron
Rosenblatt’s perceptron algorithm
greedy update
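A minimal sketch of the greedy update, assuming labels in {-1, +1} and a hyperplane written as a weight vector `w` plus an offset `b` (these names, the `max_epochs` cap, and the toy data are illustrative assumptions, not taken from the slides). Each mistake moves the hyperplane toward the single misclassified point, without regard to the other points:

```python
def perceptron(points, labels, max_epochs=100):
    """Perceptron sketch for labels in {-1, +1}.

    Returns (w, b, converged). `max_epochs` is a safety cap added here,
    since the loop only terminates on its own for separable data.
    """
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                # Greedy update: correct for this one point only.
                w = [wj + y * xj for wj, xj in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: data is separated
            return w, b, True
    return w, b, False


# Toy linearly separable data, made up for illustration.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w, b, converged = perceptron(X, y)
```

The returned hyperplane separates the data but is not chosen for its margin; it depends on the order in which points are visited.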
SVM
idea: maximize the margin, which makes the classifier more robust to “perturbations”
Euclidean distance between a point $x$ and the hyperplane parametrized by $(w, b)$ is: $|w^\top x + b| / \|w\|_2$
Assuming $\|w\|_2 = 1$, then the distance is $|w^\top x + b|$
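A short derivation of the point-to-hyperplane distance may help here. The symbols are assumptions made for this sketch: $w$ is the normal vector, $b$ the offset, $H = \{z : w^\top z + b = 0\}$ the hyperplane, and $x_p$ the orthogonal projection of $x$ onto $H$.

```latex
\begin{aligned}
x &= x_p + r \,\frac{w}{\|w\|_2}
  && \text{($r$ is the signed distance along the unit normal)} \\
w^\top x + b &= \underbrace{(w^\top x_p + b)}_{=\,0 \text{ since } x_p \in H}
  + r \,\frac{w^\top w}{\|w\|_2} = r \,\|w\|_2 \\
\operatorname{dist}(x, H) &= |r| = \frac{|w^\top x + b|}{\|w\|_2}
\end{aligned}
```

With the normalization $\|w\|_2 = 1$, the denominator drops out and the distance is just $|w^\top x + b|$.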
maximum margin hyperplane
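$(w, b)$ with $\|w\|_2 = 1$ has margin $\gamma$ if $y_i (w^\top x_i + b) \ge \gamma$ for all $i$
Margin: $\gamma = \min_i y_i (w^\top x_i + b)$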
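As a rough illustration of the margin definition, the sketch below computes the smallest signed distance from a labeled dataset to a given hyperplane. The function name `geometric_margin`, the `(w, b)` parametrization, and the toy data are assumptions for this example; the division by the norm of `w` means `w` does not need to be normalized beforehand.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance y_i * (w.x_i + b) / ||w|| over the dataset.

    Positive when the hyperplane separates the data, negative otherwise.
    """
    norm = math.sqrt(sum(wj * wj for wj in w))
    return min(
        y * (sum(wj * xj for wj, xj in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Toy data, made up for illustration; labels are in {-1, +1}.
X = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (-3.0, -1.0)]
y = [1, 1, -1, -1]

m1 = geometric_margin((1.0, 0.0), 0.0, X, y)   # vertical separator x1 = 0
m2 = geometric_margin((1.0, 1.0), 0.0, X, y)   # diagonal separator x1 + x2 = 0
```

Both candidates separate the data, but the first has the larger margin, so a maximum-margin criterion would prefer it over the second.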