Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x
(a 2D mini-batch Tensor) and output y (which is a 1D tensor of target class indices, 0≤y≤x.size(1)−1):
For each mini-batch sample, the loss in terms of the 1D input x and the scalar output y is:
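\[
\text{loss}(x, y) = \frac{\sum_{i} \max(0, \text{margin} - x[y] + x[i])^{p}}{x.\text{size}(0)}
\]
where \(i \in \{0, \ldots, x.\text{size}(0) - 1\}\) and \(i \neq y\).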
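To make the per-sample formula concrete, here is a minimal pure-Python sketch of it, not the library's actual implementation. The function names `multi_margin_loss_single` and `multi_margin_loss_batch` are hypothetical, the defaults `p=1` and `margin=1.0` are assumed for illustration, and averaging over the mini-batch is likewise an assumption, since the text above only defines the loss for a single sample.

```python
def multi_margin_loss_single(x, y, p=1, margin=1.0):
    """Hinge loss for one sample.

    x: list of per-class scores (the 1D input), y: target class index.
    p and margin defaults are assumptions made for this sketch.
    """
    n_classes = len(x)  # plays the role of x.size(0) for a 1D input
    if not 0 <= y < n_classes:
        raise ValueError("target index out of range")
    total = 0.0
    for i in range(n_classes):
        if i == y:
            continue  # the target class itself is excluded from the sum
        total += max(0.0, margin - x[y] + x[i]) ** p
    return total / n_classes


def multi_margin_loss_batch(xs, ys, p=1, margin=1.0):
    """Mean of the per-sample losses (mean reduction is an assumption)."""
    losses = [multi_margin_loss_single(x, y, p, margin) for x, y in zip(xs, ys)]
    return sum(losses) / len(losses)


scores = [0.1, 0.2, 0.4, 0.8]  # one sample with four class scores
target = 3                     # index of the correct class
loss = multi_margin_loss_single(scores, target)
print(round(loss, 4))  # 0.325
```

With the target score at 0.8 and a margin of 1, the three non-target classes contribute roughly 0.3, 0.4 and 0.6, so the sum of about 1.3 divided by the four classes gives about 0.325. A sample whose target score beats every other score by at least the margin contributes zero loss.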