input: γ (lr), θ_0 (params), f(θ) (objective), λ (weight decay),
       μ (momentum), τ (dampening), nesterov, maximize
for t = 1 to ... do
    g_t ← ∇_θ f_t(θ_{t−1})
    if λ ≠ 0 then
        g_t ← g_t + λ θ_{t−1}
    end if
    if μ ≠ 0 then
        if t > 1 then
            b_t ← μ b_{t−1} + (1 − τ) g_t
        else
            b_t ← g_t
        end if
        if nesterov then
            g_t ← g_t + μ b_t
        else
            g_t ← b_t
        end if
    end if
    if maximize then
        θ_t ← θ_{t−1} + γ g_t
    else
        θ_t ← θ_{t−1} − γ g_t
    end if
end for
return θ_t
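
The update rule above can be sketched in plain Python as follows. This is a minimal illustration, not a library implementation: it assumes the weight-decay and momentum branches run only when λ and μ are nonzero, it represents the parameters as a flat list of floats, and the names sgd, grad_fn and steps are hypothetical, introduced here only for the example.

```python
from typing import Callable, List, Optional


def sgd(
    grad_fn: Callable[[List[float]], List[float]],  # hypothetical: returns the gradient of f at theta
    theta0: List[float],
    lr: float,
    steps: int,  # hypothetical: number of iterations, since the loop bound is left open above
    weight_decay: float = 0.0,
    momentum: float = 0.0,
    dampening: float = 0.0,
    nesterov: bool = False,
    maximize: bool = False,
) -> List[float]:
    theta = list(theta0)
    buf: Optional[List[float]] = None  # momentum buffer b_t, unset before the first step

    for _ in range(steps):
        g = grad_fn(theta)

        # Weight decay: g_t <- g_t + lambda * theta_{t-1}
        if weight_decay != 0:
            g = [gi + weight_decay * ti for gi, ti in zip(g, theta)]

        if momentum != 0:
            if buf is None:
                # First step: b_1 <- g_1 (dampening is not applied here)
                buf = list(g)
            else:
                # b_t <- mu * b_{t-1} + (1 - tau) * g_t
                buf = [momentum * bi + (1 - dampening) * gi for bi, gi in zip(buf, g)]

            if nesterov:
                # g_t <- g_t + mu * b_t
                g = [gi + momentum * bi for gi, bi in zip(g, buf)]
            else:
                # g_t <- b_t
                g = buf

        # Ascend when maximizing, descend otherwise
        sign = 1.0 if maximize else -1.0
        theta = [ti + sign * lr * gi for ti, gi in zip(theta, g)]

    return theta


if __name__ == "__main__":
    # Minimize f(theta) = theta^2, whose gradient is 2 * theta.
    print(sgd(lambda th: [2.0 * x for x in th], [1.0], lr=0.1, steps=50, momentum=0.9))
```

The momentum buffer is kept as None until the first step so that the "t > 1" test in the pseudocode becomes a simple check on whether a buffer already exists.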