— Many of my friends face the following dilemma when teaching basic optimization: which variant (and which proof) of gradient descent should one start with? One has to decide how deeply to dive into convex analysis, and settle questions such as "should I define strong convexity?", "discuss smoothness?", "cover Nesterov acceleration?", and so on.
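To fix ideas, here is a minimal sketch of vanilla gradient descent on a one-dimensional smooth and strongly convex function. The objective `f(x) = (x - 3)^2` and the step size are my own illustrative choices, not taken from the post discussed below; the step `1/L` uses the smoothness constant `L = 2` of this particular `f`.

```python
def gradient_descent(grad, x0, step, iters):
    """Plain gradient descent: repeatedly move against the gradient."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# f(x) = (x - 3)^2 is 2-smooth and 2-strongly convex; grad f(x) = 2(x - 3).
# With step 0.25 the iterates contract geometrically toward the minimizer 3.
x_final = gradient_descent(lambda x: 2 * (x - 3), x0=0.0, step=0.25, iters=100)
print(x_final)  # very close to 3.0
```

For this quadratic, the error shrinks by a factor of 1/2 per iteration, which is exactly the kind of linear convergence rate one proves under strong convexity plus smoothness.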

The following blog post by Elad Hazan and me made this pedagogical question much easier, through *optimal reductions*. Surprisingly, it has had some non-trivial research impact as well.

http://www.minimizingregret.com/2016/05/the-complexity-zoo-and-reductions-in.html