Thanks, everyone, for attending. In this post I’d like to provide a venue for follow-up discussion of my talk, “Recent Advances in Stochastic Convex and Non-Convex Optimization”. The talk website is here: http://people.csail.mit.edu/zeyuan/topics/icml-2017, and the video (my personal recording) is already online.
- When discussing “one-point” convexity, I stated that there is a weak version of SVRG (see here and here) that applies under 3 of the 4 assumptions. As for the remaining one (i.e., the so-called P-L condition), I mistakenly said “it is open whether SVRG applies there”. In fact, the “approximate stationary point” results for SVRG that I described at the end (see here and here) directly apply to one-point convex functions satisfying the P-L condition.
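For readers less familiar with these conditions, here are the standard definitions (one common form each; the talk itself may use slightly different constants), for a function $f$ with minimizer $x^*$:

```latex
% One-point (star) convexity toward the minimizer x^*:
\langle \nabla f(x),\, x - x^* \rangle \;\ge\; \mu \,\|x - x^*\|^2

% Polyak-Lojasiewicz (P-L) condition:
\|\nabla f(x)\|^2 \;\ge\; 2\mu\,\bigl(f(x) - f(x^*)\bigr)
```

Under the P-L condition, any approximate stationary point is approximately optimal, since $f(x) - f(x^*) \le \|\nabla f(x)\|^2 / (2\mu)$; this is why “approximate stationary point” guarantees transfer directly.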
- I mentioned that a few non-accelerated methods, such as SDCA/SVRG/SAGA, do not have parallel speed-up in the convex setting. As I pointed out in the talk, this statement is about the worst case, without additional assumptions. One should expect that, under some data-correlation assumption, such methods still enjoy a certain parallel speed-up. See, for instance, the “ESO assumption” for coordinate descent here.
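For concreteness, here is a minimal sketch of the (serial) SVRG update referenced above, applied to a toy least-squares problem. The function names and the quadratic objective are my own illustration, not from the talk:

```python
import random

def svrg(grad_i, n, x0, eta=0.1, epochs=20, m=50):
    """SVRG for minimizing (1/n) * sum_i f_i(x) in one dimension.

    Each epoch computes the full gradient at a snapshot, then takes m
    stochastic steps using the variance-reduced gradient estimator
    grad_i(i, x) - grad_i(i, snap) + full_grad(snap), which stays
    unbiased while its variance shrinks as x approaches the snapshot.
    """
    random.seed(0)
    x = x0
    for _ in range(epochs):
        snap = x
        full = sum(grad_i(i, snap) for i in range(n)) / n  # full gradient at snapshot
        for _ in range(m):
            i = random.randrange(n)
            g = grad_i(i, x) - grad_i(i, snap) + full      # variance-reduced gradient
            x = x - eta * g
    return x

# Toy problem: f_i(x) = (x - a_i)^2 / 2, so the minimizer is the mean of the a_i.
a = [1.0, 2.0, 3.0, 6.0]
x_star = svrg(lambda i, x: x - a[i], len(a), x0=0.0)
```

The snapshot/full-gradient structure is exactly what makes naive parallelization unhelpful in the worst case: the inner steps are sequential by design.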
- Towards the end of the talk, I mentioned that one can use online eigenvector algorithms to find local minima faster. Here is a paper I wrote after the conference that makes this concrete: it gives an online algorithm that finds approximate local minima faster than SGD.
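The classic example of an online eigenvector method is Oja’s rule, which estimates the top eigenvector of a covariance matrix from a stream of samples. This sketch is only meant to illustrate the “online eigenvector” primitive, not the specific algorithm in the paper:

```python
import math
import random

def oja_top_eigenvector(samples, eta=0.1):
    """Estimate the top eigenvector of E[x x^T] from a sample stream
    using Oja's rule: w <- normalize(w + eta * (x . w) * x)."""
    random.seed(0)
    dim = len(samples[0])
    w = [random.gauss(0, 1) for _ in range(dim)]   # random initialization
    for x in samples:
        dot = sum(xi * wi for xi, wi in zip(x, w))
        w = [wi + eta * dot * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]                # project back to the unit sphere
    return w

# Synthetic stream whose covariance has dominant direction v = (3, 1)/sqrt(10).
random.seed(1)
v = (3 / math.sqrt(10), 1 / math.sqrt(10))
samples = []
for _ in range(2000):
    s = random.gauss(0, 1)      # strong signal along v
    n = random.gauss(0, 0.1)    # weak noise along the orthogonal direction
    samples.append((s * v[0] - n * v[1], s * v[1] + n * v[0]))

w = oja_top_eigenvector(samples)
alignment = abs(w[0] * v[0] + w[1] * v[1])  # |cos| of the angle to the true direction
```

In the non-convex setting, such a primitive is applied to (implicit) Hessian-vector products to detect negative-curvature directions and escape saddle points.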
Should you have more questions about the talk, please don’t hesitate to drop me an email or leave a comment here.