Thanks everyone for attending. In this post I’d like to provide a venue for follow-up discussion of my talk on “Recent Advances in Stochastic Convex and Non-Convex Optimization”. The talk website is here: http://people.csail.mit.edu/zeyuan/topics/icml-2017. (Video will be available soon.)
- When discussing “one-point” convexity, I stated that there is a weak version of SVRG (see here and here) that applies to 3 out of the 4 assumptions. As for the remaining one (i.e., the so-called P-L condition), I mistakenly said “it is open whether SVRG applies there”. In fact, the “approximate stationary point” results for SVRG that I described at the end (see here and here) directly apply to one-point convex functions satisfying the P-L condition.
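For readers unfamiliar with it, here is a quick reminder of why this correction holds. The P-L (Polyak-Łojasiewicz) condition with parameter $\mu > 0$ states

$$\|\nabla f(x)\|^2 \;\ge\; 2\mu\,\bigl(f(x) - f(x^*)\bigr) \qquad \text{for all } x,$$

where $x^*$ is a global minimizer. Consequently, any point $x$ with $\|\nabla f(x)\| \le \varepsilon$ satisfies

$$f(x) - f(x^*) \;\le\; \frac{\varepsilon^2}{2\mu},$$

so a guarantee of reaching an $\varepsilon$-approximate stationary point translates directly into a suboptimality guarantee under P-L.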
- I mentioned that a few non-accelerated methods, such as SDCA/SVRG/SAGA, do not have parallel speed-up in the convex setting. As I pointed out in the talk, this statement is for the worst case and without additional assumptions. One should expect that, under suitable data-correlation assumptions, such methods still enjoy certain parallel speed-up. See, for instance, the “ESO assumption” for coordinate descent here.
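For readers who have not seen SVRG before, here is a minimal serial sketch. The least-squares objective, step size, and epoch length are my own illustrative choices, not from the talk; the structure (a full-gradient snapshot plus variance-reduced inner steps) is the standard SVRG template.

```python
import numpy as np

def svrg_least_squares(A, b, lr=0.05, n_epochs=20, seed=0):
    """Minimal SVRG sketch for f(x) = (1/2n) * ||Ax - b||^2.

    Illustrative only: objective, step size lr, and inner-loop
    length are assumptions made for this example.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        x_snap = x.copy()                        # snapshot point
        full_grad = A.T @ (A @ x_snap - b) / n   # full gradient at snapshot
        for _ in range(n):                       # inner loop of length n
            i = rng.integers(n)
            # per-sample gradients at the current point and at the snapshot
            g_i = A[i] * (A[i] @ x - b[i])
            g_i_snap = A[i] * (A[i] @ x_snap - b[i])
            # variance-reduced stochastic gradient estimate
            x = x - lr * (g_i - g_i_snap + full_grad)
    return x
```

The key design point is the correction term `- g_i_snap + full_grad`: it is zero in expectation, so the update is still an unbiased gradient step, but its variance shrinks as `x` approaches `x_snap`, which is what lets SVRG use a constant step size.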
Should you have more questions regarding the talk, please don’t hesitate to drop me an email, or leave your comments here.