Thanks everyone for attending. In this post I’d like to provide a platform for future discussions of my talk on “Recent Advances in Stochastic Convex and Non-Convex Optimization”. The link for the talk website is here: http://people.csail.mit.edu/zeyuan/topics/icml-2017, and the video (with … Continue reading
People generally believe that a PC can handle SVD for matrices only up to thousands by thousands. Textbooks also suggest that it is not wise to compute singular vectors one by one. In this post, I’ll refute both statements. In particular, … Continue reading
SGD is well-known for large-scale optimization. In my mind, there are so-far two fundamentally different improvements since the original introduction of SGD: (1) variance reduction, and (2) acceleration. The following picture predicts —in theory— the performance difference between the naive … Continue reading
SGD is well-known for large-scale optimization. In my mind, there are two (and only two) fundamental improvements since the original introduction of SGD: (1) variance reduction, and (2) acceleration. In this guest post at Princeton’s OptiML group, I’d love to conduct … Continue reading
If one wants to compute the minimum enclosing ball (MinEB) of a set of points, would you believe that the running time can be improved by a significant factor if we randomly rotate the space? While seemingly very counter-intuitive because a … Continue reading
The third one is mentioned in this blog post. The other two are coming! PS: the video of the three talks are on YouTube now.
When minimizing a convex function using first-order methods, if full gradients are too costly to compute at each iteration, there are two alternatives that can reduce this per-iteration cost. One is to use a (random) coordinate gradient , and the other is … Continue reading