People generally believe that a PC can handle SVD for matrices only up to thousands by thousands. Textbooks also suggest that it is not wise to compute singular vectors one by one. In this post, I’ll refute both statements. In particular, … Continue reading
SGD is well-known for large-scale optimization. In my mind, there are so-far two fundamentally different improvements since the original introduction of SGD: (1) variance reduction, and (2) acceleration. The following picture predicts —in theory— the performance difference between the naive … Continue reading
SGD is well-known for large-scale optimization. In my mind, there are two (and only two) fundamental improvements since the original introduction of SGD: (1) variance reduction, and (2) acceleration. In this guest post at Princeton’s OptiML group, I’d love to conduct … Continue reading
If one wants to compute the minimum enclosing ball (MinEB) of a set of points, would you believe that the running time can be improved by a significant factor if we randomly rotate the space? While seemingly very counter-intuitive because a … Continue reading
The third one is mentioned in this blog post. The other two are coming! PS: the video of the three talks are on YouTube now.
When minimizing a convex function using first-order methods, if full gradients are too costly to compute at each iteration, there are two alternatives that can reduce this per-iteration cost. One is to use a (random) coordinate gradient , and the other is … Continue reading
I am often asked what is the best algorithm to solve SVM, to solve Lasso Regression, to solve Logistic Regression, etc. At the same time, a growing number of first-order methods have been recently proposed, making even experts hard to … Continue reading