How to solve classification and regression fast, faster, and fastest

I am often asked what the best algorithm is to solve SVMs, Lasso regression, logistic regression, and so on. At the same time, a growing number of first-order methods have been proposed recently, making it hard even for experts to keep track of the state of the art.

In this guest post on Princeton University’s OptiML blog, I finally have a chance to answer all these questions properly and simultaneously.

http://www.minimizingregret.com/2016/06/how-to-solve-classification-and-regression.html
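Since the post itself is only linked above, here is a minimal baseline sketch (my own toy code, not taken from the post): plain full-batch gradient descent on ℓ2-regularized logistic regression. The fast first-order methods the post surveys are precisely improvements over this kind of baseline.

```python
import numpy as np

def logistic_gd(X, y, lam=0.1, lr=0.1, iters=500):
    """Minimize (1/n) sum_i log(1 + exp(-y_i <w, x_i>)) + (lam/2)||w||^2
    by plain gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)                 # y_i <w, x_i>
        # sigmoid(-margin); clip to avoid overflow warnings in exp
        probs = 1.0 / (1.0 + np.exp(np.clip(margins, -30, 30)))
        grad = -(X * (y * probs)[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

# Synthetic separable data for illustration.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = np.sign(X @ w_true)
w = logistic_gd(X, y)
print(np.mean(np.sign(X @ w) == y))  # training accuracy
```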

Posted in Optimization

The Complexity Zoo and Reductions in Optimization

The following dilemma is encountered by many of my friends when teaching basic optimization: which variant/proof of gradient descent should one start with? Of course, one needs to decide how deeply to dive into convex analysis, and settle questions such as "should I define strong convexity?", "discuss smoothness?", "cover Nesterov acceleration?", and so on.

The following blog post by Elad Hazan and me makes this pedagogical question much easier, through optimal reductions. Surprisingly, it has some non-trivial research impact as well.

http://www.minimizingregret.com/2016/05/the-complexity-zoo-and-reductions-in.html
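To give a flavor of one such reduction (a toy sketch of my own, not the post's optimal version): the classic trick of adding a small strongly convex term (λ/2)||x||² to a smooth convex f, so that gradient descent converges linearly, while choosing λ small keeps the added bias negligible.

```python
import numpy as np

def gd_strongly_convex(grad, x0, L, lam, iters=2000):
    """Gradient descent with step 1/(L + lam) on f(x) + (lam/2)||x||^2,
    where grad computes the gradient of f and L is f's smoothness constant."""
    x = x0.copy()
    for _ in range(iters):
        x -= (grad(x) + lam * x) / (L + lam)
    return x

# Example: f(x) = (1/2)||Ax - b||^2, smooth and convex.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
L = np.linalg.norm(A, 2) ** 2             # smoothness = sigma_max(A)^2
grad_f = lambda x: A.T @ (A @ x - b)

x = gd_strongly_convex(grad_f, np.zeros(10), L, lam=1e-3)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))         # small: near the true optimum
```

The point of the blog post is that such reductions can be made black-box and optimal; this sketch only shows the naive version of the regularization direction.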

Posted in Optimization, Story

Spielman-Teng-A: A Local Clustering Algorithm for Massive Graphs and its Application to Nearly-Linear Time Graph Partitioning

Spielman-Teng’s 2004 paper [ST04] on an almost-linear-time Laplacian linear solver has been divided and rewritten into three very technically involved papers [ST08a], [ST08b], and [ST08c]. I am planning to write memos for all of them; this post is Part I for the first paper [ST08a: A Local Clustering Algorithm for Massive Graphs and its Application to Nearly-Linear Time Graph Partitioning].

Posted in Algorithm, Readings

Spielman-Teng-B: Spectral Sparsification of Graphs (Part II)

Spielman-Teng’s 2004 paper [ST04] on an almost-linear-time Laplacian linear solver has been divided and rewritten into three very technically involved papers [ST08a], [ST08b], and [ST08c]. I am planning to write memos for all of them; this post is Part II for the second paper [ST08b: Spectral Sparsification of Graphs].

Recap & Today’s Goal

Last time I covered the high-level idea for constructing a spectral sparsifier: recursively decompose the graph into well-connected components, and then subsample within each component. I presented Theorem 7.1, which is an idealized proof of the existence of good sparsifiers. It relies on an exact sparsest-cut subroutine, and I am going to relax this assumption to an approximate, almost-linear-time sparsest-cut subroutine, provided in [ST08a].

Posted in Algorithm, Readings

Spielman-Teng-B: Spectral Sparsification of Graphs (Part I)

Spielman-Teng’s 2004 paper [ST04] on an almost-linear-time Laplacian linear solver has been divided and rewritten into three very technically involved papers [ST08a], [ST08b], and [ST08c]. I am planning to write memos for all of them; this post is Part I for the second paper [ST08b: Spectral Sparsification of Graphs].

Outline

The goal of [ST08b] is to construct a re-weighted subgraph of G with \tilde{O}(n/\varepsilon^2) edges that is a (1+\varepsilon)-spectral approximation to G. This will be used later in [ST08c] to construct an ultra-sparsifier. It uses [ST08a] as a subroutine to construct decompositions.
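To make "re-weighted subgraph that spectrally approximates G" concrete, here is a toy sketch of spectral sparsification by effective-resistance sampling (the later Spielman-Srivastava approach, which is simpler to code than [ST08b]'s decompose-and-subsample scheme memoed in these posts). Each edge e is kept with probability proportional to w_e times its effective resistance, and reweighted so the sampled Laplacian is unbiased.

```python
import numpy as np

def laplacian(n, edges, weights):
    """Dense weighted graph Laplacian."""
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def sparsify(n, edges, weights, C=8.0, seed=0):
    """Keep edge e with prob ~ min(1, C * w_e * R_e), reweighted by 1/p."""
    rng = np.random.default_rng(seed)
    Lp = np.linalg.pinv(laplacian(n, edges, weights))
    new_edges, new_weights = [], []
    for (u, v), w in zip(edges, weights):
        # effective resistance R_uv = (e_u - e_v)^T L^+ (e_u - e_v)
        R = Lp[u, u] + Lp[v, v] - 2 * Lp[u, v]
        p = min(1.0, C * w * R)
        if rng.random() < p:
            new_edges.append((u, v))
            new_weights.append(w / p)   # reweight to keep E[L'] = L
    return new_edges, new_weights

# Example: the complete graph K_20, whose nonzero Laplacian eigenvalues
# all equal 20, so the quality of the approximation is easy to eyeball.
n = 20
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
weights = [1.0] * len(edges)
new_edges, new_weights = sparsify(n, edges, weights)
print(len(new_edges), "of", len(edges), "edges kept")
```

This toy version computes a dense pseudo-inverse, so it is nowhere near the almost-linear running time that is the whole point of [ST08a]-[ST08c]; it only illustrates the importance-sampling idea.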

Posted in Algorithm, Readings

Constant Factor Approximation of Vertex-Cuts in Planar Graphs

It has been a long time since I last remembered to write memos for the papers I’ve read, and it’s time for me to come back to this blog, because my memory keeps fading and blogging is a good way to remind me of ideas. I’ve been working on planar graph separators with Christian Sommer for a semester, and have learned many interesting things.

This time I’m going to blog about Amir, Krauthgamer, and Rao’s STOC 2003 result on the vertex-cut problem in planar graphs.

Posted in Algorithm, Readings

Learning Algorithms for DT, DNF, AC0 and r-juntas

This is a section of my reading notes on Prof. Ryan O’Donnell’s Analysis of Boolean Functions. I’ve summarized the learning algorithms in this post, which also serves as a personal reference.

| Class | Random Examples | Membership Queries |
| --- | --- | --- |
| poly-size DTs | {n^{O(\log(n/\varepsilon))}} (Theorem 5) | {\mathrm{poly}(n/\varepsilon)} (Theorem 6) |
| poly-size DNFs | {n^{O(\log(n/\varepsilon))}} (Theorem 7) | {\mathrm{poly}(n/\varepsilon)} ({n^{O(\log\log n)}} in Theorem 11) |
| poly-size depth-d circuits | {n^{O(\log^{d-1}(n/\varepsilon))}} (Theorem 13) | same |
| {\log(n)}-juntas | {n^{0.704\log(n)+O(1)}} (Theorem 14) | {\mathrm{poly}(n)} (implied by Theorem 6) |
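The "Random Examples" column above rests on the Low-Degree Algorithm: estimate every Fourier coefficient \hat{f}(S) = \mathbb{E}[f(x)\chi_S(x)] up to some degree d from samples, then predict with the sign of the truncated Fourier expansion. Here is a toy sketch of that algorithm (my own code, not from the book), run on MAJ_3, whose Fourier expansion has degree 3.

```python
import itertools
import numpy as np

def chi(S, X):
    """Character chi_S(x) = prod_{i in S} x_i, for +/-1-valued rows of X."""
    return np.prod(X[:, S], axis=1) if S else np.ones(len(X))

def low_degree_learn(f, n, d, m=20000, seed=0):
    """Low-Degree Algorithm: empirically estimate \hat{f}(S) for all |S| <= d
    from m uniformly random examples, return the sign of the truncation."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=(m, n))
    Y = f(X)
    coeffs = {}
    for k in range(d + 1):
        for S in itertools.combinations(range(n), k):
            coeffs[S] = np.mean(Y * chi(list(S), X))   # empirical \hat{f}(S)
    def h(X):
        vals = sum(c * chi(list(S), X) for S, c in coeffs.items())
        return np.sign(vals)
    return h

# Example: majority of 3 variables, a simple junta, learnable with d = 3.
maj3 = lambda X: np.sign(X[:, 0] + X[:, 1] + X[:, 2])
h = low_degree_learn(maj3, n=8, d=3)
Xall = np.array(list(itertools.product([-1.0, 1.0], repeat=8)))
print(np.mean(h(Xall) == maj3(Xall)))  # agreement over all 2^8 inputs
```

The n^{O(\log(n/\varepsilon))} bounds in the table come from running this with d = O(\log(n/\varepsilon)), since there are n^{O(d)} coefficients to estimate.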


Posted in Learning Theory, Readings