Spielman-Teng-B: Spectral Sparsification of Graphs (Part I)

Spielman-Teng’s 2004 paper [ST04] on almost-linear Laplacian linear solver has been divided and re-written into three very technically involved papers [ST08a], [ST08b] and [ST08c]. I am planning to write memos for all of them, while this post is the Part I for the second paper [ST08b: Spectral Sparsification of Graphs].

Outline

The goal of [ST08b] is to construct a re-weighted subgraph of G with \tilde{O}(n/\varepsilon^2) edges that is a (1+\varepsilon)-spectral-approximation to G. This will be used later in [ST08c] to construct an ultra-sparsifier. It uses [ST08a] as a subroutine to construct decompositions.

A useful fact shown in Theorem 6.1 (which I am not going to present in details) is that for a graph with smallest non-zero normalized Laplacian eigenvalue \lambda, one can simply sample edges with probability roughly proportional to 1/\lambda^2, so that the new graph has O(\frac{\log ^2 n}{\varepsilon^2\lambda^2}) |E| number of (reweighted) edges, and \varepsilon-spectrally-approximate the old graph.

As a consequence, to obtain a good sparsifier, the high level idea is to decompose the graph G into well-connected (large conductance) clusters A_1,\dots,A_k, such that the number of inter-cluster edges is \leq |E|/2. Then, we can do subsampling for each such cluster (using Theorem 6.1 above), and in the very end contract all clusters and recurse on a graph with \leq |E|/2 edges. Because there are only logarithmic number of levels, this gives a good sparsifier with almost linear number of edges, at least for unweighted graphs. The modifications needed for weighted graphs are not trivial, and elaborated in details in Section 10.2 of [ST08b], which I do not plan to cover.

BTW, graph decomposition is also called multiway partitioning in the old [ST04] paper.

Conductance: Two Different Notions

Before going into details, there is an important comment to make about conductance in graph G=(V,E). When talking about the conductance of a cut (S,V-S), it is simply \Phi^G(S) = \frac{Cut(S,V-S)}{\min\{vol(S), vol(V-S)\}}. Similarly we can also define that of an induced subgraph B\subseteq V, which is \Phi^G_B(S) = \frac{Cut(S,B-S)}{\min\{vol(S), vol(B-S)\}}. Now comes to an important notion, that is the conductance of a set B, it is \Phi^G_B := \min_{S\subset B} \Phi^G_B(S).

In short, the conductance of a cut measures the connectivity of a cut, while the conductance of a set B gives a lower bound of conductance for all possible cuts that divides the set B. I’d love to thank Adrian Vette and Vahab Mirrokni who made this point clear to me in a different project.

Ideal Graph Decomposition

Theorem 7.1 in this paper actually gives a good graph decomposition. It is a constructive proof but relies on an oracle to exact sparsest cut solver, which is NP-hard. It gives some insight to the final solution so I’d love to present its proof here.

Formally, Theorem 7.1 says says that for any graph G, one can decompose it into clusters A_1,\dots,A_k with at most |E|/2 inter-cluster edges and each cluster has conductance \Omega(1/\log n). This theorem relies on an important lemma:

Lemma 7.2. Given graph G with parameter \phi\leq 1, and given a subset B\subseteq V, let S\subset B be a set maximizing vol(S) among those satisfying: (C.1) vol(S)\leq vol(B)/2 and (C.2) \Phi_B^G(S)\leq \phi. Then, if vol(S)=\alpha \cdot vol(B) for \alpha \leq 1/3, then \Phi^G_{B-S} \geq \phi \cdot \frac{1-3\alpha}{1-\alpha}.

An interpretation of this Lemma 7.2 is that, if one finds the maximum set S with the conductance upper bound \phi, then the other side of the cut B-S will not only have a good cut conductance \leq \phi, but also have a good set conductance \geq \phi \cdot \frac{1-3\alpha}{1-\alpha}.

The main body of [ST08b] will actually use Lemma 7.2 for the case when \alpha is a very small constant, and the proof of Lemma 7.2 is just by contradiction: suppose B-S does not have a large set conductance, then one can find a cut R\subset B-S with small conductance, and one can show that R\cup S also satisfies \Phi_{B}^G(R\cup S) < \phi and contradicting the maximality of S.

Now comes to the final (but ideal) algorithm to decompose G by calling procedure idealDecomp(G,\phi). Procedure idealDecomp(B, \phi) first checks if \Phi_B^G \geq \phi, if so then return B. Otherwise, let S\subset B be the maximum subset satisfying Lemma 7.2. If vol(S)> vol(B)/4 then recurse on both sides by calling idealDecomp(B-S,\phi) and idealDecomp(S,\phi). If vol(S)\leq vol(B)/4, then recurse only on the smaller side, because the larger side is guaranteed by Lemma 7.2 to have a large conductance \Omega(\phi).

This algorithm has O(\log n) number of layers. On each layer, it will add |E|\phi number of edges to the cut. Since we will set \phi = \Theta(1/\log n), the final number of edges on the cut is upper bounded by |E|/2, so Theorem 7.1 is proved.

In my next post tomorrow, I am going to show the more exciting proof of constructing the same graph decomposition, in almost linear time using only approximate (but local) sparsest cut solvers from [ST08a].

This entry was posted in Algorithm, Readings. Bookmark the permalink.

Leave a comment