Spielman-Teng-B: Spectral Sparsification of Graphs (Part I)

Spielman-Teng’s 2004 paper [ST04] on almost-linear Laplacian linear solver has been divided and re-written into three very technically involved papers [ST08a], [ST08b] and [ST08c]. I am planning to write memos for all of them, while this post is the Part I for the second paper [ST08b: Spectral Sparsification of Graphs].

Outline

The goal of [ST08b] is to construct a re-weighted subgraph of $G$ with $\tilde{O}(n/\varepsilon^2)$ edges that is a $(1+\varepsilon)$ -spectral-approximation to $G$ . This will be used later in [ST08c] to construct an ultra-sparsifier. It uses [ST08a] as a subroutine to construct decompositions.

A useful fact shown in Theorem 6.1 (which I am not going to present in details) is that for a graph with smallest non-zero normalized Laplacian eigenvalue $\lambda$ , one can simply sample edges with probability roughly proportional to $1/\lambda^2$ , so that the new graph has $O(\frac{\log ^2 n}{\varepsilon^2\lambda^2}) |E|$ number of (reweighted) edges, and $\varepsilon$ -spectrally-approximate the old graph.

As a consequence, to obtain a good sparsifier, the high level idea is to decompose the graph $G$ into well-connected (large conductance) clusters $A_1,\dots,A_k$ , such that the number of inter-cluster edges is $\leq |E|/2$ . Then, we can do subsampling for each such cluster (using Theorem 6.1 above), and in the very end contract all clusters and recurse on a graph with $\leq |E|/2$ edges. Because there are only logarithmic number of levels, this gives a good sparsifier with almost linear number of edges, at least for unweighted graphs. The modifications needed for weighted graphs are not trivial, and elaborated in details in Section 10.2 of [ST08b], which I do not plan to cover.

BTW, graph decomposition is also called multiway partitioning in the old [ST04] paper.

Conductance: Two Different Notions

Before going into details, there is an important comment to make about conductance in graph $G=(V,E)$ . When talking about the conductance of a cut $(S,V-S)$ , it is simply $\Phi^G(S) = \frac{Cut(S,V-S)}{\min\{vol(S), vol(V-S)\}}$ . Similarly we can also define that of an induced subgraph $B\subseteq V$ , which is $\Phi^G_B(S) = \frac{Cut(S,B-S)}{\min\{vol(S), vol(B-S)\}}$ . Now comes to an important notion, that is the conductance of a set $B$ , it is $\Phi^G_B := \min_{S\subset B} \Phi^G_B(S)$ .

In short, the conductance of a cut measures the connectivity of a cut, while the conductance of a set $B$ gives a lower bound of conductance for all possible cuts that divides the set $B$ . I’d love to thank Adrian Vette and Vahab Mirrokni who made this point clear to me in a different project.

Ideal Graph Decomposition

Theorem 7.1 in this paper actually gives a good graph decomposition. It is a constructive proof but relies on an oracle to exact sparsest cut solver, which is NP-hard. It gives some insight to the final solution so I’d love to present its proof here.

Formally, Theorem 7.1 says says that for any graph $G$ , one can decompose it into clusters $A_1,\dots,A_k$ with at most $|E|/2$ inter-cluster edges and each cluster has conductance $\Omega(1/\log n)$ . This theorem relies on an important lemma:

Lemma 7.2. Given graph $G$ with parameter $\phi\leq 1$ , and given a subset $B\subseteq V$ , let $S\subset B$ be a set maximizing $vol(S)$ among those satisfying: (C.1) $vol(S)\leq vol(B)/2$ and (C.2) $\Phi_B^G(S)\leq \phi$ . Then, if $vol(S)=\alpha \cdot vol(B)$ for $\alpha \leq 1/3$ , then $\Phi^G_{B-S} \geq \phi \cdot \frac{1-3\alpha}{1-\alpha}$ .

An interpretation of this Lemma 7.2 is that, if one finds the maximum set $S$ with the conductance upper bound $\phi$ , then the other side of the cut $B-S$ will not only have a good cut conductance $\leq \phi$ , but also have a good set conductance $\geq \phi \cdot \frac{1-3\alpha}{1-\alpha}$ .

The main body of [ST08b] will actually use Lemma 7.2 for the case when $\alpha$ is a very small constant, and the proof of Lemma 7.2 is just by contradiction: suppose $B-S$ does not have a large set conductance, then one can find a cut $R\subset B-S$ with small conductance, and one can show that $R\cup S$ also satisfies $\Phi_{B}^G(R\cup S) < \phi$ and contradicting the maximality of $S$ .

Now comes to the final (but ideal) algorithm to decompose $G$ by calling procedure idealDecomp( $G,\phi$ ). Procedure idealDecomp( $B, \phi$ ) first checks if $\Phi_B^G \geq \phi$ , if so then return $B$ . Otherwise, let $S\subset B$ be the maximum subset satisfying Lemma 7.2. If $vol(S)> vol(B)/4$ then recurse on both sides by calling idealDecomp( $B-S,\phi$ ) and idealDecomp( $S,\phi$ ). If $vol(S)\leq vol(B)/4$ , then recurse only on the smaller side, because the larger side is guaranteed by Lemma 7.2 to have a large conductance $\Omega(\phi)$ .

This algorithm has $O(\log n)$ number of layers. On each layer, it will add $|E|\phi$ number of edges to the cut. Since we will set $\phi = \Theta(1/\log n)$ , the final number of edges on the cut is upper bounded by $|E|/2$ , so Theorem 7.1 is proved.

In my next post tomorrow, I am going to show the more exciting proof of constructing the same graph decomposition, in almost linear time using only approximate (but local) sparsest cut solvers from [ST08a].