## Spielman-Teng-B: Spectral Sparsification of Graphs (Part I)

Spielman-Teng’s 2004 paper [ST04] on almost-linear Laplacian linear solver has been divided and re-written into three very technically involved papers [ST08a], [ST08b] and [ST08c]. I am planning to write memos for all of them, while this post is the Part I for the second paper [ST08b: Spectral Sparsification of Graphs].

## Outline

The goal of [ST08b] is to construct a re-weighted subgraph of $G$ with $\tilde{O}(n/\varepsilon^2)$ edges that is a $(1+\varepsilon)$-spectral-approximation to $G$. This will be used later in [ST08c] to construct an ultra-sparsifier. It uses [ST08a] as a subroutine to construct decompositions.

A useful fact shown in Theorem 6.1 (which I am not going to present in details) is that for a graph with smallest non-zero normalized Laplacian eigenvalue $\lambda$, one can simply sample edges with probability roughly proportional to $1/\lambda^2$, so that the new graph has $O(\frac{\log ^2 n}{\varepsilon^2\lambda^2}) |E|$ number of (reweighted) edges, and $\varepsilon$-spectrally-approximate the old graph.

As a consequence, to obtain a good sparsifier, the high level idea is to decompose the graph $G$ into well-connected (large conductance) clusters $A_1,\dots,A_k$, such that the number of inter-cluster edges is $\leq |E|/2$. Then, we can do subsampling for each such cluster (using Theorem 6.1 above), and in the very end contract all clusters and recurse on a graph with $\leq |E|/2$ edges. Because there are only logarithmic number of levels, this gives a good sparsifier with almost linear number of edges, at least for unweighted graphs. The modifications needed for weighted graphs are not trivial, and elaborated in details in Section 10.2 of [ST08b], which I do not plan to cover.

BTW, graph decomposition is also called multiway partitioning in the old [ST04] paper.

## Conductance: Two Different Notions

Before going into details, there is an important comment to make about conductance in graph $G=(V,E)$. When talking about the conductance of a cut $(S,V-S)$, it is simply $\Phi^G(S) = \frac{Cut(S,V-S)}{\min\{vol(S), vol(V-S)\}}$. Similarly we can also define that of an induced subgraph $B\subseteq V$, which is $\Phi^G_B(S) = \frac{Cut(S,B-S)}{\min\{vol(S), vol(B-S)\}}$. Now comes to an important notion, that is the conductance of a set $B$, it is $\Phi^G_B := \min_{S\subset B} \Phi^G_B(S)$.

In short, the conductance of a cut measures the connectivity of a cut, while the conductance of a set $B$ gives a lower bound of conductance for all possible cuts that divides the set $B$. I’d love to thank Adrian Vette and Vahab Mirrokni who made this point clear to me in a different project.

## Ideal Graph Decomposition

Theorem 7.1 in this paper actually gives a good graph decomposition. It is a constructive proof but relies on an oracle to exact sparsest cut solver, which is NP-hard. It gives some insight to the final solution so I’d love to present its proof here.

Formally, Theorem 7.1 says says that for any graph $G$, one can decompose it into clusters $A_1,\dots,A_k$ with at most $|E|/2$ inter-cluster edges and each cluster has conductance $\Omega(1/\log n)$. This theorem relies on an important lemma:

Lemma 7.2. Given graph $G$ with parameter $\phi\leq 1$, and given a subset $B\subseteq V$, let $S\subset B$ be a set maximizing $vol(S)$ among those satisfying: (C.1) $vol(S)\leq vol(B)/2$ and (C.2) $\Phi_B^G(S)\leq \phi$. Then, if $vol(S)=\alpha \cdot vol(B)$ for $\alpha \leq 1/3$, then $\Phi^G_{B-S} \geq \phi \cdot \frac{1-3\alpha}{1-\alpha}$.

An interpretation of this Lemma 7.2 is that, if one finds the maximum set $S$ with the conductance upper bound $\phi$, then the other side of the cut $B-S$ will not only have a good cut conductance $\leq \phi$, but also have a good set conductance $\geq \phi \cdot \frac{1-3\alpha}{1-\alpha}$.

The main body of [ST08b] will actually use Lemma 7.2 for the case when $\alpha$ is a very small constant, and the proof of Lemma 7.2 is just by contradiction: suppose $B-S$ does not have a large set conductance, then one can find a cut $R\subset B-S$ with small conductance, and one can show that $R\cup S$ also satisfies $\Phi_{B}^G(R\cup S) < \phi$ and contradicting the maximality of $S$.

Now comes to the final (but ideal) algorithm to decompose $G$ by calling procedure idealDecomp($G,\phi$). Procedure idealDecomp($B, \phi$) first checks if $\Phi_B^G \geq \phi$, if so then return $B$. Otherwise, let $S\subset B$ be the maximum subset satisfying Lemma 7.2. If $vol(S)> vol(B)/4$ then recurse on both sides by calling idealDecomp($B-S,\phi$) and idealDecomp($S,\phi$). If $vol(S)\leq vol(B)/4$, then recurse only on the smaller side, because the larger side is guaranteed by Lemma 7.2 to have a large conductance $\Omega(\phi)$.

This algorithm has $O(\log n)$ number of layers. On each layer, it will add $|E|\phi$ number of edges to the cut. Since we will set $\phi = \Theta(1/\log n)$, the final number of edges on the cut is upper bounded by $|E|/2$, so Theorem 7.1 is proved.

In my next post tomorrow, I am going to show the more exciting proof of constructing the same graph decomposition, in almost linear time using only approximate (but local) sparsest cut solvers from [ST08a].