Faster Comp-Geometry via Optimization

Would you believe that, when computing the minimum enclosing ball (MinEB) of a set of points, the running time can be improved by a significant factor if we first randomly rotate the space? This seems very counter-intuitive, because a ball is still a ball after rotation, but my coauthors and I proved that it is TRUE!

More specifically, we connect MinEB (and another problem, MaxIB, defined below) to optimization, and develop much faster algorithms based on stochastic gradient ideas originally used for SVM or Lasso in machine learning. One ingredient of this method is the aforementioned random rotation. This is another big surprise to me, because we can now declare another victory for optimization geeks over classical (e.g., geometry-based) methods.

Problem Definitions and Prior Approaches

• In MaxIB, we are given a $d$-dimensional polyhedron defined by $m$ halfspaces, each characterized by a linear constraint $\langle A_j, x \rangle + b_j \geq 0$ for $j\in [m]$. We want to find a $(1-\varepsilon)$-approximate maximum inscribed ball.
• In MinEB, we are given a set $\{a_1,a_2,\dots,a_n\} \subseteq \mathbb{R}^{d}$ of points, and want to find a $(1+\varepsilon)$-approximate minimum enclosing ball.
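To make the two objectives concrete, here is a small sketch (my own illustration, with hypothetical helper names) evaluating both quantities: the enclosing radius of a candidate MinEB center, and the radius of the largest ball centered at a candidate interior point of a MaxIB polyhedron.

```python
import numpy as np

def encl_radius(points, c):
    """MinEB objective: radius of the smallest ball centered at c
    that encloses all of the given points."""
    return np.max(np.linalg.norm(points - c, axis=1))

def insc_radius(A, b, x):
    """MaxIB objective: radius of the largest ball centered at x inside
    {z : <A_j, z> + b_j >= 0 for all j}.  The distance from x to the
    hyperplane <A_j, z> + b_j = 0 is (<A_j, x> + b_j) / ||A_j||."""
    return np.min((A @ x + b) / np.linalg.norm(A, axis=1))
```

For instance, with the four points $(\pm 1, 0), (0, \pm 1)$ and the unit box $[-1,1]^2$, both quantities evaluated at the origin equal $1$.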

Of course, interior-point methods can solve such problems, but here we are interested in iterative algorithms whose running time is polynomial in $1/\varepsilon$ and linear in the input size $md$ or $nd$, because such algorithms are suitable for big-data scenarios.

For MaxIB, Xie, Snoeyink, and Xu obtained an algorithm with running time $\tilde{O}(m d \alpha^3 / \varepsilon^3)$, where $\alpha$ is the aspect ratio of the polyhedron. Their algorithm is based on many geometric observations; instead of describing what their algorithm really is, I’ll just paste below Figure 4 from their paper:

For MinEB, the most popular algorithm is from core-set, and gives a running time $O(n d / \varepsilon)$. Again, without defining what it is, let me paste below a geometric illustration of coreset from a Duke university website

Notice that for MinEB, a recent result gives an optimization-based method. While its authors did not make their running time explicit, we derived in our paper that it is in fact $\tilde{O}(n d / \sqrt{\varepsilon})$, which is already faster than the best known geometry-based algorithm using core-sets.

Our Results

In our paper “Optimization Algorithms for Faster Computational Geometry”, we improved

• MaxIB from $\tilde{O}(m d \alpha^3 / \varepsilon^3)$ to $\tilde{O}(md + m \sqrt{d} \alpha / \varepsilon)$, a speed-up of up to $\tilde{O}(\sqrt{d} \alpha^2 / \varepsilon^2)$, and
• MinEB from $\tilde{O}(n d / \sqrt{\varepsilon})$ to $\tilde{O}(nd + n \sqrt{d} / \sqrt{\varepsilon})$, a speed-up of up to $\tilde{O}(\sqrt{d})$.

Our improvements should be considered significant no matter which area you come from:

• In theoretical computer science, one usually views $\alpha$ and $\varepsilon$ as constants, so our improvement can be seen as $\tilde{\Omega}(\sqrt{d})$.
• In operations research or statistics, one usually concentrates on the convergence rate, i.e., the dependence on $\varepsilon$. Our improvement on MaxIB is from $1/\varepsilon^3$ to $1/\varepsilon$.
• For practitioners, I guarantee that our method runs much, much faster than the known alternatives. I tested it quickly myself on random data, but I haven’t had time to conduct a full experiment. (Contact me if you’re willing to help out.)

Our Technique and An Optimization Method

Our crucial observation in the paper is that both problems, MinEB and MaxIB, can be reduced to the following convex-concave saddle point problem:

$\max_{x\in \mathbb{R} ^d} \min_{y\in \Delta_m} \frac1d y^T A x + \frac1d y^T b +\lambda H(y)-\frac{\gamma}{2} \| x\|_2^2 \enspace,$

Above, each row of the matrix $A$ corresponds either to the normal vector of a halfspace in MaxIB, or to a point in MinEB; $\Delta_m$ is the set of $m$-dimensional probability vectors, $\gamma, \lambda$ are two constants, and $H(\cdot)$ is the entropy function. We call this an $\ell_1$-$\ell_2$ saddle-point problem because it is strongly convex/concave with respect to the $\ell_2$ norm on the primal side (i.e., in $x$) and the $\ell_1$ norm on the dual side (i.e., in $y$).
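To be concrete about the notation, here is how one would evaluate this objective numerically; the entropy convention $H(y) = -\sum_j y_j \log y_j$ and the parameter names are my own choices, following the formula above:

```python
import numpy as np

def saddle_obj(A, b, x, y, lam, gamma):
    """Value of (1/d) y^T A x + (1/d) y^T b + lam*H(y) - (gamma/2)||x||^2,
    where H(y) = -sum_j y_j log y_j is the entropy of y in the simplex."""
    d = A.shape[1]
    H = -np.sum(y * np.log(y + 1e-300))  # guard against log(0)
    return (y @ (A @ x) + y @ b) / d + lam * H - 0.5 * gamma * np.dot(x, x)
```

For example, at $x = 0$, $b = 0$, and uniform $y$ over $m$ coordinates, the value reduces to $\lambda \log m$, the maximum of the entropy term.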

Experts in machine learning or optimization may immediately notice that such a saddle-point formulation is almost the same as that of ERM problems (such as SVM, Lasso, and logistic regression; see my survey here). Therefore, one should expect a variant of the state-of-the-art stochastic ERM methods to also help in computational geometry.

Indeed, we developed a stochastic saddle-point method that, instead of computing $Ax$ or $A^T y$ in each iteration, samples a random row of the matrix $A$. Then we accelerated our method, in the same spirit as Nesterov’s accelerated gradient descent. The full details are not simple for a non-expert in optimization, but all the ideas come purely from optimization, and we can essentially forget about the geometric background!
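To give a flavor of the non-accelerated version, here is a deliberately simplified sketch of my own (not the exact method from our paper): each step touches a single random row $A_j$, doing a gradient ascent step in $x$ with an unbiased estimate of $\frac1d A^T y$, and an entropic multiplicative update on coordinate $j$ of $y$, followed by renormalization onto the simplex. The step size $\eta$ and the convention $H(y) = -\sum_j y_j \log y_j$ are illustrative choices.

```python
import numpy as np

def stochastic_saddle_step(A, b, x, y, lam, gamma, eta, rng):
    """One simplified stochastic primal-dual step for
    max_x min_y (1/d) y^T A x + (1/d) y^T b + lam*H(y) - (gamma/2)||x||^2.
    Only one random row of A is touched, never a full matrix-vector product."""
    m, d = A.shape
    j = rng.integers(m)
    # Primal step: (m/d) * y_j * A_j is an unbiased estimate of (1/d) A^T y.
    x = x + eta * ((m / d) * y[j] * A[j] - gamma * x)
    # Dual step: partial derivative of the objective in y_j, using row j only.
    g_j = (A[j] @ x + b[j]) / d - lam * (np.log(y[j]) + 1.0)
    y = y.copy()
    y[j] *= np.exp(-eta * m * g_j)  # multiplicative (entropic) descent step
    return x, y / y.sum()           # renormalize back onto the simplex
```

Each step costs $O(d)$ for the row operations plus $O(m)$ for renormalization, as opposed to $O(md)$ for a full matrix-vector product; the acceleration machinery from the paper is omitted here.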

One Big Surprise

The final running time of our new stochastic saddle-point method depends, for natural optimization reasons, on the largest entries of the matrix $A$. In other words, if a vector in MinEB is of the form $(1,0,0,\dots)$, then our running time is poor; but if we randomly rotate the space, then with high probability every unit vector has coordinates of absolute value at most $\tilde{O}(1/\sqrt{d})$ in the standard basis, and our algorithm converges much faster in that scenario!
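This phenomenon is easy to see numerically. The sketch below (my own illustration, not code from the paper) applies a random rotation, obtained as the orthogonal factor in the QR decomposition of a Gaussian matrix, to a worst-case "spike" vector:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 400

# Worst case for our method: a unit vector concentrated on one coordinate.
v = np.zeros(d)
v[0] = 1.0

# Random rotation: orthogonal matrix from the QR decomposition of a Gaussian.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
w = Q @ v  # a (nearly Haar-)random unit vector

# The norm is preserved, but the largest coordinate shrinks dramatically.
print(np.abs(v).max())  # 1.0
print(np.abs(w).max())  # much smaller, roughly sqrt(2 log d / d)
```

After the rotation, the vector is still a unit vector (so the geometry of the problem is unchanged), yet no single coordinate dominates, which is exactly what the stochastic method needs.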

This is the key idea behind the factor-$\sqrt{d}$ improvement in our running times for both MaxIB and MinEB, and it answers the question I raised at the beginning of this post: if we randomly rotate the space, a properly designed algorithm (like ours) can indeed run faster!