Epsilon-approximations

In this post, we review the technique of $\varepsilon$-approximation, a very useful tool in computational geometry for designing efficient approximation algorithms. Consider the following problem:

Problem 1 (Range Counting). Given a set $P$ of $n$ points, we want to preprocess $P$ into a structure so that for any query range $\gamma\in\Gamma$, where $\Gamma$ is some collection of ranges, we can efficiently count $|P\cap\gamma|$.

It turns out that Problem 1 is very difficult. Efficient (linear-space and polylogarithmic query time) data structures are only known when $\Gamma$ is a collection of orthogonal ranges. Even for the case of halfspaces, the current best near-linear space data structures require roughly $O(n^{1-1/d})$ query time. So instead, researchers turn to the approximate version of the problem.

Problem 2 (Approximate Range Counting). Given a set $P$ of $n$ points and an error parameter $\varepsilon$, we want to preprocess $P$ into a structure so that for any query range $\gamma\in\Gamma$, where $\Gamma$ is some collection of ranges, we can efficiently output an estimate of $|P\cap\gamma|$ with additive error at most $\varepsilon|P|$.

The natural idea for solving Problem 2 is to take a random sample. Intuitively, the sample size should depend on the error $\varepsilon$, the complexity of $\Gamma$, and the size of $P$. (Surprisingly, as we will see, this intuition is somewhat incorrect: the sample size we need does not depend on the set size $|P|$!) We will model the complexity of $\Gamma$ using the notion of VC-dimension. More specifically, we consider the set system $(P, \Gamma)$ in the problem. We now define the notion of $\varepsilon$-approximations for set systems, which formalizes what a ''good random sample'' means in our setting.

Definition 1 ($\varepsilon$-approximation). Given a set system $\mathscr{S} = (S, \mathscr{R})$, an $\varepsilon$-approximation for $\mathscr{S}$ is a set $A\subset S$ such that

\[ \Big|\frac{|\mathcal{R} \cap A|}{|A|} - \frac{|\mathcal{R}\cap S|}{|S|}\Big| \le \varepsilon, \]

for all $\mathcal{R}\in\mathscr{R}$.

Note that for the set system $(P, \Gamma)$ of a range counting problem, if $A$ is an $\varepsilon$-approximation, then to approximate $|\gamma\cap P|$ for any $\gamma\in\Gamma$ it suffices to count $|\gamma\cap A|$: the scaled estimate $\frac{|P|}{|A|}|\gamma\cap A|$ differs from $|\gamma\cap P|$ by at most $\varepsilon|P|$.
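
Concretely, here is a minimal Python sketch of this counting scheme (the function names and the sampling constant `c` are illustrative choices, not fixed by the discussion; the sample size follows Theorem 1 below):

```python
import math
import random

def build_approximation(P, eps, c=10):
    # Sample r = c * (1/eps^2) * ln n points uniformly, as in Theorem 1.
    r = min(len(P), int(c / eps**2 * math.log(len(P))))
    return random.sample(P, r)

def halfplane_count(points, a, b, t):
    # Count points (x, y) lying in the halfplane a*x + b*y <= t.
    return sum(1 for (x, y) in points if a * x + b * y <= t)

def approx_range_count(P, A, a, b, t):
    # Scale the count inside the sample by |P|/|A|; if A is an
    # eps-approximation for halfplanes, the additive error is <= eps * |P|.
    return len(P) / len(A) * halfplane_count(A, a, b, t)
```

For instance, with `A = build_approximation(P, 0.1)`, the estimate `approx_range_count(P, A, 1.0, -1.0, 0.0)` is within an additive $0.1|P|$ of the true count whenever $A$ is indeed a $0.1$-approximation.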

But how large does $A$ have to be? If we want efficient approximate range counting structures, $A$ should not be too large. We now bound the size of $A$ for set systems $\mathscr{S} = (S, \mathscr{R})$ with bounded VC-dimension. We will need the following Chernoff bound.

Lemma 1 (Chernoff Bound). Let $X=\sum_{i=1}^nX_i$, where the $X_i$ are independent indicator random variables with $\Pr[X_i=1]=p$ and $\Pr[X_i=0]=1-p$, and let $\mu=\mathbb{E}[X]$. Then for any $t>0$,

\[ \Pr[|X-\mu| \ge t] \le e^{-\Theta(t^2/\mu)}. \]
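
As a quick empirical sanity check (not part of the argument; the constant $1/3$ stands in for the hidden $\Theta$), one can compare the empirical deviation probability of a Bernoulli sum against this bound:

```python
import math
import random

n, p, trials = 2_000, 0.3, 500
mu = n * p
for t in (0.5 * math.sqrt(mu), math.sqrt(mu), 2 * math.sqrt(mu)):
    # Fraction of trials where the sum deviates from its mean by >= t.
    hits = sum(
        abs(sum(random.random() < p for _ in range(n)) - mu) >= t
        for _ in range(trials)
    )
    print(f"t={t:6.1f}  empirical={hits / trials:.3f}  "
          f"bound~exp(-t^2/(3mu))={math.exp(-t * t / (3 * mu)):.3f}")
```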

We first show a weaker bound. The main idea is to take a random sample of $S$.

Theorem 1. For any set system $\mathscr{S} = (S,\mathscr{R})$ with VC-dimension $c$ for some constant $c$, there is an $\varepsilon$-approximation $A$ for $\mathscr{S}$ of size $O\left( \frac{1}{\varepsilon^2}\ln n\right)$, where $n=|S|$.

Proof. We take a random sample $A=\{A_1,\cdots,A_r\}$ of $S$ of size $r = c'\frac{1}{\varepsilon^2}\ln n$ for a big enough constant $c'$. Consider any $\mathcal{R}\in\mathscr{R}$. Let $X_i$ be an indicator random variable such that

\[X_i=\begin{cases}1 & \textrm{if } A_i\in\mathcal{R}, \\ 0 & \textrm{otherwise.} \end{cases}\]

Note that $\Pr[X_i=1]=\frac{|\mathcal{R}\cap S|}{|S|}$, and so $\mu=\mathbb{E}[X]=\frac{|A||\mathcal{R}\cap S|}{|S|}$, where $X=\sum_{i=1}^rX_i=|\mathcal{R}\cap A|$. By Lemma 1 (with $t=|A|\varepsilon$),

\[ \begin{align} \Pr\left[\Big|\frac{|\mathcal{R}\cap A|}{|A|} - \frac{|\mathcal{R}\cap S|}{|S|}\Big|\ge\varepsilon \right] &= \Pr\left[\Big||\mathcal{R}\cap A| - \frac{|A||\mathcal{R}\cap S|}{|S|}\Big|\ge|A|\varepsilon \right] \\ &= \Pr\left[\big||\mathcal{R}\cap A| - \mu\big|\ge|A|\varepsilon \right] \\ &\le e^{-\Theta((|A|\varepsilon)^2/\mu)} = e^{-\Theta\left(\frac{|A|\varepsilon^2|S|}{|\mathcal{R}\cap S|}\right)} = e^{-\Theta\left(\frac{c'|S|\ln n}{|\mathcal{R}\cap S|}\right)} \\ &\le e^{-\Theta(c'\ln n)} \le n^{-2c}, \end{align} \]

where the second-to-last inequality uses $|\mathcal{R}\cap S|\le|S|$, and the last inequality follows by picking $c'$ to be a big enough constant.

Since $\mathscr{S}$ has VC-dimension $c$, the Sauer–Shelah lemma gives $|\mathscr{R}| = O(n^c)$ (counting ranges up to their intersection with $S$). Then by the union bound, with high probability (at least $1-n^{-c}$), $A$ is an $\varepsilon$-approximation for $\mathscr{S}$.

$\square$
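
To see Theorem 1 in action, here is a small self-contained Python experiment (illustrative only: the concrete set system, the constant replacing $c'$, and the function names are our choices for the demo). It samples $r = c'\varepsilon^{-2}\ln n$ points from a ground set on the line and measures the worst-case discrepancy over all intervals, a set system of VC-dimension 2:

```python
import bisect
import math
import random

def interval_discrepancy(S, A):
    # Max over intervals [lo, hi] of | |R cap A|/|A| - |R cap S|/|S| |.
    # It suffices to try intervals whose endpoints are ground-set points.
    S, A = sorted(S), sorted(A)
    worst = 0.0
    for i, lo in enumerate(S):
        for hi in S[i:]:
            in_S = bisect.bisect_right(S, hi) - bisect.bisect_left(S, lo)
            in_A = bisect.bisect_right(A, hi) - bisect.bisect_left(A, lo)
            worst = max(worst, abs(in_A / len(A) - in_S / len(S)))
    return worst

n, eps = 500, 0.2
S = [random.random() for _ in range(n)]
r = int(1.0 / eps**2 * math.log(n))   # Theorem 1's r = c' eps^-2 ln n
A = random.sample(S, min(n, r))
print(interval_discrepancy(S, A))     # typically well below eps
```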

We now show that we can improve the bound in Theorem 1 to $O\left( \frac{1}{\varepsilon^2}\ln \frac{1}{\varepsilon}\right)$, which is independent of $n$. To prove this, we need the following lemma.

Lemma 2. Let $A_1$ be an $\varepsilon$-approximation of a set system $\mathscr{S}$ and let $A_2$ be an $\varepsilon'$-approximation of $\mathscr{A}_1=(A_1, \{\mathcal{R}\cap A_1:\mathcal{R}\in\mathscr{R}\})$, the induced set system of $\mathscr{S}$ on $A_1$. Then $A_2$ is an $(\varepsilon+\varepsilon')$-approximation of $\mathscr{S}$.
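
For completeness, Lemma 2 follows from one application of the triangle inequality (using $A_2\subseteq A_1$, so $\mathcal{R}\cap A_2=(\mathcal{R}\cap A_1)\cap A_2$): for every $\mathcal{R}\in\mathscr{R}$,

\[ \Big|\frac{|\mathcal{R}\cap A_2|}{|A_2|} - \frac{|\mathcal{R}\cap S|}{|S|}\Big| \le \Big|\frac{|(\mathcal{R}\cap A_1)\cap A_2|}{|A_2|} - \frac{|\mathcal{R}\cap A_1|}{|A_1|}\Big| + \Big|\frac{|\mathcal{R}\cap A_1|}{|A_1|} - \frac{|\mathcal{R}\cap S|}{|S|}\Big| \le \varepsilon' + \varepsilon. \]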

Assuming Lemma 2, we now prove the following bound for $\varepsilon$-approximations.

Theorem 2. For any set system $\mathscr{S} = (S,\mathscr{R})$ with VC-dimension $c$ for some constant $c$, there is an $\varepsilon$-approximation $A$ for $\mathscr{S}$ of size $O\left( \frac{1}{\varepsilon^2}\log\frac{1}{\varepsilon} \right)$.

Proof. The main idea behind Theorem 2 is to repeatedly take samples of samples, with increasing error. Let $A_0=S$. For $i=1,2,\cdots,t$, where $t$ is a parameter we set later, we take an $\varepsilon_i$-approximation $A_i$ of $A_{i-1}$ with $\varepsilon_i = \sqrt{\frac{\ln (|A_{i-1}|/2)}{|A_{i-1}|/2}}$.

By Theorem 1, we can take each $A_i$ of size

\[|A_i|=O\left(\frac{1}{\varepsilon_i^2}\ln |A_{i-1}|\right) = O\left(\frac{|A_{i-1}|/2}{\ln (|A_{i-1}|/2)}\cdot\ln |A_{i-1}|\right) \le \frac{|A_{i-1}|}{2},\]

where the last step holds after adjusting the constant in $\varepsilon_i$.

Thus $|A_t|=O(n/2^t)$. Now we bound the error.

By Lemma 2 (applied repeatedly), the total error $\varepsilon_{\textrm{total}}$ we accumulate satisfies

\[\varepsilon_{\textrm{total}} = \sum_{i=1}^t\varepsilon_i = O\left(\sum_{i=1}^t\sqrt{\frac{\ln (n/2^i)}{n/2^i}}\right) = O\left(\sqrt{\frac{\ln (n/2^t)}{n/2^t}}\right),\]

since the terms increase geometrically, so the sum is dominated by its last term.

Setting $\sqrt{\frac{\ln (n/2^t)}{n/2^t}}=\Theta(\varepsilon)$ gives $n/2^t = \Theta\left(\varepsilon^{-2}\ln\varepsilon^{-1}\right)$, i.e.,

\[t=\log\frac{n}{\varepsilon^{-2}\ln\varepsilon^{-1}}+O(1).\]

Then $|A_t|=O(n/2^t)=O\left(\frac{1}{\varepsilon^2}\ln\frac{1}{\varepsilon}\right)$, as claimed.

$\square$
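
The construction in the proof is easy to sketch in code. In the Python sketch below, uniform random sampling stands in for each round's $\varepsilon_i$-approximation (per Theorem 1), and the stopping rule implements the error budget from Lemma 2; the function name and constants are illustrative:

```python
import math
import random

def iterated_approximation(S, eps):
    # Halve the current set via an eps_i-approximation each round, and stop
    # once the accumulated error budget sum(eps_i) would exceed eps.
    A, total_error = list(S), 0.0
    while True:
        m = len(A) // 2
        if m < 4:
            break
        eps_i = math.sqrt(math.log(m) / m)   # this round's error, as in the proof
        if total_error + eps_i > eps:        # Lemma 2: the errors add up
            break
        A = random.sample(A, m)              # stand-in for an eps_i-approximation
        total_error += eps_i
    return A

# The loop exits once |A| = Theta(eps^-2 * ln(1/eps)), matching Theorem 2.
print(len(iterated_approximation(range(1 << 20), 0.1)))
```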

Remark 1. The independence between the $\varepsilon$-approximation size and the ground set size is very interesting: we only need a very small sample even when the ground set is very large, yet the sample remains accurate for approximate range counting purposes.

Remark 2. The size bound of $\varepsilon$-approximations can be improved for specific set systems. Specifically, for orthogonal ranges the size can be improved to $\tilde O(\frac{1}{\varepsilon})$, and for halfspaces in $\mathbb{R}^d$ to $\tilde O(\varepsilon^{-2d/(d+1)})$ [MV'17].


References:

[MV'17] Nabil H. Mustafa and Kasturi Varadarajan. "Epsilon-approximations and epsilon-nets." Handbook of Discrete and Computational Geometry, Chapman and Hall/CRC, 2017.
