Apr 11, 2023

# Concentration Bound

Concentration bound studies the probability that a random variable $$X$$ is bounded within a certain range (concentration). Namely, it studies the probability $$\mathrm{P}(X\le a)$$ and $$\mathrm{P}(X\ge a)$$ for some constant $$a$$.

## Markov's Inequality

The Markov's Inequality claims that for non-negative random variable $$X$$ with expectation $$\mathrm{E}(X)$$ and some $$a>0$$, the probability we seek is bounded by

$\mathrm{P}(X\ge a)\le\frac{\mathrm{E}(X)}{a}.$

The intuition is that to maximize $$\mathrm{P}(X\ge a)$$ for some non-negative $$X$$ with expectation $$\mathrm{E}(X)$$, the best approach is to make $$X=a$$ with probability $$\frac{\mathrm{E}(X)}{a}$$ and $$X=0$$ otherwise. This gives an expectation of $$\mathrm{E}(X)$$.

## Chebyshev's Inequality

When $$X$$ is not non-negative, a easy way to work around is to notice that $$X^2$$ is non-negative and

$\mathrm{P}(X^2\ge a^2)\le\frac{\mathrm{E}(X^2)}{a^2}.$

If we center $$X$$ with respect to its expectation $$\mathrm{E}(X)$$, we get the Chebyshev's Inequality

$\mathrm{P}\left(\left|X-\mathrm{E}(X)\right|\ge a\right)=\mathrm{P}\left(\left(X-\mathrm{E}(X)\right)^2\ge a^2\right)\le\frac{\mathrm{Var}(X)}{a^2}.$

## Chernoff Bound

Since exponential functions are always non-negative, a more advanced technique to derive a more precise bound is to substitute $$X\ge a$$ with $$e^{tX}\ge e^{ta}$$ for some $$t>0$$. This give the following bound:

For every $$t>0$$

$\mathrm{P}(X\ge a)=\mathrm{P}(e^{tX}\ge e^{ta})\le\frac{\mathrm{E}(e^{tX})}{e^{ta}}.$

In practice this means we can pick some $$t$$ such that this inequality gives the tightest bound.

### Practice Problem

Assume I keep tossing a coin until I see $$n$$ heads. Denote $$S_n$$ as the number of coin tosses I make. Upper bound

$\mathrm{P}(S_n-2n\ge l)$

for some $$l>0$$.

#### Solution

$\mathrm{P}(S_n-2n\ge l)\le\frac{\mathrm{E}(e^{tS_n})}{e^{t(l+2n)}}.$

Observe that

$\mathrm{E}(e^{tS_n})=\mathrm{E}(e^{tS_1})^n=\left(\sum_{i=1}^\infty 2^{-i}e^{ti}\right)^n=(\frac{e^t}{2-e^t})^n .$

Now plug the result back and let $$y(t)=\frac{\mathrm{E}(e^{tS_n})}{e^{t(l+2n)}}=e^{-t(l+n)}(2-e^t)^{-n}$$.

We want to find $$\min_t y(t)$$, so it makes sense to solve for $$y'(t)=0$$. We can do so in Mathematica:

Reduce[D[E^(-t (l + n)) (2 - E^t)^(-n), t] == 0 && l > 0 && n >= 1 && 0 <= t <= Log, t]

We obtain the result $$t=\log \left(\frac{2 (l+n)}{l+2 n}\right)$$. Plug it back and we get the result

$\mathrm{P}(S_n-1.5n\ge l)\le 2^{-l-n} \left(\frac{l+n}{l+2 n}\right)^{-l-n} \left(2-\frac{2 (l+n)}{l+2 n}\right)^{-n}.$