
An improvement to Bennett’s inequality for the Poisson distribution


If {\lambda>0}, a Poisson random variable {{\bf Poisson}(\lambda)} with mean {\lambda} is a random variable taking values in the natural numbers with probability distribution

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) = k) = e^{-\lambda} \frac{\lambda^k}{k!}.

One is often interested in bounding upper tail probabilities

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u))

for {u \geq 0}, or lower tail probabilities

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u))

for {-1 < u \leq 0}. A standard tool for this is Bennett’s inequality:

Proposition 1 (Bennett’s inequality) One has

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq \exp(-\lambda h(u))

for {u \geq 0} and

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u)) \leq \exp(-\lambda h(u))

for {-1 < u \leq 0}, where

\displaystyle  h(u) := (1+u) \log(1+u) - u.

From the Taylor expansion {h(u) = \frac{u^2}{2} + O(u^3)} for {u=O(1)} we conclude Gaussian-type tail bounds in the regime {u = o(1)} (and in particular when {u = O(1/\sqrt{\lambda})}), in the spirit of the Chernoff, Bernstein, and Hoeffding inequalities. But in the regime where {u} is large and positive one obtains a slight gain over these other classical bounds (of {\exp(- \lambda u \log u)} type, rather than {\exp(-\lambda u)}).
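
As a quick numerical sanity check (not part of the original argument), the following short Python sketch compares the exact upper tail probability of a Poisson variable with the Bennett bound {\exp(-\lambda h(u))}, both near the Gaussian regime {u = O(1/\sqrt{\lambda})} and for larger {u}; the values of {\lambda} and {u} are illustrative assumptions, and scipy.stats.poisson supplies the exact tail.

import numpy as np
from scipy.stats import poisson

# Bennett bound exp(-lambda*h(u)) with h(u) = (1+u)*log(1+u) - u.
def bennett_bound(lam, u):
    return np.exp(-lam * ((1 + u) * np.log1p(u) - u))

# Illustrative parameters (assumptions, not from the post).
lam = 100.0
for u in [0.1, 0.5, 1.0, 3.0]:   # u = 0.1 is roughly the Gaussian regime 1/sqrt(lam)
    threshold = np.ceil(lam * (1 + u))
    exact = poisson.sf(threshold - 1, lam)   # P(Poisson(lam) >= lam*(1+u))
    print(f"u={u}: exact tail={exact:.3e}, Bennett bound={bennett_bound(lam, u):.3e}")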

Proof: We use the exponential moment method. For any {t \geq 0}, we have from Markov’s inequality that

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq e^{-t \lambda(1+u)} {\bf E} \exp( t {\bf Poisson}(\lambda) ).

A standard computation shows that the moment generating function of the Poisson distribution is given by

\displaystyle  {\bf E} \exp( t {\bf Poisson}(\lambda) ) = \exp( (e^t - 1) \lambda )

and hence

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq \exp( (e^t - 1)\lambda - t \lambda(1+u) ).

For {u \geq 0}, it turns out that the right-hand side is optimized by setting {t = \log(1+u)}, in which case the right-hand side simplifies to {\exp(-\lambda h(u))}. This proves the first inequality; the second inequality is proven similarly (but now {u} and {t} are non-positive rather than non-negative). \Box
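
One can also confirm the optimization in this proof numerically: the following sketch (with illustrative values of {\lambda} and {u}) minimizes the exponent {(e^t - 1)\lambda - t \lambda(1+u)} over a grid of {t} and checks that the minimizer is {t = \log(1+u)} and that the minimum value equals {-\lambda h(u)}.

import numpy as np

# Illustrative parameters (assumptions).
lam, u = 50.0, 2.0

# Exponent from the exponential moment method, as a function of t >= 0.
def exponent(t):
    return (np.exp(t) - 1) * lam - t * lam * (1 + u)

ts = np.linspace(0.0, 5.0, 100001)
t_star = ts[np.argmin(exponent(ts))]
h = (1 + u) * np.log1p(u) - u

print(t_star, np.log1p(u))              # grid minimizer vs the claimed optimum log(1+u)
print(exponent(np.log1p(u)), -lam * h)  # value at the optimum vs -lambda*h(u)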

Remark 2 Bennett’s inequality also applies to (suitably normalized) sums of bounded independent random variables. In some cases there are direct comparison inequalities available to relate these variables to the Poisson case. For instance, suppose {S = X_1 + \dots + X_n} is the sum of independent Boolean variables {X_1,\dots,X_n \in \{0,1\}} of total mean {\sum_{j=1}^n {\bf E} X_j = \lambda} and with {\sup_i {\bf P}(X_i=1) \leq \varepsilon} for some {0 < \varepsilon < 1}. Then for any natural number {k}, we have

\displaystyle  {\bf P}(S=k) = \sum_{1 \leq i_1 < \dots < i_k \leq n} {\bf P}(X_{i_1}=1) \dots {\bf P}(X_{i_k}=1) \prod_{i \neq i_1,\dots,i_k} {\bf P}(X_i=0)

\displaystyle  \leq \frac{1}{k!} (\sum_{i=1}^n \frac{{\bf P}(X_i=1)}{{\bf P}(X_i=0)})^k \times \prod_{i=1}^n {\bf P}(X_i=0)

\displaystyle  \leq \frac{1}{k!} (\frac{\lambda}{1-\varepsilon})^k \prod_{i=1}^n \exp( - {\bf P}(X_i = 1))

\displaystyle  \leq e^{-\lambda} \frac{\lambda^k}{(1-\varepsilon)^k k!}

\displaystyle  \leq e^{\frac{\varepsilon}{1-\varepsilon} \lambda} {\bf P}( \mathbf{Poisson}(\frac{\lambda}{1-\varepsilon}) = k).

As such, for {\varepsilon} small, one can efficiently control the tail probabilities of {S} in terms of the tail probabilities of a Poisson random variable of mean close to {\lambda}; this is of course very closely related to the well-known fact that the Poisson distribution emerges as the limit of sums of many independent Boolean variables, each of which is non-zero with small probability. See this paper of Bentkus and this paper of Pinelis for some further useful (and less obvious) comparison inequalities of this type.
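
To illustrate the comparison in Remark 2 concretely, here is a small sketch (the probabilities {p_i} below are arbitrary illustrative choices) that computes the distribution of {S} exactly by convolving the Bernoulli probability mass functions and checks it against the bound {e^{\frac{\varepsilon}{1-\varepsilon} \lambda} {\bf P}( \mathbf{Poisson}(\frac{\lambda}{1-\varepsilon}) = k)}.

import numpy as np
from scipy.stats import poisson

# Illustrative Bernoulli probabilities P(X_i = 1) (assumptions).
p = np.array([0.05, 0.10, 0.02, 0.08, 0.07, 0.03, 0.09, 0.06])
lam = p.sum()   # total mean lambda
eps = p.max()   # sup_i P(X_i = 1)

# Exact distribution of S = X_1 + ... + X_n by convolving the Bernoulli pmfs.
pmf = np.array([1.0])
for pi in p:
    pmf = np.convolve(pmf, [1.0 - pi, pi])

k = np.arange(len(pmf))
bound = np.exp(eps / (1 - eps) * lam) * poisson.pmf(k, lam / (1 - eps))
print(np.all(pmf <= bound))    # the comparison inequality should hold for every k
print(np.max(pmf / bound))     # how much slack there is in this example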

In this note I wanted to record the observation that one can improve the Bennett bound by a small polynomial factor once one leaves the Gaussian regime {u = O(1/\sqrt{\lambda})}, in particular gaining a factor of {1/\sqrt{\lambda}} when {u \sim 1}. This observation is not difficult and is implicitly in the literature (one can extract it for instance from the much more general results of this paper of Talagrand, and the basic idea already appears in this paper of Glynn), but I was not able to find a clean version of this statement in the literature, so I am placing it here on my blog. (But if a reader knows of a reference that essentially contains the bound below, I would be happy to learn of it.)

Proposition 3 (Improved Bennett’s inequality) One has

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda \min(u, u^2)}}

for {u \geq 0} and

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u)) \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda u^2 (1+u)}}

for {-1 < u \leq 0}.
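
Before turning to the proof, one can check Proposition 3 numerically; the following sketch (with illustrative parameters, and with the implied constants simply left visible as ratios) compares both tails of a Poisson variable against the original Bennett bound and the improved bound.

import numpy as np
from scipy.stats import poisson

def h(u):
    return (1 + u) * np.log1p(u) - u

# Illustrative parameters (assumptions); the printed ratios display the implied constants.
lam = 200.0
for u in [0.5, 1.0, 2.0]:                        # upper tail, outside the Gaussian regime
    exact = poisson.sf(np.ceil(lam * (1 + u)) - 1, lam)
    bennett = np.exp(-lam * h(u))
    improved = bennett / np.sqrt(1 + lam * min(u, u * u))
    print(f"u={u}: exact/Bennett={exact / bennett:.3f}, exact/improved={exact / improved:.3f}")

for u in [-0.3, -0.6, -0.9]:                     # lower tail
    exact = poisson.cdf(np.floor(lam * (1 + u)), lam)
    bennett = np.exp(-lam * h(u))
    improved = bennett / np.sqrt(1 + lam * u * u * (1 + u))
    print(f"u={u}: exact/Bennett={exact / bennett:.3f}, exact/improved={exact / improved:.3f}")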

Proof: We begin with the first inequality. We may assume that {u \geq 1/\sqrt{\lambda}}, since otherwise the claim follows from the usual Bennett inequality. We expand out the left-hand side as

\displaystyle  e^{-\lambda} \sum_{k \geq \lambda(1+u)} \frac{\lambda^k}{k!}.

Observe for {k \geq \lambda(1+u)} that

\displaystyle  \frac{\lambda^{k+1}}{(k+1)!} \leq \frac{1}{1+u} \frac{\lambda^{k}}{k!} .

Thus the sum is dominated by the first term times a geometric series {\sum_{j=0}^\infty \frac{1}{(1+u)^j} = 1 + \frac{1}{u}}. We can thus bound the left-hand side by

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \sup_{k \geq \lambda(1+u)} \frac{\lambda^k}{k!}.

By the Stirling approximation, this is

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \sup_{k \geq \lambda(1+u)} \frac{1}{\sqrt{k}} \frac{(e\lambda)^k}{k^k}.

The expression inside the supremum is decreasing in {k} for {k > \lambda}, so we may bound it by

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \frac{1}{\sqrt{\lambda(1+u)}} \frac{(e\lambda)^{\lambda(1+u)}}{(\lambda(1+u))^{\lambda(1+u)}},

which simplifies to

\displaystyle  \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda \min(u, u^2)}}

after a routine calculation.
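
The key step above, that the tail sum is controlled by its first term times the geometric factor {1 + \frac{1}{u}}, can be checked numerically as in the following sketch (the values of {\lambda} and {u} are illustrative assumptions).

from math import ceil, exp, fsum, lgamma, log

# Illustrative parameters (assumptions).
lam, u = 80.0, 1.5
k0 = ceil(lam * (1 + u))

def term(k):
    # lambda^k / k!, computed via logarithms for numerical stability
    return exp(k * log(lam) - lgamma(k + 1))

tail = fsum(term(k) for k in range(k0, k0 + 2000))   # truncated tail sum; terms decay geometrically
print(tail <= (1 + 1 / u) * term(k0))                # domination by the first term times 1 + 1/u
print(tail, (1 + 1 / u) * term(k0))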

Now we turn to the second inequality. As before we may assume that {u \leq -1/\sqrt{\lambda}}. We first dispose of a degenerate case in which {\lambda(1+u) < 1}. Here the left-hand side is just

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) = 0 ) = e^{-\lambda}

and the right-hand side is comparable to

\displaystyle  e^{-\lambda} \exp( - \lambda (1+u) \log (1+u) + \lambda(1+u) ) / \sqrt{\lambda(1+u)}.

Since {-\lambda(1+u) \log(1+u)} is nonnegative and {0 < \lambda(1+u) < 1}, we see that the right-hand side is {\gg e^{-\lambda}}, and the estimate holds in this case.

It remains to consider the regime where {u \leq -1/\sqrt{\lambda}} and {\lambda(1+u) \geq 1}. The left-hand side expands as

\displaystyle  e^{-\lambda} \sum_{k \leq \lambda(1+u)} \frac{\lambda^k}{k!}.

The sum is dominated by the first term times a geometric series {\sum_{j=-\infty}^0 \frac{1}{(1+u)^j} = \frac{1}{|u|}}. The maximal {k} is comparable to {\lambda(1+u)}, so we can bound the left-hand side by

\displaystyle  \ll e^{-\lambda} \frac{1}{|u|} \sup_{\lambda(1+u) \ll k \leq \lambda(1+u)} \frac{\lambda^k}{k!}.

Using the Stirling approximation as before we can bound this by

\displaystyle  \ll e^{-\lambda} \frac{1}{|u|} \frac{1}{\sqrt{\lambda(1+u)}} \frac{(e\lambda)^{\lambda(1+u)}}{(\lambda(1+u))^{\lambda(1+u)}},

which simplifies to

\displaystyle  \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda u^2 (1+u)}}

after a routine calculation. \Box
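
The analogous domination step for the lower tail, namely that the sum is at most {\frac{1}{|u|}} times its top term, can be checked the same way (again with illustrative parameters).

from math import exp, floor, fsum, lgamma, log

# Illustrative parameters (assumptions).
lam, u = 80.0, -0.5
k0 = floor(lam * (1 + u))

def term(k):
    # lambda^k / k!, computed via logarithms for numerical stability
    return exp(k * log(lam) - lgamma(k + 1))

lower_sum = fsum(term(k) for k in range(0, k0 + 1))
print(lower_sum <= term(k0) / abs(u))   # domination by the top term times 1/|u|
print(lower_sum, term(k0) / abs(u))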

The same analysis can be reversed to show that the bounds given above are essentially sharp up to constants, at least when {\lambda} (and {\lambda(1+u)}) are large.
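
For instance, fixing {u \sim 1} and letting {\lambda} grow, the following sketch (with illustrative parameters) tracks the ratio of the exact upper tail to the improved bound alongside its ratio to the plain Bennett bound.

import numpy as np
from scipy.stats import poisson

u = 1.0   # fixed u ~ 1; the lambda values are illustrative assumptions
for lam in [10.0, 100.0, 1000.0]:
    h = (1 + u) * np.log1p(u) - u
    exact = poisson.sf(np.ceil(lam * (1 + u)) - 1, lam)
    bennett = np.exp(-lam * h)
    improved = bennett / np.sqrt(1 + lam * min(u, u * u))
    print(f"lam={lam}: exact/Bennett={exact / bennett:.4f}, exact/improved={exact / improved:.4f}")

Consistent with the sharpness claim, the first ratio should shrink roughly like {1/\sqrt{\lambda}}, while the second should stay bounded above and below by constants.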
