Continuous distribution entropy

One way to generalize the concept of entropy to continuous distributions is to introduce uncertainty when measuring a sample point. Without such measurement noise, a non-degenerate continuous distribution has infinite entropy: a sample point is an arbitrary real number, and specifying one exactly would take infinitely many bits.

To do so, a family of noise distributions is introduced, where \(\eta_x\) is the noise distribution associated with measuring the real number \(x\): when the true value is \(x\), the observed value is distributed according to \(\eta_x\).

It is then straightforward to obtain a formula for the entropy of a continuous distribution \(f\) by making use of the KL divergence \(D_{\text{KL}}\):

$$H_\eta(f) = \int_{-\infty}^{\infty} f(x)\, D_{\text{KL}}(\eta_x \,\|\, f)\,\mathrm{d}x$$

This can be expanded to:

$$H_\eta(f) = \int_{-\infty}^{\infty} f(x) \int_{-\infty}^{\infty} \eta_x(y)\log\!\left(\frac{\eta_x(y)}{f(y)}\right)\mathrm{d}y\,\mathrm{d}x$$

By abuse of notation, when \(\eta\) is a single distribution, \(H_\eta\) is understood to use the family defined by \(\eta_x(y)=\eta(y-x)\).
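As a concreteness check, here is a minimal numerical sketch in Python (the Gaussian test case and the values of `sigma2` and `eps` are illustrative assumptions, not taken from the text above). It evaluates the double integral for \(f=\mathcal{N}(0,\sigma^2)\) with shift-invariant Gaussian noise \(\eta=\mathcal{N}(0,\epsilon)\), and compares the result against the closed form \(\tfrac{1}{2}\log(\sigma^2/\epsilon)+\epsilon/(2\sigma^2)\) that this particular case reduces to (obtained by expanding the Gaussian logarithms):

```python
import numpy as np
from scipy import integrate

# Illustrative test case (not from the article): f = N(0, sigma2),
# noise family eta_x(y) = eta(y - x) with eta = N(0, eps).
sigma2, eps = 1.0, 0.1

def f(x):
    return np.exp(-x**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

def eta_x(x, y):
    # Shifted noise density: the observed value y given true value x.
    return np.exp(-(y - x)**2 / (2 * eps)) / np.sqrt(2 * np.pi * eps)

def integrand(y, x):
    # dblquad expects func(y, x): the inner integration variable comes first.
    # H_eta(f) = integral dx f(x) integral dy eta_x(y) log(eta_x(y) / f(y))
    return f(x) * eta_x(x, y) * np.log(eta_x(x, y) / f(y))

H_num, _ = integrate.dblquad(integrand, -8, 8,
                             lambda x: x - 4, lambda x: x + 4)

# Closed form for this Gaussian/Gaussian case; note it is finite,
# unlike the noiseless entropy of a continuous distribution.
H_exact = 0.5 * np.log(sigma2 / eps) + eps / (2 * sigma2)
print(H_num, H_exact)  # both are approximately 1.2013
```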

The formula for differential entropy \(h\) can be recovered by taking suitable limits of \(H\):

$$h(f) \;=\; \lim_{\epsilon \rightarrow 0}\left(H_{\mathcal{N}_\epsilon}(f) + h(\mathcal{N}_\epsilon) \right)\;=\;\lim_{\epsilon \rightarrow \infty}\left(\tfrac{1}{2}\log(2\pi e\epsilon)-H_{f}(\mathcal{N}_\epsilon)\right)$$

where \(\mathcal{N}_\epsilon\) is the normal distribution with mean \(0\) and variance \(\epsilon\), whose differential entropy is \(h(\mathcal{N}_\epsilon)=\tfrac{1}{2}\log(2\pi e\epsilon)\). (The second limit assumes \(f\) has a finite second moment.)
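To illustrate the first limit numerically, here is a short sketch reusing the Gaussian test case from above (again an illustrative assumption; the closed-form expression for \(H_{\mathcal{N}_\epsilon}(f)\) is specific to that case). As \(\epsilon \rightarrow 0\), the combination \(H_{\mathcal{N}_\epsilon}(f)+h(\mathcal{N}_\epsilon)\) settles at \(h(f)=\tfrac{1}{2}\log(2\pi e\sigma^2)\) even though \(H_{\mathcal{N}_\epsilon}(f)\) itself diverges:

```python
import numpy as np

# Gaussian test case from the sketch above: f = N(0, sigma2), noise N_eps.
# In that case, H_{N_eps}(f) = (1/2) log(sigma2/eps) + eps / (2 sigma2).
sigma2 = 1.0
h_f = 0.5 * np.log(2 * np.pi * np.e * sigma2)  # differential entropy of f

for eps in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    H = 0.5 * np.log(sigma2 / eps) + eps / (2 * sigma2)
    h_noise = 0.5 * np.log(2 * np.pi * np.e * eps)  # h(N_eps)
    print(f"eps={eps:g}: H + h(N_eps) = {H + h_noise:.5f}")

# The printed values converge to h_f (about 1.41894) as eps -> 0, while H
# alone grows without bound, matching the infinite-entropy remark above.
print(f"h(f) = {h_f:.5f}")
```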
