Continuous distribution entropy
One way to generalize the concept of entropy to continuous distributions is to introduce uncertainty when measuring a sample point. Without such measurement noise, a non-degenerate continuous distribution has infinite entropy, since specifying a sample point exactly (an arbitrary real number) would require infinitely many bits.
To do so, a family of noise distributions is introduced, where \(\eta_x\) is the noise distribution associated with measuring the real number \(x\), so that the actual observed value is distributed according to \(\eta_x\).
It is then straightforward to obtain a formula for the entropy of a continuous distribution \(f\), by making use of the KL-divergence \(D_{\text{KL}}\):
$$H_\eta(f) = \int_{-\infty}^{\infty} \!\!\!\!\!{\small\text{d}x}\; f(x)\, D_{\text{KL}}(\eta_x\,||\,f)$$
This can be expanded to:
$$H_\eta(f) = \int_{-\infty}^{\infty} \!\!\!\!\!{\small\text{d}x}\; f(x) \int_{-\infty}^{\infty} \!\!\!\!\!{\small\text{d}y}\;\, \eta_x(y)\log\left(\frac{\eta_x(y)}{f(y)} \right)$$
By abuse of notation, when \(\eta\) is a single distribution, \(H_\eta\) is understood to use the family defined by \(\eta_x(y)=\eta(y-x)\).
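As a sanity check, the double integral can be evaluated numerically. The sketch below (Python with NumPy/SciPy) uses one illustrative choice, \(f=\mathcal{N}(0,\sigma^2)\) measured with Gaussian noise \(\eta_x=\mathcal{N}(x,\epsilon)\); the values of \(\sigma\), \(\epsilon\), and the grid sizes are arbitrary. For this Gaussian-on-Gaussian case the inner KL divergence has a standard closed form, and the outer average works out to \(\log(\sigma/\sqrt{\epsilon})+\epsilon/(2\sigma^2)\), which the numerical value can be compared against.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import trapezoid

# Numerical sketch of H_eta(f) via the double integral above, for one
# illustrative case: f = N(0, sigma^2) measured with noise eta_x = N(x, eps).
sigma, eps = 1.0, 0.01
s = np.sqrt(eps)

def f(y):
    return norm.pdf(y, loc=0.0, scale=sigma)       # the distribution being measured

def kl_eta_x_vs_f(x):
    # Inner integral: D_KL(eta_x || f), on a grid of y where eta_x has its mass.
    y = np.linspace(x - 8 * s, x + 8 * s, 2001)
    eta = norm.pdf(y, loc=x, scale=s)               # eta_x(y) = eta(y - x)
    return trapezoid(eta * np.log(eta / f(y)), y)

# Outer integral: average the KL divergence over x ~ f.
x = np.linspace(-8 * sigma, 8 * sigma, 801)
H = trapezoid(f(x) * np.array([kl_eta_x_vs_f(xi) for xi in x]), x)

# Closed form for the Gaussian-on-Gaussian case: log(sigma/sqrt(eps)) + eps/(2 sigma^2).
print(H, np.log(sigma / s) + eps / (2 * sigma**2))  # both approximately 2.3076
```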
The formula for differential entropy \(h\) can be recovered by taking various limits of \(H\):
$$h(f) \;=\; \lim_{\epsilon \rightarrow 0}\left(H_{\mathcal{N}_\epsilon}(f)+h(\mathcal{N}_\epsilon) \right)\;=\;\lim_{\epsilon \rightarrow \infty}\left(\frac{1}{2}\log(2\pi e\epsilon)-H_{f}(\mathcal{N}_\epsilon)\right)$$
where \(\mathcal{N}_\epsilon\) is the normal distribution with mean \(0\) and variance \(\epsilon\), whose differential entropy is \(h(\mathcal{N}_\epsilon)=\tfrac{1}{2}\log(2\pi e\epsilon)\). In the first limit, \(H_{\mathcal{N}_\epsilon}(f)\) itself diverges as the measurement noise shrinks (roughly like \(-h(\mathcal{N}_\epsilon)\)), and adding back \(h(\mathcal{N}_\epsilon)\) cancels that divergence, leaving the differential entropy.
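The first limit can also be checked numerically. The sketch below (Python with NumPy/SciPy) uses an illustrative Laplace\((0,1)\) target, whose differential entropy is \(1+\log 2\); the grid sizes and the sequence of \(\epsilon\) values are likewise arbitrary choices. As \(\epsilon\) shrinks, \(H_{\mathcal{N}_\epsilon}(f)+h(\mathcal{N}_\epsilon)\) should approach \(1+\log 2\approx 1.693\).

```python
import numpy as np
from scipy.stats import laplace, norm
from scipy.integrate import trapezoid

# Numerical check of the eps -> 0 limit, with an illustrative Laplace(0, 1)
# target whose differential entropy is h(f) = 1 + log 2.
f = lambda y: laplace.pdf(y)          # target density 0.5 * exp(-|y|)
h_f = 1.0 + np.log(2.0)

def H(eps, nx=2001, ny=1501):
    # Double integral: int dx f(x) * D_KL(N(x, eps) || f).
    s = np.sqrt(eps)
    x = np.linspace(-25.0, 25.0, nx)
    kl = np.empty_like(x)
    for i, xi in enumerate(x):
        y = np.linspace(xi - 8 * s, xi + 8 * s, ny)
        eta = norm.pdf(y, loc=xi, scale=s)
        kl[i] = trapezoid(eta * np.log(eta / f(y)), y)
    return trapezoid(f(x) * kl, x)

for eps in (1.0, 0.1, 0.01, 0.001):
    h_N = 0.5 * np.log(2 * np.pi * np.e * eps)   # differential entropy of N_eps
    print(eps, H(eps) + h_N, h_f)                # middle column approaches 1 + log 2
```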