An Introduction to Relative Entropy Coding and Its Applications


Gergely Flamich

31/07/2024

gergely-flamich.github.io


why care?

transform coding

transform_coding.png

realistic lossy compression

Right-hand image from Careil et al. [1]

differentially private federated learning

relative entropy coding

  1. \(X, Y \sim P_{X, Y}\)
  2. Given \(X \sim P_X\), encoder sends code \(C\) so that decoder can compute \(Y \sim P_{Y \mid X}\)
  3. Li and El Gamal [2]:

\[ {\color{red} I[X; Y]} \leq \mathbb{E}[|C|] \leq {\color{red} I[X; Y]} + {\color{blue} \log (I[X; Y] + 1) + 4} \]

  • short codelength: only a logarithmic overhead over the fundamental limit \(I[X; Y]\)
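For concreteness, a numeric example (mine, not from the slides): if \(I[X; Y] = 10\) bits, the guarantee reads

\[ 10 \leq \mathbb{E}[|C|] \leq 10 + \log(10 + 1) + 4 \approx 17.46 \text{ bits}, \]

so the price of one-shot simulation is a few bits on top of the mutual information.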

lossy source coding

Usually:

  • \(f^{-1} \circ \lfloor \cdot \rceil \circ f\)
  • \(f\) controls distortion

Now:

  • \(f^{-1} \circ (f + \epsilon)\), i.e., add noise \(\epsilon\) in transform space (see the sketch below)
  • \(f\) controls \(P_{Y \mid X}\)
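A minimal sketch of the two pipelines, assuming a toy linear transform as a stand-in for a learned analysis transform (all names illustrative):

import numpy as np

rng = np.random.default_rng(0)

# Toy invertible transform pair; in practice f is learned.
def f(x):     return 4.0 * x
def f_inv(z): return z / 4.0

x = rng.normal(size=5)

# Usually: deterministic quantization in transform space.
y_quantized = f_inv(np.round(f(x)))

# Now: stochastic perturbation in transform space; f and the
# noise together shape the conditional distribution P(Y | X).
epsilon = rng.uniform(-0.5, 0.5, size=x.shape)
y_stochastic = f_inv(f(x) + epsilon)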

common randomness

solution 1: dither [3]

Common randomness: \(U \sim \mathrm{Unif}[-1/2, 1/2]\)

\[ \lfloor x + U \rceil - U \sim \mathrm{Unif}[x - 1/2, x + 1/2] \]

  • Encoder: receive \(X \sim P_X\), send \(K = \lfloor X + U\rceil\)
  • Decoder: compute \(Y = K - U\)
  • ✅ Fast
  • ❌ Can only simulate uniform distributions.
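A quick numerical check of the identity above (my sketch, not the talk's code): the shared seed stands in for the common randomness \(U\).

import numpy as np

rng = np.random.default_rng(0)
x = 0.3
U = rng.uniform(-0.5, 0.5, size=100_000)  # common randomness, shared via the seed

K = np.round(x + U)  # encoder: integer message (entropy-coded in practice)
Y = K - U            # decoder: reconstruction

# Empirically, Y ~ Unif[x - 1/2, x + 1/2]:
print(Y.min(), Y.max(), Y.mean())  # ~ -0.2, ~ 0.8, ~ 0.3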

solution 2: selection sampling

  • ✅ Can simulate any distribution
  • ❌ Slow: the expected number of candidates examined grows with how far \(Q\) is from \(P\) (see the sketch below)
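To make the idea concrete, here is a toy selection sampler (my illustration, not GPRS itself), using the running example \(P = \mathcal{N}(0, 1)\), \(Q = \mathcal{N}(1, 1/16)\). The shared seed is the common randomness, the encoder transmits the index of the first accepted candidate, and the decoder recovers the same sample by replaying the stream. The constant M must upper-bound \(dQ/dP\), and the expected number of trials is M, which is exactly what makes naive selection sampling slow.

import numpy as np

def normal_pdf(y, mu, sigma):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def encode(seed, M=8.0):
    # Target Q = N(1, 1/16), proposal P = N(0, 1); M must bound dQ/dP (sup ~ 6.8 here).
    rng = np.random.default_rng(seed)
    for i in range(100_000):
        y = rng.standard_normal()  # candidate Y_i ~ P, identical on both sides
        u = rng.uniform()
        if u * M * normal_pdf(y, 0.0, 1.0) < normal_pdf(y, 1.0, 0.25):
            return i  # transmit the index (entropy-coded in practice)
    raise RuntimeError("increase M")

def decode(i, seed):
    rng = np.random.default_rng(seed)
    for _ in range(i + 1):
        y = rng.standard_normal()
        rng.uniform()  # keep the random streams in sync
    return y  # the very sample the encoder accepted, distributed as Q

i = encode(seed=42)
print(i, decode(i, seed=42))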

Greedy Poisson Rejection Sampling

  • Sampling as search
  • Exploit structure to accelerate search
  • target \(Q \gets P_{Y \mid X}\)
  • proposal \(P \gets P_{Y}\)

GPRS with \(P = \mathcal{N}(0, 1), Q = \mathcal{N}(1, 1/16)\)

Animation frames: gprs_0.png, gprs_1.png, gprs_2.png, gprs_3.png, gprs_4.png, gprs_5.png, gprs_accept.png

Fast GPRS with \(P = \mathcal{N}(0, 1), Q = \mathcal{N}(1, 1/16)\)

Animation frames: fast_gprs_0.png, fast_gprs_1.png, fast_gprs_2.png, fast_gprs_3.png, fast_gprs_4.png, fast_gprs_5.png

Analysis of fast GPRS

Now, encode the search path \(\pi\):

\(\mathbb{H}[\pi] \leq I[X; Y] + \log(I[X; Y] + 1) + \mathcal{O}(1)\)

\(\mathbb{E}[\lvert\pi\rvert] = I[X; Y] + \mathcal{O}(1)\)

This is optimal [6].

Computationally Lightweight ML-based data compression

Data Compression with INRs

Image from Dupont et al. [4]

  • ✅ computationally lightweight
  • ❌ short codelength

COMBINER

COMpression with Bayesian Implicit NEural Representations

Image from Blundell et al. [7]

💡Gradient descent is the transform!
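The codelength story carries over: if \(q\) is the variational posterior over the INR's weights and \(p\) is a prior shared by encoder and decoder, relative entropy coding transmits a weight sample at a cost of roughly \(D_{\mathrm{KL}}(q \| p)\) (plus the logarithmic overhead from before). For fully factorized Gaussians, this codelength proxy is available in closed form (in nats; divide by \(\ln 2\) for bits):

\[ D_{\mathrm{KL}}(q \| p) = \sum_i \left[ \log \frac{\sigma_{p,i}}{\sigma_{q,i}} + \frac{\sigma_{q,i}^2 + (\mu_{q,i} - \mu_{p,i})^2}{2 \sigma_{p,i}^2} - \frac{1}{2} \right] \]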


Theory: What next?

  • Might not need a perfect solution: think of error-correcting codes (e.g., LDPC)
  • Exploit different types of structure
  • Duality between source and channel coding

Applications: What next?

  • Realism constraints for INR-based compression
  • Scale to high-resolution/high-volume data

Take-home messages

  • Relative entropy coding is a stochastic alternative to quantization for lossy source coding
  • Currently two flavours: dither-based and selection-sampling-based
  • Greedy Poisson rejection sampling is an optimal selection sampler
  • Implicit neural representations are an exciting, compute-efficient approach to data compression with huge potential

References I

  • [1] M. Careil, M. J. Muckley, J. Verbeek, and S. Lathuilière. "Towards image compression with perfect realism at ultra-low bitrates." In ICLR 2024.
  • [2] C. T. Li and A. El Gamal. "Strong functional representation lemma and applications to coding theorems." IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 6967–6978, 2018.
  • [3] E. Agustsson and L. Theis. "Universally quantized neural compression." In NeurIPS 2020.

References II

  • [4] E. Dupont, A. Golinski, M. Alizadeh, Y. W. Teh, and A. Doucet. "COIN: Compression with implicit neural representations." arXiv preprint arXiv:2103.03123, 2021.
  • [5] G. Flamich and L. Wells. "Some notes on the sample complexity of approximate channel simulation." To appear at the Learning to Compress workshop @ ISIT 2024.
  • [6] D. Goc and G. Flamich. "On channel simulation with causal rejection samplers." To appear at ISIT 2024.

References III

  • [7] C. Blundell, J. Cornebise, K. Kavukcuoglu, and D. Wierstra. "Weight uncertainty in neural networks." In ICML 2015.