Ciphertext Compression | TFHE Digest #1

Furkan AkalFurkan Akal
5 min read

Welcome to TFHE Digest, a series of concise and hopefully insightful blog posts exploring the TFHE scheme and its implementation through the TFHE.rs library. This series is designed to dive into various aspects of TFHE—ranging from foundational concepts to advanced techniques and practical use cases—but without following a specific order. The topics will be intentionally random, inspired by conversations, questions, or interesting challenges that come up along the way. Whether you're a cryptography enthusiast or a developer exploring secure computation, there's something here for everyone.


Let’s recall LWE!

What is it?

In cryptography, Learning-With-Errors (LWE) is a mathematical problem that is widely used to create secure encryption algorithms.

Let \(\mathbb{Z}_q\) denote the ring of integers module \(q\) and \(\mathbb{Z}_q^n\) denote the set of \(n\)-vectors over \(\mathbb{Z}_q\). There exists a certain unknown linear function \(f: \mathbb{Z}_q^n \rightarrow \mathbb{Z_q}\), and the input to the LWE problem is a sample of pairs \((x, y)\), where \(x \in \mathbb{Z}_q^n\) and \(y \in \mathbb{Z}_q\), so that with high probability \(y = f(x)\).

The Learning-With-Errors (LWE) problem presents a computational challenge where solving linear equations becomes difficult due to added noise. Essentially, LWE relies on the idea that when linear equations over a finite field are slightly altered with random errors, finding the solution becomes computationally hard. This approach uses the difficulty of telling apart random noise from structured data to create cryptographic systems that are secure against quantum attacks. The noise is crucial because it hides the connection between the input vectors and their outputs, making it hard to deduce the secret linear function.

LWE Encryption

In the context of LWE-based encryption, the security model revolves around the difficulty of solving the LWE problem. The encryption scheme is constructed around a public and a private key, where the public key is derived from the private key with the addition of noise. This ensures that, while encryption can be performed by anyone possessing the public key, decryption is feasible only with the knowledge of the private key. The essence of this encryption lies in its ability to mask the message with a layer of computational complexity that is impractical to breach without the private key, thus ensuring confidentiality.

To encrypt an encoded message \(m\), we calculate:

$$b = \mathbf{a} \cdot \mathbf{s} + m + \epsilon$$

where \(\mathbf{a} \leftarrow_R \mathbb{Z}^n\) is a randomness, \(\mathbf{s} \in \mathbb{Z}^n\) is our secret key, and \(\epsilon \in \chi\) is a small noise.


LWE on the Torus (TLWE)

What is the torus in our context?

Real torus \(\mathbb{T} = \mathbb{R}/\mathbb{Z}\) is basically the set \([0, 1)\) of real numbers modulo 1. Notice that any two elements of \(\mathbb{T}\) can be added modulo 1 which forms an abelian group. But it is important to observe that it is not a ring as the internal product of torus elements is not defined.

In TFHE, we’re working on \(\mathbb{T}_q\), which can be represented as the set of fractions:

$$\{ \frac{i}{q} \ \text{mod} \ 1 \ | \ i \in \mathbb{Z} \} = \{ \frac{i}{q} \ | \ i \in \mathbb{Z}/q\mathbb{Z} \} = \{ 0, \frac{1}{q}, \frac{2}{q}, ... , \frac{q-1}{q} \}$$

This submodule \(\mathbb{T}_q\) is called a discretized torus.


TLWE Ciphertexts

Encryption

The security assumption for TLWE is that a torus element \((\sum_i^n s_i \cdot a_i + \epsilon ) \in \mathbb{T}_q\) cannot be distinguished from a random torus element \(r \in \mathbb{T}\), even if the torus vector \((a_1, ..., a_n) \in \mathbb{T}_q^n\) is known.

Basically, to encrypt a message \(m\) with TLWE, we calculate

$$b = \sum_i^n s_i \cdot a_i + m + \epsilon$$

where \((a_1, ..., a_n) \in \mathbb{T}_q^n\) is a random vector and \(\epsilon \in \chi\) is a small noise.

Similar to the case in the regular LWE, a TLWE ciphertext is of the form:

$$(a_1, a_2, ..., a_n, b) \in \mathbb{T}_q^{n+1}$$

Decryption

To decrypt a TLWE ciphertext \(c = (a_1, ..., a_n, b)\), we calculate

$$m^* = b - \sum_i^n s_i \cdot a_i$$

and return

$$m = \frac{\lfloor q \ m^* \rceil \text{mod} \ q}{q}$$

Plaintext & Ciphertext Spaces

In most cases, the plaintext and ciphertext spaces are distinct, where the plaintext space is an additive subgroup of \(\mathbb{T}_q\).

We, conventionally denote the plaintext space modulus by \(p\), the ciphertext space modulus by \(q\).

Example

Let’s assume \(p = 4\) and \(q = 64\). So, the plaintext space is \(P = \{0, \frac{1}{4}, \frac{2}{4}, \frac{3}{4}\}\), where the ciphertext space is \(Q = \mathbb{T}_{64} = \{ 0, \frac{1}{64}, \frac{2}{64}, ..., \frac{62}{64}, \frac{63}{64} \}\).

(Joye, 2022)

Notice that if \(\epsilon \in \{\frac{-7}{64}, ..., \frac{7}{64}\}\), then any noisy message value \(m^* = m + \epsilon\) corresponds to a plaintext message \(m \in P = \{ 0, \frac{16}{64}, \frac{32}{64}, \frac{48}{64} \}\).

Hence, while decrypting, we determine which zone in the plaintext space the noisy message falls into. For example, the closest plaintext message to \(m^* \in \{ \frac{9}{64}, ..., \frac{23}{64} \}\) is \(m = \frac{16}{64} = \frac{1}{4}\).


Ciphertext Compression

Compressing Ciphertexts in TFHE.rs

TFHE.rs lets developers compress ciphertexts at encryption time by calling the CompressedFheUint16::try_encrypt() method:

use tfhe::prelude::*;
use tfhe::{ConfigBuilder, generate_keys, CompressedFheUint16};

fn main() {
    let config = ConfigBuilder::default().build();
    let (client_key, _) = generate_keys(config);

    let clear = 12_837u16;
    let compressed = CompressedFheUint16::try_encrypt(clear, &client_key).unwrap();
    println!(
        "compressed size  : {}",
        bincode::serialize(&compressed).unwrap().len()
    );

    let decompressed = compressed.decompress();

    println!(
        "decompressed size: {}",
        bincode::serialize(&decompressed).unwrap().len()
    );

    let clear_decompressed: u16 = decompressed.decrypt(&client_key);
    assert_eq!(clear_decompressed, clear);
}

How does it work?

Recall that TLWE ciphertexts are torus vectors with \(n+1\) components.

If we suppose that torus elements are represented with 64 bits where \(n = 630\), a TLWE ciphertext requires

$$631 \times 64 = 40384 \ \text{bits}$$

for its representation.

Instead of representing a ciphertext as \(c = (a_1, ..., a_n, b)\), a more compact way to do it is to define \(c\) as \(c = (\theta, b)\) where \(\theta\) is a random bit string. Its value is used as a seed to a secure pseudo-random number generator to derive the random vector \((a_1, ..., a_n)\).

With this setup, the same ciphertext only needs \(128 + 64 = 192\) bits for its representation.

As we use a pseudo-random number generator, we are able to recover our random vector \(a\) deterministically. So, it makes more sense to store the PRNG seed instead of storing the entire randomness. Hence, the ciphertext compression decreases the storage overhead drastically.


References

  1. Joye, M. (2022). SoK: Fully homomorphic encryption over the [discretized] torus. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022(4), 661–692. https://doi.org/10.46586/tches.v2022.i4.661-692

  2. Zama AI. (n.d.). Compress. TFHE.rs Fundamentals. Retrieved December 27, 2024, from https://docs.zama.ai/tfhe-rs/fundamentals/compress

0
Subscribe to my newsletter

Read articles from Furkan Akal directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Furkan Akal
Furkan Akal