How much GPU memory do I need for an LLM?

When people first start working with large language models, one of the most common questions is: “How much GPU memory do I need?” This isn’t always obvious, since running a model involves more than just storing its weights. That’s where the Simple LLM VRAM Calculator comes in. Instead of leaving users to guess, it turns the process into a straightforward learning experience with just two inputs: the size of the model and the precision format.

The first concept to understand is model parameters. A model’s size—often described as “7B,” “13B,” or “70B”—refers to the number of parameters it contains. Each parameter must be stored in GPU memory, and the storage size depends on the numerical precision used. For example, FP32 (32-bit floating point) requires 4 bytes per parameter, while FP16 (16-bit floating point) uses only 2 bytes. That means switching from FP32 to FP16 instantly cuts memory use in half. The calculator applies this relationship directly: Model Parameters × Bytes per Parameter = Base Memory (the “From” value).
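A minimal sketch of this first step in Python may help. The function name, the byte table, and the convention that 1 GB = 10⁹ bytes are my own assumptions for illustration, not details of the calculator itself:

```python
# Bytes per parameter for the precisions discussed in the article.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def base_memory_gb(params_billions: float, precision: str) -> float:
    """Model Parameters x Bytes per Parameter, reported in GB (10**9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9
```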

However, loading the model’s parameters alone isn’t enough to actually run inference. Extra memory is consumed by activations, CUDA kernels, and workspace buffers, as well as by small inefficiencies such as fragmentation. To make the estimates practical, the calculator multiplies the base parameter memory by a safety factor of 1.2, giving the “To” value. This range—from minimum to upper bound—helps users see both the theoretical storage need and a more realistic “all-in” requirement.
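Building on the sketch above, the overhead step is a single multiplication. Again, the names here (`OVERHEAD_FACTOR`, `vram_range_gb`) are my own; only the 1.2 factor comes from the calculator:

```python
OVERHEAD_FACTOR = 1.2  # safety factor for activations, buffers, fragmentation

def vram_range_gb(params_billions: float, precision: str) -> tuple[float, float]:
    """Return the calculator's ("From", "To") pair: base memory and base x 1.2."""
    base = base_memory_gb(params_billions, precision)
    return base, base * OVERHEAD_FACTOR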

To make this concrete, consider a 70-billion parameter model in FP16. Each parameter needs 2 bytes, so the base requirement is about 140 GB. Once overhead is factored in, the calculator outputs ~168 GB. In contrast, running the same model in INT8 (1 byte per parameter) would need only ~70 GB minimum, or ~84 GB with overhead. Through such comparisons, users quickly see how much precision format affects feasibility on their GPUs.
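Running the sketch above reproduces these numbers:

```python
print(vram_range_gb(70, "fp16"))  # (140.0, 168.0) -> ~140 GB base, ~168 GB with overhead
print(vram_range_gb(70, "int8"))  # (70.0, 84.0)   -> ~70 GB base, ~84 GB with overhead
```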
