LLM VRAM Calculator

Estimate GPU memory requirements for LLM training and inference

GPU Model

Single GPU memory estimation

HuggingFace Model ID

Advanced Model Config

Mode

Training / Inference

Precision

BF16 is recommended for training on modern GPUs (Ampere and newer). Use FP16 on older GPUs (V100, T4).

Batch Size

Range: 1 to 1024

Sequence Length

Range: 128 to 131072

Gradient Checkpointing

Optimizer

Mixed Precision (FP32 master weights)

Keeps an FP32 copy of the weights for stability. Recommended for FP16, optional for BF16.
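
To give a rough sense of what the FP32 master-weights option costs, the sketch below counts weight memory only (no gradients, optimizer state, or activations). The function name `weight_memory_gib` and the 7-billion-parameter example are hypothetical, and the calculator's own formula may differ.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gib(num_params: int, precision: str, fp32_master: bool) -> float:
    """Estimate weight memory in GiB.

    Assumption: enabling FP32 master weights adds one extra 4-byte copy
    per parameter on top of the low-precision working copy.
    """
    bytes_per_param = BYTES_PER_PARAM[precision]
    if fp32_master and precision != "fp32":
        bytes_per_param += 4  # extra FP32 master copy
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical 7B-parameter model
    print(f"BF16 only:          {weight_memory_gib(n, 'bf16', False):.1f} GiB")
    print(f"BF16 + FP32 master: {weight_memory_gib(n, 'bf16', True):.1f} GiB")
```

Under this assumption the master copy triples the per-parameter weight cost (2 bytes to 6 bytes), which is why the option is worth disabling for BF16 when memory is tight.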

LoRA Enabled

LoRA Rank

4 256

DDP (Multi-GPU) - adds gradient buffer for sync

torch.compile - adds ~10% for compiled graphs
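
The two overhead options above could be applied to a base estimate roughly as follows. This is a sketch under stated assumptions, not the calculator's actual logic: it assumes the DDP sync buffer is the same size as the gradients and that the ~10% for `torch.compile` is applied to the running total. The function name `apply_overheads` and its parameters are hypothetical.

```python
def apply_overheads(base_gib: float, grad_gib: float, ddp: bool, compiled: bool) -> float:
    """Add DDP and torch.compile overheads to a base memory estimate (GiB)."""
    total = base_gib
    if ddp:
        # Assumption: the gradient sync buffer matches the gradient size.
        total += grad_gib
    if compiled:
        # Assumption: ~10% overhead applied to the running total.
        total *= 1.10
    return total


if __name__ == "__main__":
    # Hypothetical inputs: 40 GiB base estimate, 13 GiB of gradients.
    print(f"{apply_overheads(40.0, 13.0, ddp=True, compiled=True):.1f} GiB")
```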