cudaErrorOutOfMemory (2)

cudaErrorOutOfMemory occurs when the CUDA runtime cannot allocate enough device memory to satisfy a request, typically because GPU memory is exhausted or fragmented. This guide covers memory profiling, optimization strategies, and prevention techniques.
CUDA error: out of memory
cudaErrorOutOfMemory: out of memory
Use nvidia-smi and torch.cuda.memory_stats() to identify memory consumers.
# In a shell: refresh the nvidia-smi report every second
nvidia-smi -l 1
# Or in Python:
import torch
print(torch.cuda.memory_summary())

Lower the batch size and use gradient accumulation.
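# Gradient accumulation
accumulation_steps = 4  # example value; pick what fits your memory budget
for i, batch in enumerate(loader):
    # Assumes model(batch) returns a scalar loss; scale it so gradients average
    loss = model(batch) / accumulation_steps
    loss.backward()
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

Use FP16 mixed precision to reduce memory usage, by up to about half for activations.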
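The saving comes from element size: FP32 stores 4 bytes per value and FP16 stores 2. A rough sketch of that arithmetic for a single activation tensor follows; the shape is a hypothetical example, not a figure from this guide.

```python
# Bytes per element for the two floating-point formats discussed here.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2}


def tensor_bytes(shape, dtype):
    """Return the storage size in bytes of a dense tensor with this shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_ELEMENT[dtype]


# Hypothetical activation: batch of 64, 1024 tokens, hidden size 4096.
shape = (64, 1024, 4096)
fp32_mib = tensor_bytes(shape, "fp32") / 2**20
fp16_mib = tensor_bytes(shape, "fp16") / 2**20
print(f"FP32: {fp32_mib:.0f} MiB, FP16: {fp16_mib:.0f} MiB")
# prints "FP32: 1024 MiB, FP16: 512 MiB"
```

Under autocast the model weights stay in FP32, so the reduction applies mainly to activations and total usage usually drops by less than half.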
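from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
with autocast():
    output = model(input)
    loss = criterion(output, target)  # criterion and target are assumed to exist
scaler.scale(loss).backward()  # scale the loss to avoid FP16 gradient underflow
scaler.step(optimizer)         # unscales gradients, then steps the optimizer
scaler.update()

Free unused memory.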
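import gc
gc.collect()              # drop unreachable Python objects that still hold tensors
torch.cuda.empty_cache()  # then release cached, unused blocks back to the driver

Loading the entire model and a large batch at once causes OOM.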
model = LargeModel().cuda()
output = model(huge_batch)  # OOM!

Gradient checkpointing and mini-batches reduce memory.
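The optimized snippet for this step calls a `split` helper that the guide does not define. Here is a minimal sketch of one possible version, assuming the batch supports `len()` and slicing (true for lists and PyTorch tensors). The name and behavior are illustrative and not taken from any library.

```python
def split(batch, num_chunks):
    """Yield at most num_chunks contiguous slices of batch.

    All slices have the same size except possibly the last, which may be shorter.
    """
    if num_chunks <= 0:
        raise ValueError("num_chunks must be positive")
    if len(batch) == 0:
        return
    chunk_size = -(-len(batch) // num_chunks)  # ceiling division
    for start in range(0, len(batch), chunk_size):
        yield batch[start:start + chunk_size]


# Ten samples split into four mini-batches.
chunks = list(split(list(range(10)), 4))
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

Because each mini-batch is processed separately, peak activation memory scales with the slice size rather than with the full batch.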
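model = LargeModel().cuda()
# Available on Hugging Face Transformers models; plain nn.Module has no such method
model.gradient_checkpointing_enable()
# split is assumed to be a user-defined helper that yields 4 mini-batches
for mini_batch in split(batch, 4):
    output = model(mini_batch)

To check GPU memory usage, use nvidia-smi or torch.cuda.memory_allocated().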
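For scripted checks, nvidia-smi can emit machine-readable CSV through its `--query-gpu` and `--format` options. The sketch below shows one way to read it. The function names are hypothetical, and the sample text is made up so the parsing step runs on a machine without a GPU.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) from nvidia-smi CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return [(used_mib, total_mib), ...], one per GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


# Made-up sample in the same layout nvidia-smi prints with the flags above.
sample = "10240, 24576\n512, 24576\n"
for index, (used, total) in enumerate(parse_gpu_memory(sample)):
    print(f"GPU {index}: {used}/{total} MiB ({100 * used / total:.1f}% used)")
```

On a machine with an NVIDIA driver installed, `query_gpu_memory()` would return the live values. nvidia-smi reports memory used by all processes on the device, while `torch.cuda.memory_allocated()` counts only tensors allocated by the current PyTorch process.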
Use gradient accumulation to maintain the effective batch size when you lower the per-step batch size.
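To illustrate why accumulation preserves the effective batch size, the sketch below compares the gradient from one batch of 8 samples with the gradient accumulated over 4 micro-batches of 2. It uses a one-parameter least-squares model in plain Python with made-up data, so it shows the arithmetic only and is not a training recipe.

```python
import math


def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Made-up data for the model y = w * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]
w = 0.5
accumulation_steps = 4
micro_batch_size = len(xs) // accumulation_steps

# Gradient from a single large batch of 8 samples.
full_grad = mse_grad(w, xs, ys)

# The same gradient built from 4 micro-batches of 2 samples each.
accumulated_grad = 0.0
for step in range(accumulation_steps):
    lo = step * micro_batch_size
    hi = lo + micro_batch_size
    # Dividing by accumulation_steps mirrors `loss / accumulation_steps`.
    accumulated_grad += mse_grad(w, xs[lo:hi], ys[lo:hi]) / accumulation_steps

print(math.isclose(full_grad, accumulated_grad, rel_tol=1e-9))
```

The match relies on equal-sized micro-batches and a mean-reduced loss. Layers that compute statistics per batch, such as batch normalization, still see the smaller micro-batch, so results there can differ from true large-batch training.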