cudaErrorInitializationError (3)

cudaErrorInitializationError (error code 3) occurs when the CUDA runtime fails to initialize. It surfaces at the first CUDA call in your program and indicates a fundamental problem with the CUDA environment. This error is a catch-all for initialization failures that don't fit a more specific error code; causes include driver issues, environment problems, and system misconfiguration. This guide covers systematic troubleshooting for CUDA initialization failures.
CUDA error: initialization error
cudaErrorInitializationError: initialization error
CUDA_ERROR_NOT_INITIALIZED
cuInit(0) returned error
CUDA driver failed to initialize
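These messages can be reproduced outside any framework by calling the driver API's cuInit directly. A minimal sketch via ctypes (the Linux library name `libcuda.so.1` is assumed; it differs on other OSes):

```python
# Sketch: probe driver initialization directly, outside any framework.
# cuInit(0) is the CUDA driver API entry point; a nonzero return code
# means initialization fails before any runtime call could succeed.
import ctypes

def probe_cuda_init() -> int:
    """Return cuInit's status code, or -1 if libcuda is not present."""
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")  # assumes Linux library name
    except OSError:
        return -1  # driver library missing entirely
    return libcuda.cuInit(0)  # 0 == CUDA_SUCCESS

print("cuInit status:", probe_cuda_init())
```

A return of -1 means the driver library isn't even installed; any nonzero cuInit status points at the driver/environment rather than your application code.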
Check driver loads and GPU is recognized.
# Check driver module is loaded
lsmod | grep nvidia
# Check GPU is visible to driver
nvidia-smi
# If nvidia-smi fails:
# - Driver not installed: reinstall driver
# - Driver not loaded: sudo modprobe nvidia
# - After kernel update: rebuild/reinstall driver
# Check device nodes exist
ls -la /dev/nvidia*

Ensure user can access GPU devices.
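The device-node checks can also be scripted; a minimal sketch using only the standard library, mirroring the `ls` checks in this section:

```python
# Sketch: verify /dev/nvidia* device nodes exist and are readable/writable
# by the current user.
import glob
import os

def check_device_nodes(pattern: str = "/dev/nvidia*"):
    """Return (node, accessible) pairs for NVIDIA device nodes."""
    return [(n, os.access(n, os.R_OK | os.W_OK))
            for n in sorted(glob.glob(pattern))]

nodes = check_device_nodes()
if not nodes:
    print("No /dev/nvidia* nodes - driver not loaded or not installed")
for node, ok in nodes:
    print(f"{node}: {'accessible' if ok else 'PERMISSION DENIED'}")
```

An empty result means the driver module isn't loaded at all; a "PERMISSION DENIED" entry points at the group/permission fixes below.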
# Check device permissions
ls -la /dev/nvidia*
# Add user to video group
sudo usermod -aG video $USER
sudo usermod -aG render $USER # Some systems
# Logout and login, or:
newgrp video
# Temporary fix (use udev rules for permanent)
sudo chmod 666 /dev/nvidia*

GPU may need reset after crash or hang.
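Before attempting a reset, a scripted health probe can tell whether the driver still responds at all. A sketch that shells out to nvidia-smi (assumed to be on PATH):

```python
# Sketch: quick GPU health probe before trying a reset.
import subprocess

def gpu_healthy() -> bool:
    """Return True if nvidia-smi runs and reports at least one GPU."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # lists GPUs, one per line
            capture_output=True, text=True, timeout=10,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # Tool missing, or the GPU is hung badly enough to stall the query
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print("healthy" if gpu_healthy() else "needs reset / driver reload")
```

A timeout here is itself diagnostic: a hung GPU often stalls nvidia-smi, in which case proceed to the reset and driver-reload steps below.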
# Check for GPU errors
dmesg | grep -i nvidia | tail -20
# Reset GPU (requires root)
sudo nvidia-smi --gpu-reset -i 0
# If reset fails, may need to unload/reload driver
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia
# Worst case: reboot
sudo reboot

Remove all traces and reinstall.
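After the reinstall steps below complete, it is worth confirming that both the driver and the toolkit answer with versions. A sketch (assumes nvidia-smi and nvcc end up on PATH):

```python
# Sketch: confirm driver and toolkit respond after a clean reinstall.
import subprocess

def tool_version(cmd):
    """Return a tool's stdout, or None if it is missing or fails."""
    try:
        r = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None
    return r.stdout.strip() if r.returncode == 0 else None

driver = tool_version(["nvidia-smi", "--query-gpu=driver_version",
                       "--format=csv,noheader"])
toolkit = tool_version(["nvcc", "--version"])
print("driver:", driver or "MISSING")
print("toolkit:", (toolkit.splitlines()[-1] if toolkit else "MISSING"))
```

If either reports MISSING, the reinstall was incomplete; if both respond, initialization failures are more likely environmental.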
# Remove existing nvidia packages
sudo apt remove --purge 'nvidia-*' # Quote the glob so the shell doesn't expand it
sudo apt autoremove
# Remove any manually installed driver
sudo /usr/bin/nvidia-uninstall # If installed from .run file
# Clean up
sudo rm -rf /usr/local/cuda*
# Install fresh
sudo apt update
sudo apt install nvidia-driver-535 nvidia-cuda-toolkit
# Or use official installer
wget https://...nvidia...run
sudo sh NVIDIA-Linux-x86_64-535.xx.run
sudo reboot

Check and fix environment configuration.
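The same inspection can be done from inside Python, which shows exactly what the failing process sees (a shell and a notebook or service can have very different environments). A minimal sketch:

```python
# Sketch: snapshot the environment variables that commonly break CUDA init,
# as seen by the current process.
import os

def cuda_env_report() -> dict:
    keys = ["CUDA_VISIBLE_DEVICES", "CUDA_HOME", "LD_LIBRARY_PATH", "PATH"]
    return {k: os.environ.get(k) for k in keys}

for key, value in cuda_env_report().items():
    print(f"{key} = {value!r}")
# Note: CUDA_VISIBLE_DEVICES set to '' or '-1' hides every GPU and
# produces initialization errors.
```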
# Check environment variables
echo $LD_LIBRARY_PATH
echo $PATH
echo $CUDA_HOME
# Clear potentially problematic variables
unset CUDA_VISIBLE_DEVICES # Might be hiding GPUs
unset CUDA_HOME
# Set correct paths
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Test
python -c "import torch; print(torch.cuda.is_available())"

Driver not fully loaded after install.
# Run immediately after install without reboot
import torch
torch.cuda.is_available() # False or error

Verify driver works before using CUDA.
# After install/boot, verify environment
import subprocess
import sys
import torch

# Check driver first
result = subprocess.run(['nvidia-smi'], capture_output=True)
if result.returncode != 0:
    print("Driver not working - check installation")
    sys.exit(1)

# Now CUDA should work
if torch.cuda.is_available():
    print(f"CUDA works! Devices: {torch.cuda.device_count()}")
else:
    print("CUDA not available - check torch CUDA build")

Kernel updates can break the NVIDIA driver: its kernel modules must be rebuilt for each new kernel. Reinstall the driver after a kernel update, or use DKMS so the rebuild happens automatically.
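Whether the driver modules are registered with DKMS (and thus rebuilt automatically on kernel updates) can be checked with `dkms status`; a small sketch wrapping it:

```python
# Sketch: check whether the NVIDIA driver is DKMS-managed, meaning its
# kernel modules rebuild automatically on kernel updates.
import subprocess

def nvidia_dkms_registered() -> bool:
    """True if an nvidia module shows up in `dkms status` output."""
    try:
        r = subprocess.run(["dkms", "status"], capture_output=True,
                           text=True, timeout=10)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False  # DKMS not installed
    return r.returncode == 0 and "nvidia" in r.stdout.lower()

print("DKMS-managed" if nvidia_dkms_registered()
      else "not DKMS-managed - reinstall driver after each kernel update")
```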
Check dmesg for GPU errors, verify nvidia-smi works. The GPU may have entered a bad state. Try nvidia-smi --gpu-reset or reboot.
Check environment variables (CUDA_VISIBLE_DEVICES, LD_LIBRARY_PATH). Different programs may have different environments. Virtualization or containers can also affect this.
An init failure can appear as "no CUDA-capable device" (cudaErrorNoDevice).
Driver/runtime version mismatches cause init failures.
Missing or mismatched CUDA libraries cause init failures.