GPU Emulation

Test CUDA code on different GPUs without owning the hardware

Quick Start

[Screenshot: GPU Emulation Panel]

Getting Started with GPU Emulation

  1. Open GPU Emulation: Click the GPU Emulation icon (the circuit board icon) in the sidebar
  2. Choose a GPU: Check the box next to the GPU you want to test in the list
  3. Run Your Code: Open your .cu file and click the Build button. Your code runs on the selected GPU (a minimal kernel to start from is sketched below)
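
If you don't have a kernel handy, a minimal vector-add .cu file like the sketch below is enough to exercise the emulator; the file name, data size, and launch configuration are arbitrary choices, not requirements.

```cuda
// vector_add.cu -- a minimal kernel to build and run under emulation.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    // Unified memory keeps the example short; explicit cudaMalloc/cudaMemcpy works too.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    vectorAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %.1f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Inside the emulator the Build button handles compilation; outside it, the same file builds with plain nvcc.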

Available GPUs

The emulator includes all major NVIDIA GPUs across different generations:

Latest Generation

  • H100, H200
  • RTX 4090, RTX 4080
  • L40S

Data Center

  • A100, A10
  • V100
  • T4

Gaming & Workstation

  • RTX 3090, RTX 3080
  • RTX 2080 Ti
  • GTX 1080 Ti

Legacy Systems

  • Tesla K80
  • GTX 1060
  • Earlier architectures
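
To confirm which of these GPUs your code actually sees, you can query the device from host code. This is a small sketch using the standard CUDA runtime API; it assumes the emulated GPU is exposed as device 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // device 0: assumed to be the emulated GPU
    printf("Device: %s (compute capability %d.%d)\n", prop.name, prop.major, prop.minor);
    printf("SMs: %d, global memory: %.1f GiB\n",
           prop.multiProcessorCount,
           prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    return 0;
}
```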

How to Select GPUs

[Screenshot: GPU Selection List]

Selection Methods

  1. Single GPU: Check one box to test on that GPU
  2. Multiple GPUs: Check multiple boxes to compare performance across different architectures
  3. Search: Type a GPU name in the search box to find it quickly
  4. Clear: Click "Clear Selection" to deselect all GPUs

What You'll See

[Screenshot: Emulation Active status]

When emulation is active:

  • Status bar shows the selected GPU name
  • "Emulation Active" indicator appears
  • Performance results show estimated metrics
  • Architecture-specific warnings appear if your code uses unsupported features
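
These warnings usually come from architecture-specific code paths. A common pattern is to guard such paths with __CUDA_ARCH__, as in this sketch (the kernel and the operations in each branch are purely illustrative):

```cuda
__global__ void scaledCopy(float* out, const float* in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
#if __CUDA_ARCH__ >= 800
    // Path for Ampere (CC 8.0) and newer emulated GPUs such as the A100 or RTX 4090.
    out[i] = in[i] * 2.0f;
#else
    // Fallback path for older architectures such as the emulated GTX 1080 Ti.
    out[i] = in[i] + in[i];
#endif
}
```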

Best Uses

Development & Testing

  • Test code compatibility on different GPUs
  • Develop without expensive hardware
  • Compare performance across architectures
  • Learn CUDA on any computer

Architecture Exploration

  • Test Hopper features without H100 access
  • Evaluate tensor core performance (see the WMMA sketch after this list)
  • Understand compute capability differences
  • Plan for future GPU upgrades
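
As an example of the tensor core point above, the WMMA API multiplies 16x16 half-precision tiles on the tensor cores and requires compute capability 7.0 or higher, so select a Volta-or-newer GPU when emulating it. This is only a sketch of a single-tile multiply; the kernel and pointer names are illustrative.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// One warp computes a single 16x16x16 tile: D = A * B (accumulator starts at zero).
// Launch with at least one full warp, e.g. wmmaTile<<<1, 32>>>(dA, dB, dD);
__global__ void wmmaTile(const half* a, const half* b, float* d) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> aFrag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> bFrag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);
    wmma::load_matrix_sync(aFrag, a, 16);   // leading dimension 16
    wmma::load_matrix_sync(bFrag, b, 16);
    wmma::mma_sync(acc, aFrag, bFrag, acc);
    wmma::store_matrix_sync(d, acc, 16, wmma::mem_row_major);
}
```

Compile for at least sm_70; older emulated GPUs such as the GTX 1080 Ti lack tensor cores, which is exactly the kind of case the architecture warnings above are meant to flag.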

Educational

  • Learn CUDA without GPU hardware
  • Experiment with different configurations
  • Understand architectural differences
  • Practice optimization techniques

Performance Analysis

  • Estimate performance on target hardware
  • Compare workloads across GPU generations
  • Identify architecture-specific bottlenecks
  • Validate optimization strategies
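
Timing the same kernel while switching the emulated GPU is the simplest way to compare generations and check whether an optimization pays off. A minimal CUDA event timing sketch (myKernel, the data size, and the launch configuration are placeholders):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void myKernel(float* data, int n) {   // placeholder workload
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.5f + 2.0f;
}

int main() {
    const int n = 1 << 22;
    float* data;
    cudaMalloc((void**)&data, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    myKernel<<<(n + 255) / 256, 256>>>(data, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);      // elapsed time in milliseconds
    printf("Kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(data);
    return 0;
}
```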

Emulation Features

Accurate Hardware Modeling

  • Compute Capability: Emulates the correct CC version for each GPU
  • Memory Specs: Realistic memory bandwidth and capacity constraints
  • SM Count: Accurate streaming multiprocessor configuration
  • Register Limits: Per-thread and per-block register constraints
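
Register limits are one place where these differences bite: a kernel that fits comfortably on one GPU may spill registers or lose occupancy on another. __launch_bounds__ is the standard CUDA way to tell the compiler what to budget for; the numbers below are illustrative, not recommendations.

```cuda
// Ask the compiler to cap register usage so that at least two blocks of
// 256 threads can be resident per SM on the (emulated) target architecture.
__global__ void __launch_bounds__(256, 2)
boundedKernel(float* out, const float* in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}
```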

Performance Estimation

  • Execution Time: Estimated kernel runtime based on architecture
  • Occupancy: Theoretical occupancy limits for each kernel and launch configuration (see the sketch after this list)
  • Memory Throughput: Bandwidth and latency modeling
  • Bottleneck Analysis: Identifies compute-bound vs memory-bound kernels
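
The theoretical occupancy figures correspond to what the CUDA occupancy API computes; a host-side sketch (simpleKernel and the 256-thread block size are placeholders):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void simpleKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int blockSize = 256;
    int maxActiveBlocks = 0;
    // Theoretical number of resident blocks per SM for this kernel and block size.
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&maxActiveBlocks, simpleKernel,
                                                  blockSize, 0 /* dynamic smem */);

    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    float occupancy = (maxActiveBlocks * blockSize) /
                      float(prop.maxThreadsPerMultiProcessor);
    printf("Theoretical occupancy: %.0f%%\n", occupancy * 100.0f);
    return 0;
}
```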

Compatibility Checking

  • Feature Detection: Warns about unsupported CUDA features
  • Architecture Warnings: Alerts you to architecture-specific code paths
  • Compilation Validation: Ensures your code compiles for the target GPU
  • API Compatibility: Checks CUDA API version requirements
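
If you want to log the same version information from your own code, the CUDA runtime exposes both the runtime and driver versions; a minimal sketch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int runtimeVersion = 0, driverVersion = 0;
    cudaRuntimeGetVersion(&runtimeVersion);  // e.g. 12040 for CUDA 12.4
    cudaDriverGetVersion(&driverVersion);    // highest CUDA version the driver supports
    printf("Runtime: %d, Driver: %d\n", runtimeVersion, driverVersion);
    return 0;
}
```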

Tips and Limitations

Best Practices

  • Use emulation for compatibility testing and initial development
  • Compare performance trends across GPU generations
  • Validate architectural assumptions before hardware purchase
  • Test multiple GPUs to ensure broad compatibility
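
When sweeping several emulated GPUs, checking for launch errors after every kernel catches incompatibilities early; on real hardware, launching a kernel that was not compiled for the selected architecture typically reports an "invalid device function" error. A minimal check (anyKernel is a placeholder):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void anyKernel(int* out) { *out = 42; }   // placeholder kernel

int main() {
    int* out;
    cudaMalloc((void**)&out, sizeof(int));
    anyKernel<<<1, 1>>>(out);

    // cudaGetLastError reports launch failures such as cudaErrorInvalidDeviceFunction,
    // which indicates the binary was not built for the selected architecture.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("Launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaDeviceSynchronize();
    cudaFree(out);
    return 0;
}
```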

Important Limitations

  • Estimates Only: Emulation provides estimates. For production benchmarks, use real hardware
  • Performance Accuracy: Timing estimates are approximate and may differ from results on actual hardware
  • Real Hardware Preferred: Always validate critical optimizations on actual target GPUs
  • Architecture-Specific Features: Some advanced features may not be fully emulated

Next Steps: For real hardware testing, explore Remote GPU Execution to connect to cloud GPUs or your own remote servers.