Static Analysis

See CUDA optimization tips without running code

What It Does

CodeLens automatically analyzes your CUDA code and shows:

  • Execution time and GPU efficiency
  • Register and shared memory usage
  • Occupancy calculations
  • Performance bottlenecks

How to Use

  1. Write Your CUDA Code: Open a .cu file and write your kernel
  2. Save the File: Press Ctrl+S and CodeLens updates automatically
  3. See Performance Metrics: Look above your kernel for two lines of performance data
  4. Fix Any Issues: If you see a warning, CodeLens tells you what is limiting performance
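
For step 1, a small file to try might look like the sketch below. The kernel name addVectors matches the sample output shown under "What You See"; the launchAddVectors helper and the 256-thread block size are illustrative assumptions, not something CodeLens requires.

```cuda
#include <cuda_runtime.h>

// Element-wise vector add: out[i] = a[i] + b[i].
__global__ void addVectors(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard the last, partially filled block
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: launches addVectors on device-accessible pointers.
inline cudaError_t launchAddVectors(const float* dA, const float* dB,
                                    float* dOut, int n) {
    const int threadsPerBlock = 256;  // assumed block size
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addVectors<<<blocks, threadsPerBlock>>>(dA, dB, dOut, n);
    return cudaGetLastError();  // reports launch-configuration errors
}
```

Saving a file like this should be enough for the two CodeLens lines to appear above addVectors.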

What You See

CodeLens Display

Two lines appear above each kernel:

Line 1 - Runtime Performance:

✅ addVectors: 18.5ms • SM:95.3% • Occ:87.5% • Mem:245.8GB/s

Line 2 - Static Analysis:

⚠️ Static: Registers: 64 • Shared: 8KB • Max occupancy: 50% (reg-limited)

Performance Indicators

  • Green: Excellent performance
  • Yellow: Can be optimized
  • Red: Performance issue needs attention

Understanding the Metrics

Runtime Metrics

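  • Time: Kernel execution time in milliseconds (ms)
  • SM: Streaming multiprocessor (SM) utilization, as a percentage
  • Occ: Occupancy, the share of an SM's maximum resident warps that are active, as a percentage
  • Mem: Memory bandwidth (GB/s)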

Static Metrics

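  • Registers: Registers used per thread
  • Shared: Shared memory used per block
  • Max occupancy: Highest occupancy the kernel can reach given its register and shared memory usage
  • Limiting factor: The resource that caps occupancy, shown in parentheses (for example, reg-limited)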
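
To see how a line such as "Max occupancy: 50% (reg-limited)" can come about, here is a simplified host-side sketch of the arithmetic. It is not CodeLens's actual algorithm: the SmLimits values, the block size, and the estimateOccupancy helper are all assumptions for illustration, and the estimate ignores register allocation granularity and the per-SM block limit.

```cuda
// Assumed per-SM limits; real values depend on the GPU's compute capability.
struct SmLimits {
    int registersPerSm;    // 32-bit registers available per SM
    int maxThreadsPerSm;   // maximum resident threads per SM
    int sharedBytesPerSm;  // shared memory available per SM
};

struct OccupancyEstimate {
    double percent;        // theoretical occupancy, 0 to 100
    const char* limiter;   // which resource ran out first
};

// Simplified estimate of theoretical occupancy for one kernel configuration.
inline OccupancyEstimate estimateOccupancy(int regsPerThread,
                                           int sharedBytesPerBlock,
                                           int threadsPerBlock,
                                           const SmLimits& sm) {
    // Start from the number of resident blocks the thread limit alone allows.
    int blocks = sm.maxThreadsPerSm / threadsPerBlock;
    const char* limiter = "thread-limited";

    // Blocks that fit in the register file.
    int byRegs = sm.registersPerSm / (regsPerThread * threadsPerBlock);
    if (byRegs < blocks) { blocks = byRegs; limiter = "reg-limited"; }

    // Blocks that fit in shared memory (only if the kernel uses any).
    if (sharedBytesPerBlock > 0) {
        int bySmem = sm.sharedBytesPerSm / sharedBytesPerBlock;
        if (bySmem < blocks) { blocks = bySmem; limiter = "smem-limited"; }
    }

    double percent = 100.0 * blocks * threadsPerBlock / sm.maxThreadsPerSm;
    return {percent, limiter};
}
```

With assumed limits of 65,536 registers, 2,048 threads, and 64 KB of shared memory per SM, a kernel using 64 registers per thread and 8 KB of shared memory in 256-thread blocks fits 4 blocks by registers but 8 by threads or shared memory. That leaves 1,024 of 2,048 threads resident: 50%, reg-limited.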

Common Issues and Fixes

"reg-limited"

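The kernel uses too many registers per thread, so fewer threads fit on each SM. To reduce register use:

  • Simplify calculations
  • Move per-thread arrays into shared memory
  • Reduce the number of local variables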
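
Another way to act on a reg-limited warning from inside the source is CUDA's __launch_bounds__ qualifier. It tells the compiler the largest block size the kernel will be launched with and, optionally, how many blocks should fit on one SM, so the compiler can cap register use accordingly. The kernel below and the values 256 and 4 are illustrative assumptions; a bound that is too tight can trade the occupancy gain for register spills.

```cuda
#include <cuda_runtime.h>

constexpr int kMaxThreadsPerBlock = 256;  // assumed largest launch size
constexpr int kMinBlocksPerSm = 4;        // assumed residency target per SM

// y[i] = alpha * x[i] + y[i], compiled with a register budget that lets
// kMinBlocksPerSm blocks of kMaxThreadsPerBlock threads share one SM.
__global__ void __launch_bounds__(kMaxThreadsPerBlock, kMinBlocksPerSm)
scaleAndAdd(const float* x, float* y, float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = alpha * x[i] + y[i];
    }
}

// Hypothetical host helper for the kernel above.
inline cudaError_t launchScaleAndAdd(const float* dX, float* dY,
                                     float alpha, int n) {
    // The block size must not exceed the bound declared on the kernel.
    const int blocks = (n + kMaxThreadsPerBlock - 1) / kMaxThreadsPerBlock;
    scaleAndAdd<<<blocks, kMaxThreadsPerBlock>>>(dX, dY, alpha, n);
    return cudaGetLastError();
}
```

After changing the bounds, save the file and compare the Registers and Max occupancy values against the previous ones.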

"smem-limited"

Each block uses too much shared memory, so fewer blocks fit on each SM. To reduce it:

  • Reduce shared array sizes
  • Use dynamic shared memory so the allocation is sized at launch
  • Process data in smaller tiles
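
The last two suggestions can be combined, as in the sketch below: the shared array is declared with extern __shared__ and its size is passed as the third launch parameter, so the tile size can be tuned without recompiling. The tileSum kernel and launchTileSum helper are hypothetical examples, and the reduction assumes the block size is a power of two.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Each block sums one tile of `in` and writes one partial sum.
// Assumes blockDim.x is a power of two.
__global__ void tileSum(const float* in, float* partial, int n) {
    extern __shared__ float tile[];  // size chosen at launch, one float per thread

    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    tile[tid] = (i < n) ? in[i] : 0.0f;  // pad the last tile with zeros
    __syncthreads();

    // Tree reduction within the tile.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();
    }

    if (tid == 0) {
        partial[blockIdx.x] = tile[0];
    }
}

// Hypothetical host helper: a smaller threadsPerBlock means a smaller tile
// and less shared memory per block.
inline cudaError_t launchTileSum(const float* dIn, float* dPartial,
                                 int n, int threadsPerBlock) {
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    const std::size_t smemBytes = threadsPerBlock * sizeof(float);
    tileSum<<<blocks, threadsPerBlock, smemBytes>>>(dIn, dPartial, n);
    return cudaGetLastError();
}
```

Halving threadsPerBlock here halves the shared memory each block requests, at the cost of producing twice as many partial sums.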

Register Spilling

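When CodeLens shows a spill warning, the kernel needs more registers than the compiler is allowed to use, and the overflow is kept in slower local memory. To reduce spilling:

  • Reduce register pressure
  • Split complex kernels into smaller ones
  • Use compiler flags to adjust the per-thread register limit (a cap set too low is itself a cause of spilling)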
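
To cross-check a spill warning from code, the CUDA runtime can report a compiled kernel's resource usage through cudaFuncGetAttributes. The sketch below assumes a placeholder kernel, scaleKernel, and a hypothetical helper, queryResources. A nonzero local-memory size can point to spilled registers, although per-thread arrays also count toward it. The compiler's own report is available with nvcc -Xptxas -v, and -maxrregcount=N sets the per-thread register cap.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Placeholder kernel; substitute the kernel you are investigating.
__global__ void scaleKernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Subset of the attributes the runtime reports for a compiled kernel.
struct KernelResources {
    int registersPerThread;
    std::size_t staticSharedBytes;
    std::size_t localBytesPerThread;  // includes spilled registers, if any
};

// Hypothetical helper: fills `out` for the given __global__ function.
template <typename Kernel>
cudaError_t queryResources(Kernel kernel, KernelResources* out) {
    cudaFuncAttributes attr{};
    cudaError_t err = cudaFuncGetAttributes(&attr, kernel);
    if (err != cudaSuccess) {
        return err;
    }
    out->registersPerThread = attr.numRegs;
    out->staticSharedBytes = attr.sharedSizeBytes;
    out->localBytesPerThread = attr.localSizeBytes;
    return cudaSuccess;
}
```

Calling queryResources(scaleKernel, &res) before and after a change shows whether the register count or local-memory use actually moved.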

Best Practices

  • Keep occupancy above 50%
  • Fix red warnings first
  • Save the file to update metrics
  • Use build settings for PTX analysis

Remember: CodeLens shows estimates. Always benchmark to verify actual performance.
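
One common way to benchmark a kernel is to bracket the launch with CUDA events, as sketched below. The helpers timeOnGpuMs and effectiveBandwidthGBs are hypothetical names introduced for this example; the sketch times work on the default stream and treats 1 GB as 1e9 bytes.

```cuda
#include <cuda_runtime.h>

// Times whatever `launch` enqueues on the default stream.
// Returns elapsed milliseconds, or a negative value if a CUDA call fails.
template <typename Launch>
float timeOnGpuMs(Launch launch) {
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) {
        return -1.0f;
    }
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    cudaEventRecord(start);
    launch();                 // e.g. a lambda containing a kernel launch
    cudaEventRecord(stop);

    float ms = -1.0f;
    if (cudaEventSynchronize(stop) == cudaSuccess) {
        cudaEventElapsedTime(&ms, start, stop);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Effective bandwidth in GB/s for `bytesMoved` bytes (reads plus writes)
// transferred in `ms` milliseconds.
inline double effectiveBandwidthGBs(double bytesMoved, double ms) {
    return bytesMoved / (ms * 1.0e6);
}
```

A call might look like timeOnGpuMs([&] { myKernel<<<blocks, threads>>>(args); }), where myKernel and its arguments are your own. The first launch usually includes one-time startup cost, so it is worth running the kernel once before timing and averaging several timed runs.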