Static Analysis
See CUDA optimization tips without running code
What It Does
CodeLens automatically analyzes your CUDA code and shows:
- Execution time and GPU efficiency
- Register and shared memory usage
- Occupancy calculations
- Performance bottlenecks
How to Use
- Write Your CUDA Code: Open a .cu file and write your kernel
- Save the File: Press Ctrl+S and CodeLens updates automatically
- See Performance Metrics: Look above your kernel for two lines of performance data
- Fix Any Issues: If you see a warning, CodeLens tells you what is limiting performance
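As a starting point for the steps above, here is a minimal sketch of a kernel file. The kernel name `addVectors` is borrowed from the sample display in this guide. The host helper `runAddVectors`, the block size of 256, and the use of unified memory are assumptions made for illustration.

```cuda
#include <cuda_runtime.h>

// Element-wise vector addition: c[i] = a[i] + b[i].
__global__ void addVectors(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard the partially filled last block
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: runs addVectors on n elements and returns true
// if every result matches the sum computed on the CPU.
bool runAddVectors(int n)
{
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *a = nullptr, *b = nullptr, *c = nullptr;

    // Unified memory keeps the sketch short; explicit copies work as well.
    if (cudaMallocManaged(&a, bytes) != cudaSuccess ||
        cudaMallocManaged(&b, bytes) != cudaSuccess ||
        cudaMallocManaged(&c, bytes) != cudaSuccess) {
        cudaFree(a); cudaFree(b); cudaFree(c);
        return false;
    }
    for (int i = 0; i < n; ++i) {
        a[i] = 1.0f * i;
        b[i] = 2.0f * i;
    }

    const int threads = 256;                          // assumed block size
    const int blocks = (n + threads - 1) / threads;   // round up
    addVectors<<<blocks, threads>>>(a, b, c, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) {
        ok = (c[i] == a[i] + b[i]);
    }
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok;
}
```

Saving a file like this should be enough for the two CodeLens lines to appear above `addVectors`.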
What You See
CodeLens Display
Two lines appear above each kernel:
Line 1 - Runtime Performance:
✅ addVectors: 18.5ms • SM:95.3% • Occ:87.5% • Mem:245.8GB/s
Line 2 - Static Analysis:
⚠️ Static: Registers: 64 • Shared: 8KB • Max occupancy: 50% (reg-limited)
Performance Indicators
- Green: Excellent performance
- Yellow: Can be optimized
- Red: Performance issue needs attention
Understanding the Metrics
Runtime Metrics
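- Time: Kernel execution time in milliseconds (ms)
- SM: Streaming multiprocessor (SM) utilization, as a percentage
- Occ: Achieved occupancy, the percentage of an SM's thread capacity that is active
- Mem: Memory bandwidth achieved (GB/s)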
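The Mem figure can be sanity-checked by hand, since effective bandwidth is bytes moved divided by elapsed time. The sketch below uses that standard estimate; it is not necessarily how CodeLens computes its number. Both helper names are hypothetical, and the byte count assumes a vector add that loads two floats and stores one per element.

```cuda
#include <cstddef>

// Effective bandwidth in GB/s.
// 1 GB/s equals 1e6 bytes per millisecond, hence the 1e6 factor.
double effectiveBandwidthGBs(double bytesMoved, double elapsedMs)
{
    return bytesMoved / (elapsedMs * 1.0e6);
}

// Bytes moved by an n-element float vector add:
// two loads and one store per element.
double vectorAddBytes(std::size_t n)
{
    return 3.0 * static_cast<double>(n) * sizeof(float);
}
```

For example, a 2^20-element add that takes 0.1 ms moves 12,582,912 bytes, which works out to roughly 126 GB/s.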
Static Metrics
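- Registers: Number of registers each thread uses
- Shared: Shared memory allocated per block
- Max occupancy: The highest occupancy the kernel's resource usage allows
- Limiting factor: The resource that caps occupancy, shown in parentheses (for example, reg-limited)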
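To see how register and shared memory usage turn into an occupancy ceiling, here is a simplified host-side sketch of the arithmetic. The per-SM limits (65,536 registers, 64 KB of shared memory, 2,048 resident threads) are assumptions that vary by GPU, and the names `SmLimits` and `estimateOccupancy` are hypothetical. The sketch ignores allocation granularity and the per-SM block limit, so treat it as an approximation rather than the calculation CodeLens performs.

```cuda
#include <algorithm>
#include <string>

// Assumed per-SM limits; real values depend on the GPU architecture.
struct SmLimits {
    int registersPerSm   = 65536;  // 32-bit registers
    int sharedBytesPerSm = 65536;  // 64 KB
    int maxThreadsPerSm  = 2048;
};

struct OccupancyEstimate {
    double occupancy;     // resident threads / max threads, in [0, 1]
    std::string limiter;  // "reg-limited", "smem-limited", or "thread-limited"
};

// Counts how many blocks fit on one SM under each resource, then takes the
// smallest count. The resource that produced it is the limiting factor.
OccupancyEstimate estimateOccupancy(int regsPerThread,
                                    int sharedBytesPerBlock,
                                    int threadsPerBlock,
                                    SmLimits sm = SmLimits{})
{
    int byThreads = sm.maxThreadsPerSm / threadsPerBlock;
    int byRegs    = sm.registersPerSm / (regsPerThread * threadsPerBlock);
    int bySmem    = sharedBytesPerBlock > 0
                        ? sm.sharedBytesPerSm / sharedBytesPerBlock
                        : byThreads;  // no shared memory, no extra limit

    int blocks = std::min({byThreads, byRegs, bySmem});

    std::string limiter = "thread-limited";
    if (blocks < byThreads) {
        limiter = (blocks == byRegs) ? "reg-limited" : "smem-limited";
    }

    double occupancy =
        static_cast<double>(blocks * threadsPerBlock) / sm.maxThreadsPerSm;
    return {occupancy, limiter};
}
```

Under these assumptions, `estimateOccupancy(64, 8192, 256)` fits 4 blocks by registers and 8 by shared memory, giving 50% occupancy and a "reg-limited" result, which matches the sample static line shown earlier.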
Common Issues and Fixes
"reg-limited"
Too many registers per thread:
- Simplify calculations
- Move per-thread arrays into shared memory
- Reduce local variables
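One way to apply the shared memory suggestion is to stage a per-thread scratch array in shared memory instead of declaring it as a private `float scratch[8]`. The sketch below is illustrative only: the kernel `sumTaps`, the helper `runSumTaps`, and the sizes are hypothetical, and whether the register count actually drops depends on the compiler, so check the Registers figure after saving.

```cuda
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // threads per block (assumed)
constexpr int kTaps  = 8;    // scratch elements per thread (assumed)

// Each thread stages kTaps values in its own column of a shared array,
// then sums them. Input layout: in[k * n + i] is tap k of element i.
__global__ void sumTaps(const float* in, float* out, int n)
{
    // Layout [tap][thread] so neighbouring threads touch neighbouring banks.
    // Size: 8 * 256 * 4 bytes = 8 KB per block.
    __shared__ float scratch[kTaps][kBlock];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // safe: no __syncthreads() is used below

    for (int k = 0; k < kTaps; ++k) {
        scratch[k][threadIdx.x] = in[k * n + i];
    }
    float acc = 0.0f;
    for (int k = 0; k < kTaps; ++k) {
        acc += scratch[k][threadIdx.x];  // each thread reads only its column
    }
    out[i] = acc;
}

// Hypothetical host helper: fills tap k with the value k + 1, so every
// output element should equal 1 + 2 + ... + 8 = 36.
bool runSumTaps(int n)
{
    float *in = nullptr, *out = nullptr;
    size_t inBytes  = static_cast<size_t>(n) * kTaps * sizeof(float);
    size_t outBytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMallocManaged(&in, inBytes) != cudaSuccess ||
        cudaMallocManaged(&out, outBytes) != cudaSuccess) {
        cudaFree(in); cudaFree(out);
        return false;
    }
    for (int k = 0; k < kTaps; ++k)
        for (int i = 0; i < n; ++i)
            in[k * n + i] = static_cast<float>(k + 1);

    int blocks = (n + kBlock - 1) / kBlock;
    sumTaps<<<blocks, kBlock>>>(in, out, n);  // block size must not exceed kBlock
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) ok = (out[i] == 36.0f);
    cudaFree(in); cudaFree(out);
    return ok;
}
```

This trades register pressure for shared memory pressure: the 8 KB per block can itself become the limit, so the kernel may move from reg-limited to smem-limited.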
"smem-limited"
Too much shared memory:
- Reduce shared array sizes
- Use dynamic shared memory so the allocation is sized at launch instead of fixed at compile time
- Process data in smaller tiles
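The last two suggestions can be combined: size the shared buffer at launch and reuse one small tile across the whole input. The following is a sketch under stated assumptions, not a tuned reduction. The names `blockSum` and `runBlockSum` are hypothetical, and the block size must be a power of two.

```cuda
#include <cuda_runtime.h>

// Each block sums its share of the input one tile at a time, reusing a
// single shared buffer of blockDim.x floats (size passed at launch).
__global__ void blockSum(const float* in, float* out, int n)
{
    extern __shared__ float tile[];  // dynamic shared memory

    float acc = 0.0f;  // running total, only meaningful in thread 0
    // The loop bounds depend only on block indices, so every thread in a
    // block runs the same number of iterations and __syncthreads() is safe.
    for (int base = blockIdx.x * blockDim.x; base < n;
         base += gridDim.x * blockDim.x) {
        int i = base + threadIdx.x;
        tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
        __syncthreads();

        // Tree reduction within the tile (blockDim.x must be a power of two).
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s) tile[threadIdx.x] += tile[threadIdx.x + s];
            __syncthreads();
        }
        if (threadIdx.x == 0) acc += tile[0];
        __syncthreads();  // the tile is overwritten in the next iteration
    }
    if (threadIdx.x == 0) out[blockIdx.x] = acc;
}

// Hypothetical host helper: sums n ones and returns the total,
// or -1 if a CUDA call fails.
float runBlockSum(int n, int threads, int blocks)
{
    float *in = nullptr, *out = nullptr;
    if (cudaMallocManaged(&in, static_cast<size_t>(n) * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&out, static_cast<size_t>(blocks) * sizeof(float)) != cudaSuccess) {
        cudaFree(in); cudaFree(out);
        return -1.0f;
    }
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    // Third launch parameter: bytes of dynamic shared memory per block.
    size_t smemBytes = static_cast<size_t>(threads) * sizeof(float);
    blockSum<<<blocks, threads, smemBytes>>>(in, out, n);

    float total = -1.0f;
    if (cudaDeviceSynchronize() == cudaSuccess) {
        total = 0.0f;
        for (int b = 0; b < blocks; ++b) total += out[b];
    }
    cudaFree(in); cudaFree(out);
    return total;
}
```

With 128 threads per block the tile needs only 512 bytes, however large the input is, because the same buffer is reused for every tile.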
Register Spilling
When a kernel needs more registers than it is allowed, values spill to slower local memory and CodeLens shows a spill warning:
- Reduce register pressure
- Split complex kernels
- Tune compiler register limits, since a cap set too low forces spills
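Register limits can be set per kernel with the `__launch_bounds__` qualifier or per file with nvcc options. The sketch below shows both; the kernel `scaleAdd`, the helper `runScaleAdd`, the file name `kernel.cu`, and the specific numbers are illustrative assumptions. A tighter limit can raise occupancy but may also introduce spills, so compare the compiler's report before and after.

```cuda
#include <cuda_runtime.h>

// nvcc options, shown here as comments:
//   nvcc -Xptxas -v kernel.cu         report registers and spill loads/stores per kernel
//   nvcc --maxrregcount=32 kernel.cu  cap registers per thread for the whole file

// Promise the compiler this kernel is never launched with more than 256
// threads per block, and ask it to keep at least 2 blocks resident per SM.
__global__ void __launch_bounds__(256, 2)
scaleAdd(float alpha, const float* x, float* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = alpha * x[i] + y[i];
    }
}

// Hypothetical host helper: computes 3 * 1 + 2 for every element and
// returns true if each result equals 5.
bool runScaleAdd(int n)
{
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *x = nullptr, *y = nullptr;
    if (cudaMallocManaged(&x, bytes) != cudaSuccess ||
        cudaMallocManaged(&y, bytes) != cudaSuccess) {
        cudaFree(x); cudaFree(y);
        return false;
    }
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    const int threads = 256;  // must not exceed the launch bound above
    const int blocks = (n + threads - 1) / threads;
    scaleAdd<<<blocks, threads>>>(3.0f, x, y, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) ok = (y[i] == 5.0f);
    cudaFree(x); cudaFree(y);
    return ok;
}
```

Launching `scaleAdd` with more than 256 threads per block would fail, which is the price of giving the compiler a tighter bound to optimize against.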
Best Practices: Keep occupancy above 50%, fix red warnings first, save the file to refresh the metrics, and use build settings for PTX analysis.
Remember: CodeLens shows estimates. Always benchmark to verify actual performance.
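A common way to benchmark a kernel is to bracket repeated launches with CUDA events. The sketch below times the `addVectors` kernel named in the sample display; the helper `timeAddVectorsMs`, the block size, and the single warm-up launch are assumptions rather than a prescribed methodology.

```cuda
#include <cuda_runtime.h>

__global__ void addVectors(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical helper: returns the average time per launch in milliseconds
// over `iters` launches (iters > 0), or -1 if a CUDA call fails.
float timeAddVectorsMs(int n, int iters)
{
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *a = nullptr, *b = nullptr, *c = nullptr;
    if (cudaMalloc(&a, bytes) != cudaSuccess ||
        cudaMalloc(&b, bytes) != cudaSuccess ||
        cudaMalloc(&c, bytes) != cudaSuccess) {
        cudaFree(a); cudaFree(b); cudaFree(c);
        return -1.0f;
    }
    cudaMemset(a, 0, bytes);  // the values do not matter for timing
    cudaMemset(b, 0, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    addVectors<<<blocks, threads>>>(a, b, c, n);  // warm-up, not timed

    cudaEventRecord(start);
    for (int it = 0; it < iters; ++it) {
        addVectors<<<blocks, threads>>>(a, b, c, n);
    }
    cudaEventRecord(stop);

    // Wait until the stop event has actually been reached on the GPU.
    bool ok = (cudaEventSynchronize(stop) == cudaSuccess);
    float totalMs = 0.0f;
    if (ok) ok = (cudaEventElapsedTime(&totalMs, start, stop) == cudaSuccess);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok ? totalMs / iters : -1.0f;
}
```

Averaging over several launches smooths out launch overhead and clock ramp-up, which makes the result easier to compare against the Time value CodeLens reports.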