See what the GPU actually executes. Inspect PTX and SASS assembly with source correlation, analyze register usage, and detect race conditions, all without leaving your editor.
Your C++ source isn't what runs on the GPU. PTX and SASS show what the compiler actually generated, and the result is often surprising.
Race conditions and missing barriers cause non-deterministic bugs. Static analysis can flag them before they reach production.
High register pressure kills occupancy. See exactly how many registers each thread uses and where the compiler spills.
View the intermediate PTX and final SASS assembly side by side with your source code. Click any source line to see the corresponding assembly.
// Your CUDA code
float sum = a[idx] + b[idx];

// Generated SASS (SM_86)
LDG.E.SYS R4, [R2]    ; Load a[idx]
LDG.E.SYS R5, [R3]    ; Load b[idx]
FADD R6, R4, R5       ; Add floats
STG.E.SYS [R7], R6    ; Store result
Static analysis detects potential race conditions in shared memory and missing synchronization barriers before you even run the code.
__shared__ float smem[256];
smem[tid] = input[tid];
// ⚠ Missing __syncthreads()
output[tid] = smem[tid + 1]; // Race!
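To see why the missing barrier matters, here is a minimal host-side C++ sketch that models one thread block on the CPU. It is not GPU code and not how the analyzer works. The function name `neighbour_copy`, the `use_barrier` flag, and the `-1` sentinel for unwritten shared memory are all hypothetical, chosen for illustration. The no-barrier branch shows just one possible ordering (each thread runs to completion before the next one starts); real hardware interleaves at warp granularity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of one thread block (illustrative only, not GPU code).
// Each "thread" tid stores input[tid] into a shared buffer, then reads its
// right neighbour. use_barrier mimics a __syncthreads() between store and load.
std::vector<float> neighbour_copy(const std::vector<float>& input, bool use_barrier) {
    const std::size_t n = input.size();
    assert(n >= 2);
    std::vector<float> smem(n, -1.0f);       // -1 stands in for unwritten shared memory
    std::vector<float> output(n - 1, 0.0f);  // the last thread has no right neighbour

    if (use_barrier) {
        // Every thread finishes its store before any thread performs its load.
        for (std::size_t tid = 0; tid < n; ++tid) smem[tid] = input[tid];
        for (std::size_t tid = 0; tid + 1 < n; ++tid) output[tid] = smem[tid + 1];
    } else {
        // One possible ordering without a barrier: thread tid runs to
        // completion before thread tid + 1 has stored anything.
        for (std::size_t tid = 0; tid < n; ++tid) {
            smem[tid] = input[tid];
            if (tid + 1 < n) output[tid] = smem[tid + 1];  // slot not yet written
        }
    }
    return output;
}
```

With the barrier, every thread sees its neighbour's stored value. Without it, this particular ordering makes every thread read a slot that has not been written yet. In the CUDA snippet, the corresponding fix is a `__syncthreads()` call between the store to `smem` and the load from it.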
High register usage limits how many threads can run concurrently. See exactly how many registers each function uses and where the compiler spills to local memory.
Register Usage
├─ matmul_naive   48 regs → 50% occupancy
├─ matmul_tiled   32 regs → 75% occupancy
└─ matmul_wmma    24 regs → 100% occupancy

Spills detected: 0
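The link between register count and occupancy can be sketched with a little arithmetic. The C++ below is a rough estimate, not the tool's own calculation. The `SmLimits` struct and its default values are assumptions modeled on an SM_86-class GPU (65,536 registers and 48 resident warps per SM, with per-thread registers rounded up to a multiple of 8). It gives only a register-limited upper bound: reported occupancy can be lower when block size or shared memory is the tighter limit.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-SM limits, assumed here to resemble an SM_86-class GPU.
// Check the real values for your device before relying on them.
struct SmLimits {
    int registers_per_sm = 65536;   // 32-bit registers available per SM
    int max_warps_per_sm = 48;      // 1536 resident threads / 32 threads per warp
    int warp_size = 32;
    int reg_alloc_granularity = 8;  // per-thread count is rounded up to a multiple of this
};

// Upper bound on occupancy (resident warps / maximum warps) when registers
// are the only limiting resource.
double register_limited_occupancy(int regs_per_thread, const SmLimits& sm = SmLimits{}) {
    assert(regs_per_thread > 0);
    const int g = sm.reg_alloc_granularity;
    const int rounded = (regs_per_thread + g - 1) / g * g;  // round up to granularity
    const int regs_per_warp = rounded * sm.warp_size;
    const int warps_by_regs = sm.registers_per_sm / regs_per_warp;
    const int resident = std::min(warps_by_regs, sm.max_warps_per_sm);
    return static_cast<double>(resident) / sm.max_warps_per_sm;
}
```

Under these assumptions, a kernel using 128 registers per thread fits only 16 of 48 warps per SM, which is why trimming register usage or avoiding spills can raise the number of threads in flight.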
Code Analysis is free with RightNow.