Write GPU kernels in CUDA, Triton, CUTE, or TileLang with full IDE support
RightNow AI supports four GPU kernel languages out of the box, giving you the flexibility to choose the right tool for your workflow:
- CUDA: Native C++ for maximum control. Files: `.cu`, `.cuh`
- Triton: Python DSL for rapid prototyping. Files: `.py` with `@triton.jit`
- CUTE: NVIDIA CUTLASS templates for production GEMM. Files: `.cu`, `.cuh` with `cute::`
- TileLang: Tile-based abstractions for readable code. Files: `.py` with `@T.Kernel`
OpenAI's Triton brings Python-level productivity to GPU programming. RightNow AI provides full IDE support:
- `@triton.jit` decorators and `tl.*` functions (`tl.load`, `tl.store`, `tl.dot`, etc.)

Real-time metrics displayed in CodeLens without execution:

- `num_warps` - Number of warps per block
- `num_stages` - Software pipelining stages
- `BLOCK_SIZE` - Block size constant

Requirements: Triton must be installed in your Python environment (`pip install triton`)
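As an illustration of how values such as `num_warps`, `num_stages`, and `BLOCK_SIZE` can be read without running anything, the sketch below parses a Triton file with Python's standard `ast` module. It is not RightNow AI's implementation. The kernel `add_kernel`, the `launch` helper, and the functions `is_triton_jit`, `find_jit_kernels`, and `find_launch_params` are hypothetical names introduced for this example, and only literal keyword arguments at the launch site are recognized.

```python
import ast

# Example Triton source, kept as a string so it is parsed and never executed.
SOURCE = '''
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

def launch(x, y, out, n):
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024, num_warps=4, num_stages=2)
'''

# Launch parameters this sketch looks for (taken from the list above).
TRACKED = ("num_warps", "num_stages", "BLOCK_SIZE")


def is_triton_jit(decorator):
    """True for a decorator written as @triton.jit or @triton.jit(...)."""
    target = decorator.func if isinstance(decorator, ast.Call) else decorator
    return ast.unparse(target) == "triton.jit"


def find_jit_kernels(tree):
    """Names of all functions decorated with @triton.jit."""
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
        and any(is_triton_jit(d) for d in node.decorator_list)
    ]


def find_launch_params(tree, kernel_names):
    """Collect literal keyword arguments from calls shaped like kernel[grid](...)."""
    found = {}
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Subscript)):
            continue
        base = node.func.value
        if not (isinstance(base, ast.Name) and base.id in kernel_names):
            continue
        params = found.setdefault(base.id, {})
        for kw in node.keywords:
            if kw.arg in TRACKED and isinstance(kw.value, ast.Constant):
                params[kw.arg] = kw.value.value
    return found


tree = ast.parse(SOURCE)
kernels = find_jit_kernels(tree)
params = find_launch_params(tree, kernels)
print(kernels)
print(params)
```

Because the source is only parsed, this works on a machine without a GPU or a Triton installation. Values computed at runtime (for example from an autotuner config) would not be visible to this approach.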
NVIDIA's CUTLASS/CUTE provides production-grade templates for matrix operations with Tensor Core support:
- `cute::` namespace and template syntax
- `__global__` kernel definitions
- `TILE_M`, `TILE_N`, `TILE_K` - Tile dimensions

RightNow AI automatically detects CUTLASS from:

- Environment variables: `CUTLASS_PATH`, `CUTLASS_HOME`
- Common install paths: `C:\cutlass`, `/usr/local/cutlass`
- Project directory: `./cutlass/include`

Microsoft's TileLang provides clean, tile-based abstractions for GPU programming:

- `@T.Kernel` and `@tl.kernel` decorators
- `block_size` - Thread block size
- `tile_size` - Tile dimensions

Requirements: TileLang must be installed in your Python environment (`pip install tilelang`)
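The requirements and CUTLASS lookup locations described above can be checked by hand before opening a project. The following is a minimal sketch, not RightNow AI's detection code: the helper names `is_installed` and `find_cutlass` are hypothetical, and the search order (environment variables first, then common paths, then the project directory) is an assumption, since the locations are listed without a stated precedence.

```python
import importlib.util
import os
from pathlib import Path

# Locations taken from the CUTLASS detection list; the order is an assumption.
ENV_VARS = ("CUTLASS_PATH", "CUTLASS_HOME")
COMMON_PATHS = (r"C:\cutlass", "/usr/local/cutlass")
PROJECT_PATH = "cutlass/include"


def is_installed(module_name):
    """True if the top-level package can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def find_cutlass(project_root=".", env=None):
    """Return the first candidate CUTLASS directory that exists, or None."""
    env = os.environ if env is None else env
    candidates = [env[name] for name in ENV_VARS if env.get(name)]
    candidates += list(COMMON_PATHS)
    candidates.append(str(Path(project_root) / PROJECT_PATH))
    for candidate in candidates:
        if Path(candidate).is_dir():
            return candidate
    return None


for name in ("triton", "tilelang"):
    print(name, "installed" if is_installed(name) else "missing")
print("CUTLASS:", find_cutlass() or "not found")
```

Using `find_spec` rather than a real `import` keeps the check cheap and avoids initializing a GPU runtime just to confirm that a package is present.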
| Feature | CUDA | Triton | CUTE | TileLang |
|---|---|---|---|---|
| Syntax Highlighting | Full | Full | Full | Full |
| Hover Documentation | Full | 100+ | 25+ | 35+ |
| Go-to-Definition | Full | Full | Full | Full |
| Static Analysis | Full | Full | Full | Full |
| NCU Profiling | Full | Full | Full | Full |
| Benchmarking | Full | Full | Full | Full |
All four languages support real NCU (NVIDIA Nsight Compute) profiling directly from the editor:
- CUDA and CUTE (`.cu`, `.cuh`): compiled with `nvcc` and profiled with NCU
- Triton and TileLang (`.py`): profiled with `ncu --target-processes all`

Learn more: See Real-Time Profiling for detailed profiling documentation and Static Analysis for instant metrics without execution.
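The `ncu --target-processes all` invocation mentioned above could be assembled for a Python kernel script roughly as follows. This is a sketch under stated assumptions, not the command RightNow AI actually runs: the function `build_ncu_command`, the script name `bench_kernel.py`, and its `--size` argument are hypothetical, and only the `--target-processes all` flag comes from the text above.

```python
import shlex
import sys


def build_ncu_command(script, script_args=(), python=sys.executable):
    """Build an argv list that runs a Python kernel script under ncu.

    --target-processes all asks ncu to profile child processes as well as
    the process it launches directly.
    """
    return ["ncu", "--target-processes", "all", python, script, *script_args]


# Hypothetical script and argument, for illustration only.
cmd = build_ncu_command("bench_kernel.py", ["--size", "4096"], python="python")
print(shlex.join(cmd))
# prints: ncu --target-processes all python bench_kernel.py --size 4096
```

On a machine with Nsight Compute installed, the resulting list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.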