Latest updates and improvements to RightNow AI

CLI Swarm Agent for generating production-ready CUDA/Triton kernels. Up to 5x faster than torch.compile(), with a 97.6% correctness rate.
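As a rough illustration of what figures like these mean, the sketch below computes a speedup ratio and a correctness rate. It is not RightNow AI's evaluation code. It assumes that "correctness" means a generated kernel's output matches a reference output elementwise within a tolerance. The function names (`outputs_match`, `correctness_rate`, `speedup`) and the tolerance values are hypothetical.

```python
import math


def outputs_match(candidate, reference, rel_tol=1e-4, abs_tol=1e-6):
    """Return True if two flat output sequences agree elementwise.

    The tolerances are illustrative assumptions, not values from RightNow AI.
    """
    if len(candidate) != len(reference):
        return False
    return all(
        math.isclose(c, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, r in zip(candidate, reference)
    )


def correctness_rate(results):
    """Fraction of (candidate_output, reference_output) pairs that match."""
    if not results:
        return 0.0
    passed = sum(outputs_match(c, r) for c, r in results)
    return passed / len(results)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Hypothetical data: one matching kernel output, one mismatching.
    runs = [
        ([1.0, 2.0], [1.0, 2.0]),
        ([1.0, 2.0], [1.0, 2.5]),
    ]
    print(f"correctness rate: {correctness_rate(runs):.1%}")
    print(f"speedup: {speedup(10.0, 2.0):.1f}x")
```

In practice the baseline timing would come from the torch.compile() version of the same operation, measured on the same GPU with warm-up runs excluded.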

Profile, benchmark, and emulate PyTorch kernels directly in the editor. Same workflow as CUDA, Triton, TileLang, and CUTE.

RightNow AI now supports Triton, TileLang, and CUTE alongside native CUDA, with intelligent documentation retrieval that understands your GPU and code context.

Full macOS compatibility with Metal GPU detection for Apple Silicon. Multi-GPU profiling for side-by-side comparisons between GPUs.

Cycle-accurate GPU emulation with 96-98% accuracy. No physical GPU required. AI automatically iterates and optimizes kernels to peak performance.

Code anywhere, run everywhere. Connect to remote GPUs over SSH or through cloud providers.

Profile any CUDA kernel without a physical GPU. Choose from 86+ GPU architectures.

Full benchmarking terminal with visual kernel comparisons and instant CodeLens insights.

Comprehensive benchmarking with execution time, memory bandwidth, occupancy, and multi-GPU support.
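For readers unfamiliar with these metrics, here is a minimal sketch of how effective memory bandwidth and occupancy are conventionally derived from raw measurements. It uses the standard textbook definitions, not RightNow AI's internal implementation. The `KernelRun` type, the function names, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KernelRun:
    """Raw measurements for one kernel launch (hypothetical container)."""
    elapsed_ms: float    # measured execution time in milliseconds
    bytes_read: int      # bytes loaded from global memory
    bytes_written: int   # bytes stored to global memory


def effective_bandwidth_gbs(run: KernelRun) -> float:
    """Effective bandwidth in GB/s: total bytes moved divided by elapsed time."""
    seconds = run.elapsed_ms / 1e3
    return (run.bytes_read + run.bytes_written) / seconds / 1e9


def occupancy(active_warps_per_sm: int, max_warps_per_sm: int) -> float:
    """Occupancy: active warps per SM as a fraction of the hardware maximum."""
    return active_warps_per_sm / max_warps_per_sm


if __name__ == "__main__":
    # Example: float32 vector add over 2**24 elements (two inputs, one output).
    n = 2 ** 24
    run = KernelRun(elapsed_ms=0.5, bytes_read=2 * 4 * n, bytes_written=4 * n)
    print(f"effective bandwidth: {effective_bandwidth_gbs(run):.1f} GB/s")
    print(f"occupancy: {occupancy(32, 64):.0%}")
```

Comparing the effective bandwidth against the GPU's peak memory bandwidth gives a quick sense of whether a memory-bound kernel still has headroom.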

Support for 15+ AI providers including local models.

First public release of RightNow AI.