RightNow Research Lab
Enabling Model-Hardware Co-Design at Scale
Abstract
We build the tools that close the gap between AI models and the hardware they run on, giving engineers and enterprises the infrastructure to go from model to optimized production inference, faster.
1. Products
We maintain infrastructure and developer tools for GPU programming, inference, and deployment.
RightNow Editor
CUDA-focused editor with profiling, PTX/SASS inspection, emulation, and remote GPU workflows.

Forge
Inference optimization tooling covering kernels, runtimes, correctness checks, and benchmarked kernel variants.
2. Research
Work on sparse attention, dynamic model adaptation, world models, token-level inference, and automated GPU-kernel search. Papers are available on arXiv.
3. Open Source
We maintain open-source systems for agent infrastructure, edge inference, and automated kernel optimization.
OpenFang
Rust system for low-level agent execution with direct access to operating-system and GPU interfaces.
PicoLM
Small C inference runtime for memory-constrained edge devices.
AutoKernel
Kernel-search system that profiles PyTorch models, tests Triton variants, and uses measured bottlenecks to choose follow-up experiments.
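The profile-then-experiment loop described above can be sketched in a few lines. This is an illustrative, simplified sketch only, not AutoKernel's actual API: all function and parameter names here are hypothetical, and real timing would use GPU events rather than wall-clock timers.

```python
# Hypothetical sketch of a kernel-search loop: profile each op, pick the
# measured bottleneck, benchmark candidate variants, keep the fastest.
# Names (profile, search, ops, variants) are illustrative, not AutoKernel's API.
import time

def profile(ops):
    """Measure wall-clock time of each named op; returns {name: seconds}."""
    timings = {}
    for name, fn in ops.items():
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return timings

def search(ops, variants, rounds=1):
    """Greedy search: each round, target the slowest op and swap in the
    fastest measured variant, if one beats the current implementation."""
    chosen = {}
    for _ in range(rounds):
        timings = profile(ops)
        bottleneck = max(timings, key=timings.get)  # measured bottleneck
        best_fn, best_t = ops[bottleneck], timings[bottleneck]
        for candidate in variants.get(bottleneck, []):
            start = time.perf_counter()
            candidate()
            elapsed = time.perf_counter() - start
            if elapsed < best_t:
                best_fn, best_t = candidate, elapsed
        ops[bottleneck] = best_fn        # adopt the winner for later rounds
        chosen[bottleneck] = best_fn.__name__
    return chosen
```

In a real system the measured bottleneck would also steer which follow-up variants get generated; here the candidate list is fixed for brevity.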
