Carnegie Mellon's Albert Gu and Princeton's Tri Dao have released Mamba-3, the latest iteration of their State Space Model architecture designed to challenge the computational limitations of Transformer-based AI models. The model is available under the Apache 2.0 open-source license, permitting commercial use, and a technical paper has been published on arXiv.org.

Mamba-3 follows what the researchers call an "inference-first" design philosophy, shifting focus from training efficiency to the "cold GPU" problem: during autoregressive decoding, modern hardware sits largely idle, waiting on memory movement rather than performing computation. This contrasts with Mamba-2, which concentrated on breaking pretraining bottlenecks.
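The "cold GPU" problem can be illustrated with a back-of-envelope roofline calculation; the sketch below uses hypothetical hardware numbers (not figures from the paper) to show why per-token decoding is memory-bound rather than compute-bound.

```python
# Illustrative sketch (assumed, simplified numbers): during decoding, each
# step is roughly a matrix-vector product over the model's weights, so very
# few FLOPs are performed per byte of memory traffic.

def arithmetic_intensity(rows: int, cols: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte moved for one matrix-vector product (fp16 weights)."""
    flops = 2 * rows * cols                      # one multiply + one add per weight
    bytes_moved = rows * cols * bytes_per_elem   # weight traffic dominates
    return flops / bytes_moved

# A 4096x4096 projection: 2 FLOPs per weight / 2 bytes per weight = 1.0
decode_intensity = arithmetic_intensity(4096, 4096)

# Hypothetical accelerator: ~1000 TFLOP/s compute vs ~3 TB/s memory bandwidth
# gives a balance point of ~333 FLOPs per byte. Decoding sits far below it,
# so the GPU spends most of each step waiting on memory -- the "cold GPU".
hardware_balance = 1000e12 / 3e12

print(decode_intensity)                      # 1.0
print(decode_intensity < hardware_balance)   # True: memory-bound
```

The specific matrix size and hardware figures are assumptions for illustration; the qualitative conclusion, that matrix-vector decoding falls far below the hardware's compute-to-bandwidth balance point, is the point the "inference-first" framing addresses.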