Qualcomm has unveiled a new near-memory AI architecture called HBC (Hybrid Bonding Cache), along with the AI250 and AI350 accelerators, claiming a breakthrough in addressing the memory wall that has long constrained AI performance. The company touts a 6x improvement in bandwidth-per-watt compared to traditional HBM (High Bandwidth Memory) solutions, alongside a 200x increase in capacity relative to on-chip SRAM.

This architectural shift could reshape how AI inference workloads are handled, particularly in edge and data center deployments where power and memory bandwidth are the critical constraints. By placing compute logic closer to memory, Qualcomm aims to reduce the energy spent moving data, a major contributor to AI chip power consumption.

Details on the AI250 and AI350 accelerators remain limited: Qualcomm has offered only the headline performance claims, without disclosing precise specifications or benchmark data. The company has also not announced a commercialization timeline or pricing for the new architecture and accelerators.

The announcement positions Qualcomm against established AI chip leaders like NVIDIA and emerging competitors in the near-memory computing space. If these claims hold up under real-world testing, the technology could offer significant efficiency gains for AI inference, though independent verification is needed.

Industry analysts caution that bandwidth-per-watt improvements in controlled demonstrations do not always translate to system-level gains, and the lack of concrete benchmark data raises questions about the architecture's practical performance in production environments.
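
To see why a component-level figure can shrink at the system level, here is a rough Amdahl-style sketch in Python. It is an illustration under stated assumptions, not a model of Qualcomm's hardware: the function name `system_power_ratio` and the memory power shares (20%, 50%, 80%) are hypothetical, and only the 6x bandwidth-per-watt figure comes from the announcement. It assumes that energy per bit moved scales as the inverse of bandwidth-per-watt and that the rest of the chip's power is unchanged.

```python
def system_power_ratio(memory_fraction: float, bw_per_watt_gain: float) -> float:
    """Estimate new system power as a fraction of the old system power.

    memory_fraction: share of total power spent on memory traffic
                     (hypothetical input, not a published figure).
    bw_per_watt_gain: claimed bandwidth-per-watt improvement (e.g. 6.0).

    Assumes only the memory share shrinks, by a factor of 1 / gain,
    while compute and other power stay constant.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    if bw_per_watt_gain <= 0:
        raise ValueError("bw_per_watt_gain must be positive")
    return (1.0 - memory_fraction) + memory_fraction / bw_per_watt_gain


if __name__ == "__main__":
    claimed_gain = 6.0  # figure quoted in the announcement
    for share in (0.2, 0.5, 0.8):  # hypothetical memory power shares
        ratio = system_power_ratio(share, claimed_gain)
        print(
            f"memory share {share:.0%}: system power x{ratio:.2f}, "
            f"system efficiency gain {1 / ratio:.2f}x"
        )
```

Under these assumptions, a chip that spends half its power on memory traffic would see roughly a 1.7x system-level efficiency gain from a 6x memory-level improvement, which is why the workload's actual memory share matters as much as the headline number.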