Nvidia has expanded Vera Rubin from a next-generation chip platform into a broader infrastructure launch, describing it as seven chips and five racks assembled into "one giant supercomputer" for pretraining, post-training, test-time scaling and real-time agentic inference, meaning AI systems that can reason and take multistep actions.
In practical terms, Nvidia is pitching Rubin less as a standalone chip and more as a packaged AI system that combines compute, networking, storage and control hardware.
Announced at Nvidia's AI conference in San Jose, the platform marks a shift away from discrete servers toward integrated rack-scale and pod-scale AI systems.
Seven chips, five racks, one platform
Nvidia said the Vera Rubin platform combines seven chips: the Vera CPU, the Rubin GPU, the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 DPU (a processor for managing data movement and infrastructure tasks), the Spectrum-6 Ethernet switch and the newly integrated Groq 3 LPU, an inference-focused processor.
Those parts are deployed across five rack types: Vera Rubin NVL72 GPU racks, Vera CPU racks, Groq 3 LPX inference racks, BlueField-4 STX storage racks and Spectrum-6 SPX Ethernet racks, with partner availability starting in the second half of 2026.
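For orientation, here is a minimal Python sketch of how the announced chips map onto the announced rack types. The part and rack names come from Nvidia's announcement; the groupings themselves are an illustrative reading of the lineup, not a published bill of materials.

```python
# Illustrative only: mapping the seven announced chips to the five
# announced rack types. Groupings reflect this article's description
# of the platform, not an Nvidia-published specification.
VERA_RUBIN_PLATFORM = {
    "Vera Rubin NVL72 GPU rack":    ["Rubin GPU", "Vera CPU", "NVLink 6 switch"],
    "Vera CPU rack":                ["Vera CPU"],
    "Groq 3 LPX inference rack":    ["Groq 3 LPU"],
    "BlueField-4 STX storage rack": ["BlueField-4 DPU"],
    "Spectrum-6 SPX Ethernet rack": ["Spectrum-6 Ethernet switch", "ConnectX-9 SuperNIC"],
}

if __name__ == "__main__":
    chips = {chip for parts in VERA_RUBIN_PLATFORM.values() for chip in parts}
    # Prints: 5 rack types, 7 distinct chips
    print(f"{len(VERA_RUBIN_PLATFORM)} rack types, {len(chips)} distinct chips")
```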
The AI factory rationale and what Vera was built for
Nvidia explains that AI factories are becoming always-on systems in which power, silicon and data are continuously turned into "intelligence at scale," and argues that performance now depends as much on data movement, memory, utilization and reliability as on peak GPU compute.
It says Vera was built for data movement and agentic processing in AI factories, with 88 custom Olympus cores, 176 threads, up to 1.2 TB/s of memory bandwidth and 1.8 TB/s of NVLink-C2C bandwidth.
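A quick back-of-the-envelope reading of those stated figures (derived arithmetic, not additional Nvidia disclosures): 176 threads across 88 cores implies two-way multithreading, and the chip-to-chip link is half again faster than the memory system.

```python
# Derived arithmetic on Nvidia's stated Vera CPU figures; no values
# here beyond what the announcement itself lists.
cores = 88
threads = 176
mem_bw_tbs = 1.2      # TB/s, stated memory bandwidth
nvlink_c2c_tbs = 1.8  # TB/s, stated NVLink-C2C bandwidth

print(f"threads per core: {threads // cores}")                       # 2-way SMT
print(f"memory bandwidth per core: {mem_bw_tbs * 1000 / cores:.1f} GB/s")
print(f"NVLink-C2C vs memory bandwidth: {nvlink_c2c_tbs / mem_bw_tbs:.2f}x")
```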
Performance claims and the power management additions
Vera Rubin NVL72 integrates 72 Rubin GPUs and 36 Vera CPUs, while the separate Vera CPU rack integrates 256 liquid-cooled Vera CPUs for reinforcement learning and agentic AI environments used to test and validate model outputs.
Nvidia also said the NVL72 rack can train large mixture-of-experts models with one-fourth the number of GPUs required on Blackwell, and can deliver up to 10x higher inference throughput per watt at one-tenth the cost per token. The Vera CPU, the company added, delivers results twice as efficiently and 50% faster than traditional CPUs.
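To make those multipliers concrete, the sketch below applies Nvidia's stated ratios to an assumed Blackwell baseline. The rack composition (72 GPUs, 36 CPUs) is from the announcement; the 10,000-GPU baseline and the normalized efficiency figures are hypothetical placeholders, not Nvidia data.

```python
# Applying Nvidia's stated ratios to an assumed, illustrative baseline.
# Baseline values are hypothetical placeholders, not Nvidia figures.
nvl72_gpus, nvl72_cpus = 72, 36
print(f"NVL72 GPU:CPU ratio: {nvl72_gpus // nvl72_cpus}:1")

blackwell_gpus_for_moe = 10_000            # assumed baseline GPU count
rubin_gpus = blackwell_gpus_for_moe // 4   # "one-fourth the number of GPUs"
print(f"GPUs for the same MoE run: {rubin_gpus:,} vs {blackwell_gpus_for_moe:,}")

baseline_tps_per_watt = 1.0                # normalized baseline efficiency
baseline_cost_per_token = 1.0              # normalized baseline cost
print(f"inference throughput/W: {baseline_tps_per_watt * 10:.0f}x baseline")
print(f"cost per token: {baseline_cost_per_token / 10:.1f}x baseline")
```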
Nvidia said BlueField-4 powers its Inference Context Memory Storage Platform for long-context, multi-turn agentic AI, with up to 5x greater power efficiency than traditional storage.
Among the power management additions, Nvidia said DSX Max-Q can enable 30% more AI infrastructure inside a fixed-power data center, while DSX Flex is designed to make AI factories grid-flexible and unlock stranded grid capacity, which the company quantified at up to 100 gigawatts.
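A rough illustration of what the Max-Q claim means in practice, assuming a hypothetical facility size (the 100 MW budget below is a placeholder, not a figure from the announcement):

```python
# Rough illustration of the DSX Max-Q claim: 30% more AI infrastructure
# within a fixed power envelope. The facility size is an assumed placeholder.
facility_power_mw = 100                    # hypothetical fixed power budget
with_max_q_mw = facility_power_mw * 1.30   # Nvidia's stated 30% uplift

print(f"fixed power budget: {facility_power_mw} MW")
print(f"deployable AI infrastructure with Max-Q: {with_max_q_mw:.0f} MW-equivalent")
```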
Where the Vera CPU fits alongside AMD and Intel host processors
The Vera CPU's role becomes clearer in the broader market context. AMD says EPYC 9005 is designed to act as a host CPU for GPU-accelerated AI systems, handling orchestration, synchronization and data preparation around large GPU workloads.
Intel says Xeon 6 with P-cores is a host CPU option for AI-accelerated systems, and in May 2025 said one Xeon 6 part was already serving as the host CPU in Nvidia’s DGX B300.
Nvidia’s own MGX materials also say its modular architecture supports Vera, x86 and other Arm CPU servers, indicating that Nvidia continues to support multiple CPU architectures across deployments.
The revenue context the launch sits inside
The company reported fourth-quarter fiscal 2026 data center revenue of $62.3 billion, up 75% from a year earlier, and said full-year data center revenue rose 68% to $193.7 billion.
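For context, the year-ago figures implied by those stated growth rates can be backed out directly (derived arithmetic, not reported numbers):

```python
# Year-ago figures implied by Nvidia's stated revenue and growth rates.
q4_dc_revenue_b = 62.3   # Q4 FY2026 data center revenue, $B
q4_growth = 0.75         # up 75% year over year
fy_dc_revenue_b = 193.7  # full-year FY2026 data center revenue, $B
fy_growth = 0.68         # up 68% year over year

print(f"implied Q4 FY2025 data center revenue: ${q4_dc_revenue_b / (1 + q4_growth):.1f}B")
print(f"implied FY2025 data center revenue: ${fy_dc_revenue_b / (1 + fy_growth):.1f}B")
```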