NVIDIA Vera Is Here — The First CPU Built From the Ground Up for Agentic AI

NVIDIA Vera Is Here — The First CPU Built From the Ground Up for Agentic AI
📑 Table of Contents

NVIDIA Vera Is Here — The First CPU Built From the Ground Up for Agentic AI

“Agentic AI is creating a new CPU moment in the AI factory.”

That’s how Ian Buck, NVIDIA’s Vice President and one of the company’s longest-tenured leaders, described the moment he personally hand-delivered the first NVIDIA Vera CPUs to the world’s most important AI labs.

The deliveries happened over two days in mid-May 2026. On Friday, Buck arrived at Anthropic in San Francisco’s Hayes Valley, OpenAI in Mission Bay, and SpaceXAI in Palo Alto. On Monday, he reached Oracle Cloud Infrastructure in Santa Clara. At each stop, he carried a bare Vera CPU motherboard as a guide — and the AI world hasn’t stopped talking about it since.

This is not an incremental hardware refresh. Vera is NVIDIA’s first custom CPU — a ground-up architecture designed for the unique demands of agentic AI, orchestration, and autonomous reasoning. It signals a pivotal shift in how the industry thinks about compute for the agent era.


What Makes Vera Different?

Until now, the AI industry’s hardware story has been almost entirely about GPUs — massive parallel processors optimized for training and inference. But agentic AI introduces a fundamentally different workload profile:

Requirement Traditional AI Agentic AI
Primary task Matrix multiplications Orchestration, tool-calling, code execution
Memory pattern Large, predictable batches Sporadic, branching, multi-step
Latency sensitivity Moderate High (agents waiting for responses)
Context management Fixed prompt windows Long, evolving state across tool calls
Reinforcement learning Offline training Online, continuous feedback loops

Vera tackles these challenges head-on. Its 88 custom NVIDIA Olympus cores deliver 50% better per-core performance under full load compared to traditional server CPUs, while drawing significantly less power. The 1.2 TB/s memory bandwidth — more than 3x what typical server CPUs offer — means agents can shuttle massive context windows between memory and compute without bottlenecks.

“Vera is purpose-built to keep that work moving at scale” — Ian Buck, NVIDIA

The Silicon Behind the Agent Revolution

Vera is not just a standalone CPU — it’s the host processor for the upcoming Vera Rubin NVL72 system, where it pairs with NVIDIA’s next-generation Rubin GPUs via NVLink-C2C interconnect. This unified memory architecture means the CPU and GPU share a coherent memory space, eliminating the data-transfer overhead that plagues traditional CPU+GPU setups.

The practical impact is immediate for agentic workloads:

  • Orchestration loops — Agents that plan, execute, observe, and re-plan benefit from Vera’s ability to keep thousands of parallel decision threads alive simultaneously
  • Tool-calling — Generating and executing Python code, making API calls, and parsing JSON responses are CPU-bound operations that Vera accelerates dramatically
  • Reinforcement learning — SpaceXAI is already evaluating Vera for RL-based agent training pipelines, where the CPU handles simulation rollouts and reward computation while GPUs handle model inference
  • Long-context state management — Agents maintaining multi-hour conversations or iterating through large codebases can keep full context in the 1.2 TB/s memory path

The Handoff That Made Headlines

The delivery itself became a minor Silicon Valley legend. At Anthropic, James Bradbury (Head of Compute) received the system and immediately began discussing how Vera’s architecture maps to Claude’s agentic workloads. Buck’s bare motherboard tour included a detailed walkthrough of the cache hierarchy, memory controllers, and I/O fabric.

At OpenAI, Buck met with Sachin Katti (Head of Compute Infrastructure) on an open-air balcony. Katti’s team dove into system internals — at one point, Buck reportedly pulled out a screwdriver for a live teardown.

Elon Musk at SpaceXAI asked detailed questions about core counts, memory layout, and cooling solutions. SpaceXAI runs massive agent-based simulation pipelines for autonomous systems, and Vera’s combination of CPU throughput and energy efficiency is a direct match.

“Scaling compute is an important accelerant for the growth of models. We’re excited to see Vera emerge as a promising part of the ecosystem when solving for agentic workloads.” — James Bradbury, Anthropic

Oracle Cloud: The First Hyperscale Home for Vera

The most consequential delivery might have been the last. Oracle Cloud Infrastructure announced plans to deploy hundreds of thousands of NVIDIA Vera CPUs beginning in 2026, making OCI the first hyperscale cloud provider to offer Vera-based agentic AI infrastructure.

Karan Batta, OCI’s Product Management Lead, explained: “Agentic AI demands sustained performance at massive scale. Vera gives us the CPU architecture to deliver that.”

For enterprise customers, this means production-grade agentic AI infrastructure becomes available as a cloud service — no need to build and manage custom hardware clusters.

Why This Matters for the Agent Ecosystem

The arrival of Vera signals something deeper than a product launch. It’s a recognition that the age of agentic AI has infrastructure requirements fundamentally different from what came before. As our Complete Guide to AI Agents details, the shift from stateless inference to stateful orchestration is driving hardware innovation at every layer of the stack.

When AI models simply answered questions, the hardware requirements were straightforward: inference needed GPUs, training needed clusters. But agents are different. They plan, execute code, call APIs, manage state, handle errors, and iterate. These are CPU-intensive orchestration tasks that GPUs were never designed for.

NVIDIA’s thesis with Vera is that the CPU is the new bottleneck in agentic AI — and they’ve built a purpose-designed solution.

The Road Ahead

Vera is currently being evaluated across the top AI labs in the world. Early benchmarks are expected in the coming weeks. Meanwhile, OCI’s deployment plans suggest that by late 2026, enterprise developers building agentic applications may have access to Vera-powered infrastructure as easily as spinning up a virtual machine today.

For the broader AI ecosystem, Vera represents a bet that agents are not a passing trend but the dominant paradigm for AI deployment. If NVIDIA is right — and the industry’s response suggests they may be — we’re witnessing the birth of a new category of compute infrastructure. This hardware evolution parallels what Anthropic’s Managed Agents platform is doing at the software layer — building purpose-built infrastructure for sustained autonomous workloads.

The age of agentic AI has a purpose-built CPU. Its name is Vera.


Sources: NVIDIA Blog, Oracle Cloud Infrastructure, NVIDIA Vera CPU