
Inside Vera Rubin Ultra: Liquid-Cooled Racks for the Next Generation of AI Factories

Disclaimer: Perspectives here reflect AI-POV and AI-assisted analysis, not any specific human author.

In Nvidia’s latest GTC keynote, the Vera Rubin platform moved from slideware to hardware that you can roll onto a stage. The system is almost unrecognisable compared with the cabled racks of earlier GPU clusters. It is 100% liquid cooled, the cable jungle is gone, and what once took two days to install can now be done in roughly two hours. That change is not cosmetic. Shorter manufacturing and deployment cycle times translate directly into faster AI factory build-outs and quicker access to revenue-generating compute.

The Vera Rubin system is also designed to be cooled by hot water at around 45 °C. Instead of data centres burning energy to chill air and then blowing it past racks, much of the cooling burden is pushed into the rack itself. Hot-water cooling reduces the load on facility HVAC systems, cuts operating costs and frees more of the site’s power budget for the AI factory rather than the building infrastructure around it.
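The power-budget effect can be made concrete with a back-of-the-envelope sketch. The PUE figures below are hypothetical illustrations, not Nvidia or industry-measured values — the point is only that a lower cooling overhead leaves more of a fixed site budget for compute.

```python
# Illustrative only: how cooling overhead (expressed as PUE, with
# hypothetical values) changes the compute power available at a
# power-limited site.

def it_power_mw(site_power_mw: float, pue: float) -> float:
    """IT (compute) power available given total site power and PUE."""
    return site_power_mw / pue

site = 100.0  # MW total facility budget (hypothetical)
chilled_air = it_power_mw(site, pue=1.4)  # conventional chiller plant
warm_water = it_power_mw(site, pue=1.1)   # 45 °C warm-water loop

print(f"chilled air: {chilled_air:.1f} MW for compute")
print(f"warm water:  {warm_water:.1f} MW for compute")
```

Under these assumed numbers, the warm-water site gets roughly 19 MW more compute from the same grid connection — which is the sense in which cooling design is a revenue lever, not just a facilities detail.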

NVLink as a sixth-generation scale-up fabric

At the heart of Rubin Ultra is something Jensen Huang calls the “secret sauce”: the sixth-generation NVLink scale-up switching system. This is neither Ethernet nor InfiniBand; it is Nvidia’s own high-bandwidth, low-latency interconnect designed specifically for GPU-to-GPU communication. Huang is blunt about its difficulty: building such a fabric at this scale is “insanely hard to do well” and “insanely hard to do at all”.

The latest NVLink generation is itself fully liquid cooled. Switching elements are integrated into the rack, with cooling loops designed alongside the compute nodes. The goal is to turn an entire rack — and, at Rubin Ultra scale, an entire row of racks — into a single coherent performance domain. For long-context, agentic AI workloads that need thousands of GPUs to act like one machine, that coherence is the difference between theoretical FLOPS and real token throughput.

Groq LP300, Spectrum-X and co-packaged optics

Alongside Rubin, Nvidia is also showing off new building blocks that sit around the GPU core. One is a Groq system based on the brand-new LP300, an accelerator described as something “the world has never seen before” and already in volume production. Another is the world’s first CPO Spectrum-X switch, which uses co-packaged optics (CPO) so that optics sit directly on the chip and interface with silicon without long electrical runs.

In a CPO design, electrons are converted to photons at the package, and fibre connects directly into the switch silicon. Nvidia co-developed this process technology with TSMC and, as Huang emphasises, is currently the only company in production with it. The idea is to push bandwidth and efficiency higher than is possible when optical modules live on separate pluggable transceivers. For AI factories, that means more traffic per rack, less power lost in electrical links and simpler high-density top-of-rack cabling.

Vera CPUs and BlueField-4 STX storage

The Vera Rubin platform is not only about GPUs. It also introduces the Vera CPU, a processor Huang claims delivers twice the performance per watt of any other CPU on the market today. Nvidia initially expected to sell CPUs mainly as part of GPU systems, but demand has turned them into a standalone multi-billion-dollar business line. The message is clear: orchestration, preprocessing and control-plane work are now critical enough that CPU efficiency matters almost as much as GPU throughput.

Rubin Ultra racks also integrate BlueField-4 STX, Nvidia’s new storage platform. By putting DPUs directly in the storage path, BlueField can handle data movement, security and offload tasks without burdening GPUs or general-purpose CPUs. In AI factories where input and output tokens are constantly streaming, that kind of fast, programmable storage fabric is essential to keep accelerators fed.

Kyber racks and the Rubin Ultra domain

The most visually striking part of Rubin Ultra is the new rack design, code-named Kyber. Traditional racks are front-loaded with servers and backed by bundles of copper and fibre cables. Kyber is different. Compute nodes slide vertically into the front of the rack; at the centre is a midplane with four high-density NVLink connectors per node. When a node is inserted, those connectors mate with the midplane, creating a rigid, structured interconnect with no manual cabling between nodes.

On the back of the midplane sit the NVLink switches, mounted vertically. Compute nodes in the front, NVLink fabric in the back: together they connect 144 GPUs into one NVLink domain. Huang calls this configuration Rubin Ultra. From the software’s point of view, each Kyber rack becomes a single giant computer. Multiple racks then link together into larger compute clusters, but the basic abstraction is already an AI factory at the rack scale.

Because connections are made through the midplane rather than loose cables, Kyber also simplifies installation and service. The heaviest part of the rack is the NVLink section itself, which Huang jokes seems to get heavier every year as more capability is packed inside. But for operators, the trade-off is worthwhile: structured cabling and vertical insertion make it faster to deploy and replace nodes, and easier to reason about airflow and coolant paths.

Applying the same ideas to Ethernet racks

Nvidia is also taking the design lessons from Kyber and applying them to Ethernet-based systems. One keynote demo packs 256 liquid-cooled nodes into a single rack, connected with the same kind of high-density connectors and structured cabling used in NVLink systems. The idea is that whether a customer chooses NVLink-based scale-up or Ethernet-based scale-out, they get the same factory-friendly installation, serviceability and power-density story.

In practice, that means less time pulling cables, fewer opportunities for human error and a clearer path to scaling AI factories from a handful of racks to hundreds. It also aligns with Nvidia’s broader Spectrum-X strategy: specialised Ethernet fabrics tuned for AI traffic patterns, dropped into racks that have already been optimised for liquid cooling and high-density node layouts.

Why Vera Rubin Ultra matters for AI factories

Stepping back, Vera Rubin Ultra is Nvidia’s answer to a simple but brutal constraint: power. Every large AI data centre is power-limited. Within that fixed budget, the job of an AI factory is to maximise throughput (total tokens produced) and token speed (how fast those tokens can be generated) at a given power level. Liquid-cooled Kyber racks, NVLink 6, CPO Spectrum-X switches, Vera CPUs and BlueField-4 STX are all pieces of a single optimisation problem.

If Rubin Ultra can deliver more tokens per second per megawatt than previous architectures — while also being faster to manufacture, install and service — it gives operators a way to stretch scarce power budgets further. That, in turn, determines which companies can afford to offer faster models, longer context windows and richer agentic workflows. In Huang’s telling, every CEO running AI infrastructure will need to understand these racks, because they are the machines that turn data and electricity into tomorrow’s intelligence.

Sources

  • Nvidia GTC keynote demonstrations of Vera Rubin, Rubin Ultra, Kyber racks and liquid-cooled NVLink domains
  • Nvidia materials on CPO Spectrum-X switches, co-packaged optics and BlueField-4 STX storage platforms
  • Industry analysis of hot-water cooling, rack-scale design and power-constrained AI factories
