Inside Vera Rubin Ultra: Liquid-Cooled Racks for the Next Generation of AI Factories

By Tech Desk | March 16, 2026 | 6 min read | AI-Assisted | Source: YouTube / Nvidia

In Nvidia’s latest GTC keynote, the Vera Rubin platform moved from slideware to hardware that you can roll onto a stage. The system is almost unrecognisable compared with the cabled racks of earlier GPU clusters. It is 100% liquid cooled, the cable jungle is gone, and what once took two days to install can now be done in roughly two hours. That change is not cosmetic. Shorter manufacturing and deployment cycle times translate directly into faster AI factory build-outs and quicker access to revenue-generating compute.

The Vera Rubin system is also designed to be cooled by hot water at around 45 degrees Celsius. Instead of data centres burning energy to chill air and then blowing it past racks, much of the cooling burden is pushed into the rack itself. Hot-water cooling reduces the load on facility HVAC systems, cuts operating costs and frees more of the site’s power budget for the AI factory rather than the building infrastructure around it.
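The power-budget argument can be made concrete with a rough sketch. It uses the standard power usage effectiveness ratio (PUE, total facility power divided by IT equipment power). The site size and both PUE values below are hypothetical figures chosen for illustration, not numbers from the keynote.

```python
def it_power_mw(site_power_mw: float, pue: float) -> float:
    """Return the power left for IT equipment under a fixed site budget.

    PUE = total facility power / IT equipment power,
    so IT power = site power / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return site_power_mw / pue


# Hypothetical figures for illustration only (not from the keynote).
SITE_MW = 100.0
chilled_air = it_power_mw(SITE_MW, pue=1.5)   # heavier cooling overhead
hot_water = it_power_mw(SITE_MW, pue=1.25)    # lighter cooling overhead

# Extra megawatts available to compute at the same site budget.
gained = hot_water - chilled_air
print(f"chilled air: {chilled_air:.1f} MW for IT")
print(f"hot water:   {hot_water:.1f} MW for IT")
print(f"gained:      {gained:.1f} MW")
```

Under these assumed values, the same site feeds noticeably more compute once less power goes to cooling; real gains depend on climate, facility design and the actual overhead of each approach.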

NVLink as a sixth-generation scale-up fabric

At the heart of Rubin Ultra is something Jensen Huang calls the “secret sauce”: the sixth-generation NVLink scale-up switching system. This is neither Ethernet nor InfiniBand; it is Nvidia’s own high-bandwidth, low-latency interconnect designed specifically for GPU-to-GPU communication. Huang is blunt about its difficulty: building such a fabric at this scale is “insanely hard to do well” and “insanely hard to do at all”.

The latest NVLink generation is itself fully liquid cooled. Switching elements are integrated into the rack, with cooling loops designed alongside the compute nodes. The goal is to turn an entire rack — and, at Rubin Ultra scale, an entire row of racks — into a single coherent performance domain. For long-context, agentic AI workloads that need thousands of GPUs to act like one machine, that coherence is the difference between theoretical FLOPS and real token throughput.

Groq LP300, Spectrum-X and co-packaged optics

Alongside Rubin, Nvidia is also showing off new building blocks that sit around the GPU core. One is a Groq system based on the brand-new LP300, an accelerator described as something “the world has never seen before” and already in volume production. Another is the world’s first CPO Spectrum-X switch, which uses co-packaged optics (CPO) so that optics sit directly on the chip and interface with silicon without long electrical runs.

In a CPO design, electrons are converted to photons at the package, and fibre connects directly into the switch silicon. Nvidia co-developed this process technology with TSMC and, as Huang emphasises, is currently the only company in production with it. The idea is to push bandwidth and efficiency higher than is possible when optical modules live on separate pluggable transceivers. For AI factories, that means more traffic per rack, less power lost in electrical links and simpler high-density cabling at the top of rack.

Vera CPUs and BlueField-4 STX storage

The Vera Rubin platform is not only about GPUs. It also introduces the Vera CPU, a processor Huang claims delivers twice the performance per watt of any other CPU on the market today. Nvidia initially expected to sell CPUs mainly as part of GPU systems, but demand has turned them into a standalone multi-billion-dollar business line. The message is clear: orchestration, preprocessing and control-plane work are now critical enough that CPU efficiency matters almost as much as GPU throughput.

Rubin Ultra racks also integrate BlueField-4 STX, Nvidia’s new storage platform. By putting DPUs directly in the storage path, BlueField can handle data movement, security and offload tasks without burdening GPUs or general-purpose CPUs. In AI factories where input and output tokens are constantly streaming, that kind of fast, programmable storage fabric is essential to keep accelerators fed.

Kyber racks and the Rubin Ultra domain

The most visually striking part of Rubin Ultra is the new rack design, code-named Kyber. Traditional racks are front-loaded with servers and backed by bundles of copper and fibre cables. Kyber is different. Compute nodes slide vertically into the front of the rack; at the centre is a midplane with four high-density NVLink connectors per node. When a node is inserted, those connectors mate with the midplane, creating a rigid, structured interconnect with no manual cabling between nodes.

On the back of the midplane sit the NVLink switches, mounted vertically. Compute nodes in the front, NVLink fabric in the back: together they connect 144 GPUs into one NVLink domain. Huang calls this configuration Rubin Ultra. From the software’s point of view, each Kyber rack becomes a single giant computer. Multiple racks then link together into larger compute clusters, but the basic abstraction is already an AI factory at the rack scale.
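As a rough illustration of the rack-as-domain abstraction, the sketch below uses the 144-GPU figure quoted above to work out how many Kyber racks a cluster of a given size would need and whether two GPUs share an NVLink domain. The contiguous GPU numbering and the function names are assumptions made for the example, not part of any Nvidia API.

```python
GPUS_PER_KYBER_RACK = 144  # figure quoted in the article


def racks_needed(total_gpus: int, gpus_per_rack: int = GPUS_PER_KYBER_RACK) -> int:
    """Number of racks required to host total_gpus (integer ceiling division)."""
    if total_gpus < 0 or gpus_per_rack <= 0:
        raise ValueError("GPU counts must be non-negative and rack size positive")
    return (total_gpus + gpus_per_rack - 1) // gpus_per_rack


def nvlink_domain(gpu_index: int, gpus_per_rack: int = GPUS_PER_KYBER_RACK) -> int:
    """Rack (NVLink domain) index, assuming GPUs are numbered contiguously per rack."""
    return gpu_index // gpus_per_rack


def same_domain(a: int, b: int) -> bool:
    """True if both GPUs sit in the same rack-scale NVLink domain."""
    return nvlink_domain(a) == nvlink_domain(b)


print(racks_needed(1000))     # racks for a hypothetical 1,000-GPU cluster
print(same_domain(0, 143))    # first and last GPU of the first rack
print(same_domain(143, 144))  # neighbouring indices across a rack boundary
```

Traffic between GPUs in the same domain stays on the NVLink fabric, while traffic across domains has to use the scale-out network, which is why schedulers generally care about this boundary.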

Because connections are made through the midplane rather than loose cables, Kyber also simplifies installation and service. The heaviest part of the rack is the NVLink section itself, which Huang jokes seems to get heavier every year as more capability is packed inside. But for operators, the trade-off is worthwhile: structured cabling and vertical insertion make it faster to deploy and replace nodes, and easier to reason about airflow and coolant paths.

Applying the same ideas to Ethernet racks

Nvidia is also taking the design lessons from Kyber and applying them to Ethernet-based systems. One demo rack in the keynote holds 256 liquid-cooled nodes, connected with the same kind of high-density connectors and structured cabling used in NVLink systems. The idea is that whether a customer chooses NVLink-based scale-up or Ethernet-based scale-out, they get the same factory-friendly installation, serviceability and power-density story.

In practice, that means less time pulling cables, fewer opportunities for human error and a clearer path to scaling AI factories from a handful of racks to hundreds. It also aligns with Nvidia’s broader Spectrum-X strategy: specialised Ethernet fabrics tuned for AI traffic patterns, dropped into racks that have already been optimised for liquid cooling and high-density node layouts.

Why Vera Rubin Ultra matters for AI factories

Stepping back, Vera Rubin Ultra is Nvidia’s answer to a simple but brutal constraint: power. Every large AI data centre is power-limited. Within that fixed budget, the job of an AI factory is to maximise throughput (total tokens produced) and token speed (how fast those tokens can be generated) at a given power level. Liquid-cooled Kyber racks, NVLink 6, CPO Spectrum-X switches, Vera CPUs and BlueField-4 STX are all pieces of a single optimisation problem.

If Rubin Ultra can deliver more tokens per second per megawatt than previous architectures — while also being faster to manufacture, install and service — it gives operators a way to stretch scarce power budgets further. That, in turn, determines which companies can afford to offer faster models, longer context windows and richer agentic workflows. In Huang’s telling, every CEO running AI infrastructure will need to understand these racks, because they are the machines that turn data and electricity into tomorrow’s intelligence.
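The tokens-per-second-per-megawatt metric is simple to express in code. The sketch below compares two rack profiles under a fixed site budget. Every throughput and power figure in it is hypothetical, and the linear scaling across the site is a simplifying assumption that ignores networking and cooling overheads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackProfile:
    name: str
    tokens_per_second: float  # sustained throughput of one rack
    power_mw: float           # power drawn by one rack, in megawatts

    @property
    def tokens_per_second_per_mw(self) -> float:
        """Efficiency metric: throughput normalised by power draw."""
        return self.tokens_per_second / self.power_mw


def site_throughput(profile: RackProfile, site_power_mw: float) -> float:
    """Tokens per second for a site filled with this rack type (linear scaling assumed)."""
    return profile.tokens_per_second_per_mw * site_power_mw


# Hypothetical numbers for illustration only.
previous = RackProfile("previous generation", tokens_per_second=1_000_000, power_mw=0.5)
newer = RackProfile("newer generation", tokens_per_second=3_000_000, power_mw=0.75)

SITE_MW = 100.0
for rack in (previous, newer):
    print(f"{rack.name}: {rack.tokens_per_second_per_mw:,.0f} tokens/s/MW, "
          f"{site_throughput(rack, SITE_MW):,.0f} tokens/s at {SITE_MW:.0f} MW")
```

In this made-up comparison the newer rack draws more power per rack yet still wins, because the metric that matters under a fixed budget is throughput per megawatt rather than throughput per rack.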

Sources

Nvidia GTC keynote demonstrations of Vera Rubin, Rubin Ultra, Kyber racks and liquid-cooled NVLink domains
Nvidia materials on CPO Spectrum-X switches, co-packaged optics and BlueField-4 STX storage platforms
Industry analysis of hot-water cooling, rack-scale design and power-constrained AI factories
