Podcast Episode

Google Cloud and Nvidia Unveil Fractional GPUs and Deeper AI Partnership at GTC 2026

March 17, 2026

Google Cloud and Nvidia have announced a major expansion of their AI infrastructure partnership at GTC 2026, introducing industry-first fractional GPU virtual machines and confirming plans to offer the next-generation Vera Rubin NVL72 rack-scale system. The announcements also include new software integrations and open-source inference tools designed for enterprise agentic AI workloads.

Fractional GPUs Arrive in the Cloud

Google Cloud and Nvidia made the announcements at the Nvidia GTC 2026 conference in San Jose, California. The headline item is a preview of fractional G4 virtual machines, which use Nvidia's virtual GPU technology to divide a single RTX Pro 6000 Blackwell Server Edition GPU into half, quarter, and eighth slices. Billed as an industry first, the offering lets enterprises pay only for the GPU capacity they need, from lightweight remote desktops on an eighth-GPU slice up to large language model inference and robotics simulation on a half-GPU slice.
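To make the slicing concrete, here is a minimal sketch of how a team might pick the smallest slice that covers a workload. It is an illustration only: the function name `smallest_slice` and the idea of expressing a workload as a fraction of one GPU are assumptions for this example, not part of any Google Cloud or Nvidia API.

```python
from fractions import Fraction

# Slice sizes named in the announcement, plus a full GPU as the upper bound.
SLICES = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2), Fraction(1, 1)]


def smallest_slice(needed: Fraction) -> Fraction:
    """Return the smallest slice that covers `needed`, a fraction of one GPU.

    Hypothetical helper for illustration; not a real cloud API.
    """
    if not 0 < needed <= 1:
        raise ValueError("needed must be greater than 0 and at most 1")
    return next(s for s in SLICES if s >= needed)


def idle_share(needed: Fraction) -> Fraction:
    """Fraction of the chosen slice that the workload leaves unused."""
    return 1 - needed / smallest_slice(needed)


if __name__ == "__main__":
    for need in (Fraction(1, 10), Fraction(3, 10), Fraction(1, 2)):
        print(need, "->", smallest_slice(need), "idle:", idle_share(need))
```

Under these assumptions, a workload that needs a tenth of a GPU fits on an eighth slice, while one that needs three tenths has to step up to a half slice.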

Vera Rubin on the Horizon

Google Cloud also confirmed it will be among the first cloud providers to offer Nvidia's Vera Rubin NVL72 rack-scale system, with availability planned for the second half of 2026. The platform combines 72 Rubin GPUs and 36 Vera CPUs in a single rack. Compared with the current Blackwell platform, it is billed as training AI models with one quarter as many GPUs and running inference at one tenth the cost per million tokens. Amazon Web Services, Microsoft Azure, and Oracle Cloud Infrastructure will also deploy the system.
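As a rough worked example of what those ratios would mean, the sketch below applies them to a Blackwell baseline. The baseline figures passed in are hypothetical inputs chosen for illustration, the function name `rubin_estimate` is invented for this example, and the ratios are the headline claims as stated above rather than measured results.

```python
import math

GPUS_PER_RACK = 72            # Rubin GPUs per NVL72 rack, per the announcement
CPUS_PER_RACK = 36            # Vera CPUs per NVL72 rack, per the announcement
TRAINING_GPU_DIVISOR = 4      # claimed: one quarter of the GPUs for training
INFERENCE_COST_DIVISOR = 10   # claimed: one tenth the cost per million tokens


def rubin_estimate(blackwell_gpus: int, blackwell_cost_per_m_tokens: float) -> dict:
    """Apply the claimed ratios to a hypothetical Blackwell baseline."""
    gpus = math.ceil(blackwell_gpus / TRAINING_GPU_DIVISOR)
    racks = math.ceil(gpus / GPUS_PER_RACK)
    return {
        "gpus": gpus,
        "racks": racks,
        "cpus": racks * CPUS_PER_RACK,
        "cost_per_m_tokens": blackwell_cost_per_m_tokens / INFERENCE_COST_DIVISOR,
    }


if __name__ == "__main__":
    # Hypothetical baseline: 1,000 Blackwell GPUs, $2.00 per million tokens.
    print(rubin_estimate(1000, 2.00))
```

With that hypothetical baseline, the claimed ratios work out to 250 Rubin GPUs, which rounds up to four full racks, and a cost of about $0.20 per million tokens.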

Software Stack Gets Smarter

On the software side, Nvidia Dynamo 1.0 now integrates with Google Kubernetes Engine Inference Gateway, creating an open-source control plane that helps teams extract more performance from their accelerators. Vertex AI training cluster support has expanded to A4X virtual machine domains running on Nvidia GB200 NVL72, and now includes proactive hardware fault detection to keep multi-week training jobs on track. Vertex AI Model Garden also gains the Nemotron 3 family of open models, including the 120-billion-parameter Nemotron 3 Super.

Industry Momentum

Customers including General Motors, ElevenLabs, and Schrödinger are already using the full-size G4 VMs, which became generally available in October 2025. Salesforce is using Vertex AI training clusters on Nvidia GB200 NVL72 to power its Agentforce 360 Platform for enterprise AI agents.

Published March 17, 2026 at 4:18am