New platform provides a Kubernetes-native foundation for running AI workloads on NVIDIA AI infrastructure, combining advanced isolation, dynamic scaling, and hybrid networking.
(KubeCon + CloudNativeCon North America 2025, Booth #421) — vCluster Labs, the company pioneering Kubernetes virtualization, today announced its Infrastructure Tenancy Platform for AI to help organizations build and operate high-performance AI infrastructure on GPU-focused compute clusters, including support for NVIDIA DGX systems.
The company’s new Reference Architecture for NVIDIA DGX systems is now available, offering architectural guidance for building secure, scalable Kubernetes environments optimized for NVIDIA AI infrastructure. Alongside it, vCluster introduced several new technologies: vCluster Private Nodes, vCluster VPN, the Karpenter-based vCluster Auto Nodes feature, and direct integrations with NVIDIA Base Command Manager, KubeVirt, and the network isolation controller Netris. Together, these form the foundation of the vCluster Infrastructure Tenancy Platform for AI, a unified framework for deploying and managing AI workloads on AI supercomputers in the private cloud as well as on top of hyperscalers and emerging neoclouds.
“Our mission is to make AI infrastructure as dynamic and efficient as the workloads it supports,” said Lukas Gentele, CEO of vCluster. “With our Infrastructure Tenancy Platform for AI, organizations running NVIDIA AI infrastructure can operate secure, elastic Kubernetes environments anywhere, with the performance, control, and efficiency that AI-scale workloads demand. It feels like getting the most cutting-edge public cloud managed Kubernetes, but on your bare-metal AI supercomputer.”
Building Blocks for the AI Infrastructure Era
As enterprises race to operationalize AI at scale, platform teams need a Kubernetes foundation that can manage GPU resources efficiently while ensuring workload isolation, mobility, and security. The Infrastructure Tenancy Platform for AI addresses these challenges through the following key innovations:
- vCluster Private Nodes & Auto Nodes – Enable virtual clusters to dynamically autoscale GPU and CPU capacity across clouds, data centers, and bare-metal environments using Karpenter-based automation (a generic Karpenter GPU NodePool is sketched after this list). These features help maximize GPU utilization while maintaining full isolation and flexibility.
- vCluster VPN – A Tailscale-powered overlay network that establishes secure communication between control planes and worker nodes across hybrid infrastructure. vCluster VPN simplifies burst-to-cloud scenarios, where GPU clusters seamlessly extend from on-premises NVIDIA DGX systems to public cloud environments.
- NVIDIA Base Command Manager Integration – Brings vCluster Auto Nodes to NVIDIA DGX clusters managed with NVIDIA Base Command Manager, enabling elasticity, GPU lifecycle management, and efficient scaling across on-prem NVIDIA infrastructure.
- KubeVirt Integration – Enables the creation of virtual machines on demand as nodes within a virtual cluster using KubeVirt, allowing large bare-metal servers to be partitioned into smaller, isolated compute units. This extends Auto Nodes to on-prem and bare-metal environments, giving platform teams elastic, tenant-aware GPU infrastructure under Kubernetes.
- Netris Integration – Provides automated network isolation and lifecycle management for virtual clusters, giving each tenant its own dedicated network path and enabling multi-tenant GPU environments to run securely on shared infrastructure.
- vNode Runtime – A secure, Kubernetes-native container sandbox that helps prevent container breakouts, enabling multi-tenant GPU workloads without resorting to full virtual machines.
Together, these technologies form the foundation of the vCluster Infrastructure Tenancy Platform for AI: a composable, Kubernetes-native framework purpose-built for running AI, ML, and GPU-intensive workloads anywhere.
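To make the Karpenter-based automation behind Auto Nodes more concrete, the sketch below creates a plain Karpenter NodePool for GPU capacity using the Kubernetes Python client's dynamic API. This is an illustration of the underlying open-source building block only, not the vCluster Auto Nodes configuration format; the instance types, the "default" EC2NodeClass, and the GPU limit are assumptions made for the example.

```python
# Illustrative sketch: a generic Karpenter v1 NodePool for GPU nodes, applied
# with the Kubernetes Python client's dynamic API. Instance types, the
# "default" EC2NodeClass, and the GPU limit are assumptions for this example.
from kubernetes import config, dynamic
from kubernetes.client import api_client

gpu_node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "gpu"},
    "spec": {
        "template": {
            "spec": {
                "requirements": [
                    {"key": "karpenter.sh/capacity-type",
                     "operator": "In", "values": ["on-demand"]},
                    {"key": "node.kubernetes.io/instance-type",
                     "operator": "In", "values": ["p4d.24xlarge", "p5.48xlarge"]},
                ],
                # Taint GPU nodes so only GPU workloads schedule onto them.
                "taints": [{"key": "nvidia.com/gpu", "value": "true",
                            "effect": "NoSchedule"}],
                "nodeClassRef": {"group": "karpenter.k8s.aws",
                                 "kind": "EC2NodeClass", "name": "default"},
            }
        },
        # Cap the total number of GPUs this pool may provision.
        "limits": {"nvidia.com/gpu": 16},
        "disruption": {"consolidationPolicy": "WhenEmpty",
                       "consolidateAfter": "30s"},
    },
}

def apply_gpu_node_pool() -> None:
    """Create the NodePool in whatever cluster the current kubeconfig targets."""
    config.load_kube_config()  # use load_incluster_config() when running in-cluster
    dyn = dynamic.DynamicClient(api_client.ApiClient())
    node_pools = dyn.resources.get(api_version="karpenter.sh/v1", kind="NodePool")
    node_pools.create(body=gpu_node_pool)

if __name__ == "__main__":
    apply_gpu_node_pool()
```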
Industry analysts are increasingly highlighting the urgency of optimizing GPU utilization and simplifying AI infrastructure management.
“As AI infrastructure becomes the new competitive frontier, organizations are under immense pressure to operationalize GPUs efficiently while maintaining security and governance across hybrid environments,” stated Paul Nashawaty, Practice Lead and Principal Analyst at theCUBE Research. “We find that 71% of enterprises cite GPU utilization inefficiency as a major barrier to scaling AI workloads, and nearly two-thirds are exploring Kubernetes-native approaches to unify AI operations across cloud and on-prem. vCluster Labs’ Infrastructure Tenancy Platform for AI directly addresses this gap by enabling dynamic, multi-tenant GPU orchestration with the same elasticity and control enterprises expect from the public cloud, now extended to private NVIDIA-powered AI systems.”
vCluster Reference Architecture for NVIDIA DGX Systems
The new vCluster Reference Architecture for NVIDIA DGX systems outlines best practices for deploying virtual clusters on GPU-centric systems, enabling enterprises to deliver a cloud-like Kubernetes experience on-premises. With vCluster, teams can create lightweight virtual clusters that autoscale GPU resources, integrate securely with both on-prem and cloud networks, and maintain consistent performance across environments.
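As a minimal sketch of that cloud-like experience, the example below provisions an isolated virtual cluster for a team with the standard vcluster CLI and runs a quick check against it. It assumes the CLI is installed and a kubeconfig points at the host cluster; the tenant name, namespace, and the optional "gpu-tenant.yaml" values file are hypothetical, and GPU- or node-related settings would be defined in your own vcluster.yaml configuration.

```python
# Minimal sketch: per-team virtual cluster provisioning via the vcluster CLI.
# Tenant name, namespace, and the "gpu-tenant.yaml" values file are assumptions.
import subprocess

def create_tenant_vcluster(name: str, namespace: str, values_file: str | None = None) -> None:
    """Create a virtual cluster for a tenant without auto-connecting to it."""
    cmd = ["vcluster", "create", name, "--namespace", namespace, "--connect=false"]
    if values_file:
        # vcluster.yaml carries the cluster's declarative configuration;
        # GPU and node settings would live here (keys depend on your setup).
        cmd += ["--values", values_file]
    subprocess.run(cmd, check=True)

def smoke_test(name: str, namespace: str) -> None:
    """Run a command against the virtual cluster's API server via `vcluster connect`."""
    subprocess.run(
        ["vcluster", "connect", name, "--namespace", namespace,
         "--", "kubectl", "get", "nodes"],
        check=True,
    )

if __name__ == "__main__":
    create_tenant_vcluster("team-ml", "tenant-team-ml", values_file="gpu-tenant.yaml")
    smoke_test("team-ml", "tenant-team-ml")
```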
“We’ve been using vCluster for a while and we love the technology,” said Nick Jones, VP of Engineering at Nscale. “We’re using vCluster to optimise GPU utilisation and accelerate Kubernetes cluster provisioning — delivering higher performance and efficiency that directly benefit our customers.”
Enabling Cloud Agility for NVIDIA GPU Infrastructure
From AI factories to private GPU clouds, vCluster brings the scalability and efficiency of public cloud Kubernetes to NVIDIA environments.
Organizations using vCluster report:
- Faster cluster provisioning – virtual clusters spin up in seconds with fully declarative provisioning via Terraform and GitOps
- Higher GPU utilization – fewer idle GPUs across teams and tenants while ensuring fair use across the organization (a simple point-in-time utilization check is sketched after this list)
- Simplified day-2 operations – automated control plane and node upgrades, automatic backups with vCluster Snapshots, and standardized guidance for integrating with common cloud-native observability stacks
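As a rough illustration of the utilization signal behind these claims, the sketch below takes a point-in-time snapshot of GPU allocation on a host cluster using the Kubernetes Python client. It assumes GPUs are exposed through the standard "nvidia.com/gpu" extended resource (as with the NVIDIA device plugin); it is not part of the vCluster product, just an example of how such a figure can be measured.

```python
# Illustrative sketch: snapshot of allocatable vs. requested GPUs on a cluster,
# assuming GPUs are surfaced via the "nvidia.com/gpu" extended resource.
from kubernetes import client, config

GPU_RESOURCE = "nvidia.com/gpu"

def gpu_allocation_snapshot() -> tuple[int, int]:
    """Return (allocatable_gpus, requested_gpus) across the cluster."""
    config.load_kube_config()  # or config.load_incluster_config() in-cluster
    v1 = client.CoreV1Api()

    # Sum GPUs the nodes advertise as allocatable.
    allocatable = sum(
        int(n.status.allocatable.get(GPU_RESOURCE, "0"))
        for n in v1.list_node().items
    )

    # Sum GPUs requested by currently running pods.
    requested = 0
    pods = v1.list_pod_for_all_namespaces(field_selector="status.phase=Running")
    for pod in pods.items:
        for c in pod.spec.containers:
            requests = (c.resources.requests or {}) if c.resources else {}
            requested += int(requests.get(GPU_RESOURCE, "0"))

    return allocatable, requested

if __name__ == "__main__":
    total, used = gpu_allocation_snapshot()
    pct = 100 * used / total if total else 0.0
    print(f"GPUs requested: {used}/{total} ({pct:.0f}% allocated)")
```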
Experience vCluster at KubeCon North America
Be among the first to experience the vCluster Infrastructure Tenancy Platform for AI at KubeCon + CloudNativeCon North America 2025 in Atlanta. Visit Booth #421 for live demos, technical sessions, and book signings.
vCluster is also a Diamond Sponsor of Cloud Native + Kubernetes AI Day, where company leaders will present live sessions on GPU-accelerated Kubernetes operations, followed by a fireside chat featuring speakers from NVIDIA, JPMorgan Chase, and vCluster on “The Future of AI and Kubernetes.”
About vCluster
vCluster Labs is virtualizing Kubernetes to enable advanced tenancy models that increase utilization, reduce costs, and make Kubernetes more dynamic. vCluster allows platform and infrastructure teams to create virtual Kubernetes clusters that are as scalable and isolated as traditional clusters but far more lightweight and flexible. Trusted by companies like Nscale, Deloitte, Niantic, and Aussie Broadband, vCluster powers fully isolated tenant environments across public cloud, private data centers, and GPU-powered AI infrastructure. To learn more, visit www.vcluster.com.
View source version on businesswire.com: https://www.businesswire.com/news/home/20251110732393/en/
Contacts
Media Contact:
Heather Fitzsimmons
heather@mindsharepr.com
650-279-4360