Architecting a Private Cloud for AI Workloads
How to design, build, and operate a cost-effective private cloud infrastructure for enterprise AI at scale
Public clouds are convenient for AI experimentation, but production workloads often hit walls. For enterprises running continuous training and inference, a private cloud can deliver better ROI, data sovereignty, and performance. This comprehensive guide walks through architecting a private cloud for AI workloads from the ground up.
GPU Multitenancy in Kubernetes: Strategies, Challenges, and Best Practices
How to safely share expensive GPU infrastructure across teams without sacrificing performance or security
GPUs offer no native way to share a device safely between isolated processes. Learn four approaches to running multitenant GPU workloads at scale without sacrificing performance.
AI Infrastructure Isn’t Limited By GPUs. It’s Limited By Multi-Tenancy.
What the AI Infrastructure 2025 Survey Reveals, and How Platform Teams Can Respond
The latest AI Infrastructure 2025 survey shows that most organizations are struggling not because of GPU scarcity, but because of poor GPU utilization caused by limited multi-tenancy capabilities. Learn how virtual clusters and virtual nodes help platform teams solve high costs, sharing issues, and low operational maturity in Kubernetes environments.
KubeCon + CloudNativeCon North America 2025 Recap
Announcing the Infrastructure Tenancy Platform for NVIDIA DGX—plus what we learned from 100+ conversations at KubeCon about GPU efficiency, isolation, and the future of AI on Kubernetes.
KubeCon Atlanta 2025 was packed with energy, launches, and conversations that shaped the future of AI infrastructure. At Booth #421, we officially launched the Infrastructure Tenancy Platform for NVIDIA DGX—a Kubernetes-native platform designed to maximize GPU efficiency across private AI supercomputers, hyperscalers, and neoclouds. Here's what happened, what we announced, and why it matters for teams scaling AI workloads.
Scaling Without Limits: The What, Why, and How of Cloud Bursting
A practical guide to implementing cloud bursting using vCluster VPN, Private Nodes, and Auto Nodes for secure, elastic, multi-cloud scalability.
Cloud bursting lets you expand compute capacity on demand without overprovisioning or re-architecting your systems. In this guide, we break down how vCluster VPN connects Private and Auto Nodes securely across environments—so you can scale beyond limits while keeping costs and complexity in check.
vCluster and Netris Partner to Bring Cloud-Grade Kubernetes to AI Factories & GPU Clouds With Strong Network Isolation Requirements
vCluster Labs and Netris team up to bring cloud-grade Kubernetes automation and network-level multi-tenancy to AI factories and GPU-powered infrastructure.
vCluster Labs has partnered with Netris to revolutionize how AI operators run Kubernetes on GPU infrastructure. By combining vCluster’s Kubernetes-level isolation with Netris’s network automation, the integration delivers a full-stack multi-tenancy solution, simplifying GPU cloud operations, maximizing utilization, and enabling cloud-grade performance anywhere AI runs.
Recapping The Future of Kubernetes Tenancy Launch Series
How vCluster’s Private Nodes, Auto Nodes, and Standalone releases redefine multi-tenancy for modern Kubernetes platforms.
From hardware-isolated clusters to dynamic autoscaling and fully standalone control planes, vCluster’s latest launch series completes the Kubernetes tenancy spectrum. Discover how Private Nodes, Auto Nodes, and Standalone unlock new levels of performance, security, and flexibility for platform teams worldwide.
Bootstrapping Kubernetes from Scratch with vCluster Standalone: An End-to-End Walkthrough
Bootstrapping Kubernetes from scratch: no host cluster, no external dependencies.
Kubernetes multi-tenancy just got simpler. With vCluster Standalone, you can bootstrap a full Kubernetes control plane directly on bare metal or VMs, no host cluster required. This walkthrough shows how to install, join worker nodes, and run virtual clusters on a single lightweight foundation, reducing vendor dependencies and setup complexity for platform and infrastructure teams.
GPU on Kubernetes: Safe Upgrades, Flexible Multitenancy
How vCluster and NVIDIA’s KAI Scheduler reshape GPU workload management in Kubernetes, enabling isolation, safety, and maximum utilization.
GPU workloads have become the backbone of modern AI infrastructure, but managing and upgrading GPU schedulers in Kubernetes remains risky and complex.
This post explores how vCluster and NVIDIA’s KAI Scheduler together enable fractional GPU allocation, isolated scheduler testing, and multi-team autonomy, helping organizations innovate faster while keeping production safe.
A New Foundation for Multi-Tenancy: Introducing vCluster Standalone
Eliminating the “Cluster 1 problem” with vCluster Standalone v0.29 – the unified foundation for Kubernetes multi-tenancy on bare metal, VMs, and cloud.
vCluster Standalone changes the Kubernetes tenancy spectrum by removing the need for external host clusters. With direct bare metal and VM bootstrapping, teams gain full control, stronger isolation, and vendor-supported simplicity. Explore how vCluster Standalone (v0.29) solves the “Cluster 1 problem” while supporting Shared, Private, and Auto Nodes for any workload.
Introducing vCluster Auto Nodes: A Practical Deep Dive
Auto Nodes extend Private Nodes with provider-agnostic, automated node provisioning and scaling across clouds, on-prem, and bare metal.
Kubernetes makes pods elastic, but node scaling often breaks outside managed clouds. With vCluster Platform 4.4 + v0.28, Auto Nodes fix that gap, combining isolation, elasticity, and portability. Learn how Auto Nodes extend Private Nodes with automated provisioning and dynamic scaling across any environment.
Introducing vCluster Auto Nodes: Karpenter-Based Dynamic Autoscaling Anywhere
Dynamic, isolated, and cloud-agnostic autoscaling for every virtual cluster.
vCluster Auto Nodes brings dynamic, Karpenter-powered autoscaling to any environment: public cloud, private cloud, or bare metal. Combined with Private Nodes, it delivers true isolation and elasticity for Kubernetes, letting every virtual cluster scale independently without cloud-specific limits.
How vCluster Auto Nodes Delivers Dynamic Kubernetes Scaling Across Any Infrastructure
Kubernetes pods scale elastically, but node scaling often stops at the provider boundary. Auto Nodes extend Private Nodes to bring elasticity and portability to isolated clusters across clouds, private datacenters, and bare metal.
Pods autoscale in Kubernetes, but nodes don’t. Outside managed services, teams fall back on brittle scripts or costly overprovisioning. With vCluster Platform 4.4 + vCluster v0.28, Auto Nodes close the gap, bringing automated provisioning and elastic scaling to isolated clusters across clouds, private datacenters, and bare metal.
The Case for Portable Autoscaling
Kubernetes has pods and deployments covered, but when it comes to nodes, scaling breaks down across clouds, providers, and private infrastructure. Auto Nodes change that.
Kubernetes makes workloads elastic until you hit the node layer. Managed services offer partial fixes, but hybrid and isolated environments still face scaling gaps and wasted resources. vCluster Auto Nodes close this gap by combining isolation, just-in-time elasticity, and environment-agnostic portability.
Running Dedicated Clusters with vCluster: A Technical Deep Dive into Private Nodes
A technical walkthrough of Private Nodes in vCluster v0.27 and how they enable true single-tenant Kubernetes clusters.
Private Nodes in vCluster v0.27 take Kubernetes multi-tenancy to the next level by enabling fully isolated, dedicated clusters. In this deep dive, we walk through setup, benefits, and gotchas, from creating a vCluster with Private Nodes to joining worker nodes and deploying workloads. If you need stronger isolation, simpler lifecycle management, or enterprise-grade security, this guide covers how Private Nodes transform vCluster into a powerful single-tenant option without losing the flexibility of virtual clusters.
We’re Now vCluster Labs
A new name, the same mission, building the best Kubernetes tenancy tools for teams everywhere.
Loft Labs is now vCluster Labs, a name that reflects our focus on building the best Kubernetes multi-tenancy and infrastructure engineering tools. The same team, projects, and mission remain, but with a clearer brand aligned to our product, vCluster.
vCluster v0.27: Introducing Private Nodes for Dedicated Clusters
Dedicated, tenant-owned nodes with a managed control plane: full isolation without running separate clusters.
Private Nodes complete vCluster’s tenancy spectrum: tenants connect their own nodes to a centrally managed control plane for full isolation, custom runtimes (CRI/CNI/CSI), and consistent performance, ideal for AI/ML, HPC, and regulated environments. Learn how it works and what’s next with Auto Nodes.
How to Scale Kubernetes Without etcd Sharding
Rethinking Kubernetes scale: avoid the risks of etcd sharding with virtual clusters built for performance, stability, and multi-tenant environments.
Is your Kubernetes cluster slowing down under load? etcd doesn’t scale well with multi-tenancy or 30k+ objects. This blog shows how virtual clusters offer an easier, safer way to isolate tenants and scale your control plane, no sharding required.
Three Tenancy Modes, One Platform: Rethinking Flexibility in Kubernetes Multi-Tenancy
Why covering the full Kubernetes tenancy spectrum is critical, and how Private Nodes bring stronger isolation to vCluster
In this blog, we explore why covering the full Kubernetes tenancy spectrum is essential, and how vCluster’s upcoming Private Nodes feature introduces stronger isolation for teams running production, regulated, or multi-tenant environments without giving up Kubernetes-native workflows.
Scaling Kubernetes Without the Pain of etcd Sharding
Why sharding etcd doesn’t scale, and how virtual clusters eliminate control plane bottlenecks in large Kubernetes environments.
OpenAI’s outage revealed what happens when etcd breaks at scale. This post explains why sharding isn’t enough, and how vCluster offloads API load with virtual control planes. Benchmark included.
vCluster: The Performance Paradox – How Virtual Clusters Save Millions Without Sacrificing Speed
How vCluster Balances Kubernetes Cost Reduction With Real-World Performance
Can you really save millions on Kubernetes infrastructure without compromising performance? Yes, with vCluster. In this blog, we break down how virtual clusters reduce control plane overhead, unlock higher node utilization, and simplify multi-tenancy, all while maintaining lightning-fast performance.
5 Must-See KubeCon + CloudNativeCon India 2025 Sessions
A curated list of impactful, technical, and thought-provoking sessions to catch at KubeCon + CloudNativeCon India 2025 in Hyderabad.
KubeCon + CloudNativeCon India 2025 is back in Hyderabad on August 6–7! With so many exciting sessions, it can be hard to choose. Here are 5 standout talks you shouldn't miss, from real-world Kubernetes meltdowns to scaling GitOps at Expedia, and even why Kubernetes is moving to nftables.
Solving Kubernetes Multi-tenancy Challenges with vCluster
Unlocking Secure and Scalable Multi-Tenancy in Kubernetes with Virtual Clusters
Running multiple tenants on a single Kubernetes cluster can be complex and risky. In this post, Liquid Reply explores how vCluster offers a secure and cost-efficient solution by isolating workloads through lightweight virtual clusters.
NVIDIAScape: How vNode prevents this container breakout without the need for VMs
Container breakouts on GPU nodes are real, and just three lines of code can be enough. Discover how vNode neutralizes vulnerabilities like NVIDIAScape without relying on VMs.
NVIDIAScape (CVE-2025-23266) is a critical GPU-related vulnerability that allows attackers to break out of containers and gain root access. While some respond by layering in virtual machines, this blog walks through a better approach: how vNode uses container-native sandboxing to neutralize such attacks at the kernel level without sacrificing performance. It includes a step-by-step replication of the exploit and a demo of how vNode prevents it.
Building and Testing Kubernetes Controllers: Why Shared Clusters Break Down
How shared clusters fall short, and why virtual clusters are the future of controller development.
Shared clusters are cost-effective, but when it comes to building and testing Kubernetes controllers, they create bottlenecks, from CRD conflicts to governance issues. This blog breaks down the trade-offs between shared, local, and dedicated clusters and introduces virtual clusters as the scalable solution for platform teams.
What Is GPU Sharing in Kubernetes?
How Kubernetes can make GPU usage more efficient for AI/ML teams through MPS, MIG, and smart scheduling.
As AI and ML workloads scale rapidly, GPUs have become essential, and expensive, resources. But most teams underutilize them. This blog dives into how GPU sharing in Kubernetes can help platform teams increase efficiency, cut costs, and better support AI infrastructure.
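To make the sharing modes concrete, here is a minimal sketch of two ends of the spectrum: the NVIDIA device plugin's time-slicing config and a pod requesting a hardware-isolated MIG slice. The MIG resource name assumes an A100-class GPU with the plugin's "mixed" MIG strategy; the pod name and image tag are illustrative.

```yaml
# Time-slicing: the NVIDIA device plugin advertises each physical GPU as
# several schedulable replicas (higher utilization, but no memory isolation
# between the workloads sharing a GPU).
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4
---
# MIG: a pod requests a hardware-isolated slice instead of a whole GPU.
# The mig-1g.5gb resource assumes an A100 with the device plugin's
# "mixed" MIG strategy; pod name and image are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: mig-inference-example
spec:
  restartPolicy: Never
  containers:
    - name: inference
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1
```

Note the first document is the device plugin's own config file (typically mounted via a ConfigMap), not a Kubernetes object; it is shown alongside the pod spec only to contrast the two modes.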
Smarter Infrastructure for AI: Why Multi-Tenancy is a Climate Imperative
How virtual clusters and smarter tenancy models can reduce carbon impact while scaling AI workloads.
AI’s rapid growth is fueling a silent climate problem: idle infrastructure. This blog explores why multi-tenancy is key to scaling AI sustainably and how vCluster helps teams reduce waste while moving faster.
Automating Kubernetes Cleanup in CI Workflows
Keep your CI pipelines clean and efficient by automating Kubernetes resource cleanup with vCluster and Loft.
Leftover Kubernetes resources from CI jobs can drive up cloud costs and clutter your clusters. This guide shows how to automate cleanup tasks using vCluster, helping you maintain cleaner, faster CI/CD pipelines.
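As a rough sketch of the pattern (not the guide's exact steps), a CI job can create a throwaway virtual cluster per run and guarantee teardown with an always-run step. This assumes the vcluster CLI is installed on the runner and a kubeconfig for the host cluster is available; the workflow, cluster, and path names are illustrative.

```yaml
# Hypothetical GitHub Actions workflow: one ephemeral virtual cluster per run,
# deleted even when tests fail, so nothing lingers on the host cluster.
name: ci-with-ephemeral-vcluster
on: [pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Create an ephemeral virtual cluster for this run
        run: vcluster create ci-${{ github.run_id }} --connect=false
      - name: Run tests inside the virtual cluster
        run: vcluster connect ci-${{ github.run_id }} -- kubectl apply -f manifests/
      - name: Tear down the virtual cluster
        if: always()   # cleanup runs even if earlier steps failed
        run: vcluster delete ci-${{ github.run_id }}
```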
Bare Metal Kubernetes with GPU: Challenges and Multi-Tenancy Solutions
Why Namespace Isolation Falls Short for GPU Workloads, and How Multi-Tenancy with vCluster Solves It
Managing AI workloads on bare metal Kubernetes with GPUs presents unique challenges, from weak namespace isolation to underutilized resources and operational overhead. This blog explores the pitfalls of namespace-based multi-tenancy, why running a separate cluster per team is expensive, and how vCluster enables secure, efficient, and autonomous GPU sharing for AI teams.