Livestreams

Infrastructure for AI: Managing GPU Workloads with Kubernetes

Jan 19, 2026

4:00 pm

Online

Saiyam Pathak

Principal Developer Advocate at vCluster

The AI revolution has made GPUs the most valuable resource in modern infrastructure — and the hardest to manage at scale.

Kubernetes is becoming the platform of choice for AI workloads, but production-grade GenAI comes with real challenges: GPU scarcity, poor utilization, multi-team access, and fragmented deployments across cloud and bare metal.

Saiyam Pathak of vCluster is speaking at O'Reilly's Infrastructure & Ops Superstream: Infrastructure for AI on January 20, 2025.

In this session, he'll cover:

GPU sharing strategies (time-slicing, MIG)
Advanced scheduling with KAI Scheduler
Secure multi-tenancy using vCluster
How vLLM fits into real architectures for scalable inference

You'll walk away with a practical blueprint for delivering high-performance, cost-efficient, multi-tenant AI infrastructure across Kubernetes environments.
