GPUs in Kubernetes for AI Workloads

2,264 views

DevOps Toolkit

1 day ago

Today we dive into running AI models on Kubernetes with GPU support. Learn how to manage GPUs in Kubernetes clusters, create GPU nodes, and optimize resource usage without breaking the bank. We'll walk you through setting up a Google Cloud Kubernetes cluster (the same logic should apply to other cloud providers), deploying AI models like Llama 2 with Ollama, and handling GPU partitioning. Watch now to master GPU-based AI workloads in Kubernetes!
#Kubernetes #GPU #AI
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Sponsor: CAST AI
🔗 cast.ai
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Consider joining the channel: kzbin.infojoin
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬
➡ Transcript and commands: devopstoolkit.live/ai/unlock-the-power-of-gpus-in-kubernetes-for-ai-workloads
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬
If you are interested in sponsoring this channel, please visit devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬
➡ Twitter: vfarcic
➡ LinkedIn: www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬
🎤 Podcast: www.devopsparadox.com/
💬 Live streams: kzbin.info
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬
00:00 AI Inference with GPUs
01:30 CAST AI (sponsor)
02:29 Using GPUs for AI Inference in Kubernetes

Comments: 22
@thibdub2752 1 day ago
Hello Viktor, thanks for the video. I would like to see a video with Knative and KubeVirt.
@maaft 12 hours ago
Hi Viktor, glad that you mentioned KubeVirt. I have not yet found a solution where on-premise (or cloud provider A) GPU servers cover the base load while scaling into another cloud provider B when there is demand. I couldn't get it to work and tried different solutions (Admiral, KubeVirt, Karmada), but there were always one or more roadblocks. Most of the time, the scheduler would not even try to schedule my workload since all GPUs were already in use. But if the scheduler had just gone ahead and scheduled it, auto-scaling would have kicked in and spawned a new GPU node. This topic could also be expanded to the general case of how to do multi-cluster workload distribution (with auto-scaling). As always, thanks so much for your valuable content!
@mgroth3973 1 day ago
Great content as always! As someone who works a lot with KServe, I would of course like to see a video clip about your preferred approach of scaling InferenceServices in prod.
@dariusjuodokas9458 12 hours ago
Yes, yes, please make a video about models. Also, it would be nice to have a concise "Intro" to our options (tools) for training and/or fine-tuning our own self-hosted models (or managed?). I want to dip my fingers into ML, but the technology is growing and changing so fast that it's quite difficult to get 3 consistent search results (videos/blogs) explaining what is what and how to DIY. And ML seems to be the way to go further, as it seems to be rather good at automation without the need to write custom software, so we DO have to master it.
@Sebastian-or4xw 21 hours ago
Just FYI: a single Ollama installation and instance can run multiple models. Sequentially, of course, once one is no longer in use, but now also at the same time if you have enough VRAM.
@DevOpsToolkit 20 hours ago
That's true. I installed it twice only to demonstrate how sharing a GPU works.
@mrgdevops 20 hours ago
Not just Ollama... differentiate between a few LLMs... helpful in the DevOps space
@zuduni 22 hours ago
KubeVirt with GPU would be cool, thx
@nemethk 1 day ago
Greetings! A video about Ollama AI models in your presentation would be valuable content.
@DevOpsToolkit 1 day ago
Great. I'll add it to my to-do list.
@ovsepavakian6109 23 hours ago
Also you can use Slurm Workload Manager or Volcano
@theindependentradio 8 hours ago
Yes, please show the Knative option
@Fayaz-Rehman 1 day ago
Thank you for sharing. Would it be possible to apply the same locally in my homelab?
@DevOpsToolkit 1 day ago
If you have a GPU in your homelab, yes, you can. The setup will be more complicated, though.
@TiggyProph 1 day ago
What are your recommended tools to manage GPU workloads on Kubernetes? At my org, we've configured the basics that you have here already, and now the teams are looking into AI frameworks. (Using Argo, Karpenter, and EKS to manage all the configurations discussed in your video here) Applications like Kubeflow are being discussed to help those teams move more swiftly, and I'm curious about your take on it or if you have content coming soon related to that.
@DevOpsToolkit 1 day ago
I explored inference. Kubeflow is focused more on generating models. It is great, but sometimes overwhelming. I'll do my best to explore it in one of the upcoming videos.
@Anselm_0 1 day ago
Yes, an orchestrator definitely makes your life easier, and there are pretty good open source ones. Airflow, Prefect, or Flyte are pretty good. I've heard mixed experiences with Kubeflow.
@chandup 1 day ago
ZenML could help make it easier to use many ML tools.
@Sebastian-or4xw 21 hours ago
Kubeflow is the ML everything tool (collection), but in my experience it is not very easy to deploy and maintain. We used deployKF, which made it a bit easier, but that does not include everything.
@renanmonteirobarbosa8129 23 hours ago
What if my model requires 32 GPUs to perform inference? 😜 Want to see your K8s do that
@DevOpsToolkit 23 hours ago
What would you use instead?
@renanmonteirobarbosa8129 20 hours ago
@DevOpsToolkit Slurm, as recommended by NVIDIA. They have lots of educational material on the topic and recently updated DLI labs.