Excellent video! I loved how you explained the process of deploying Ollama on Kubernetes. Thanks for sharing your knowledge!
@SteveHarris-mi9ou · 2 months ago
It's also a cool setup for running RAG locally - nice going.
@zulhilmizainudin · 2 months ago
Looking forward to the next video!
@mathisve · 2 months ago
You can find the video here: kzbin.info/www/bejne/j6DQoGV6o7FshKM
@zulhilmizainudin · 2 months ago
@mathisve thanks!
@JoeIrizarry88 · 1 month ago
Great vid. In the future, I recommend adding a license, such as MIT, to your GitHub repo. 😉 👍🏻
@Techonsapevole · 2 months ago
I use Docker Compose, but I was curious about k8s.
@beastmastern159 · 1 month ago
Nice video. The NVIDIA Container Toolkit is only supported on Linux distributions, so I installed Ubuntu dual-booted with my Windows to make this work with my GPU in Kubernetes xdd, but it was worth it, I love this.
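For reference, a minimal sketch of how an Ollama pod can request a GPU once the NVIDIA Container Toolkit and the NVIDIA device plugin are installed on a Linux node; the nvidia.com/gpu resource name comes from the device plugin, and the image tag here is illustrative, not from the video:

```yaml
# Hedged sketch: request one GPU for the Ollama container.
# Assumes the NVIDIA device plugin is running on the node and
# exposes the standard nvidia.com/gpu resource.
apiVersion: v1
kind: Pod
metadata:
  name: ollama-gpu
spec:
  containers:
    - name: ollama
      image: ollama/ollama:latest
      ports:
        - containerPort: 11434
      resources:
        limits:
          nvidia.com/gpu: 1
```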
@Fayaz-Rehman · 1 month ago
Thank you very much for sharing. Can we save the llama3.5 model file in a persistent volume?
@Sentientforce · 2 months ago
Can you please advise how to run Ollama in a k3d cluster in a WSL2 / Windows 11 and Docker Desktop environment? The issue I'm not able to solve is making the GPU visible in a node.
@HosseinOjvar · 2 months ago
Helpful tutorial, thank you!
@MuhammadRehanAbbasi-j5w · 2 months ago
Would really like a video on how to add a GPU to this, both locally and in the cloud.
@mathisve · 2 months ago
Stay tuned for that video! I'm working on it as we speak; it should be out later this week!
@mathisve · 2 months ago
I have uploaded a new video on how to run Ollama on a GPU in the cloud: kzbin.info/www/bejne/j6DQoGV6o7FshKM
@samson-olusegun · 2 months ago
Would using a k8s job to make the pull API call suffice?
@mathisve · 2 months ago
Yes and no! On paper, if you only had one pod this could work. But the API call needs to be made every time a new Ollama pod is scheduled (unless you're using a PVC mounted to the pod to store the model). As far as I'm aware it's not possible to start a Kubernetes job at the creation of a new pod without using an operator.
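A rough sketch of what that pull Job could look like, assuming the Ollama Service inside the cluster is named ollama and listens on port 11434 (the names and model are illustrative, not taken from the video):

```yaml
# Hedged sketch: a one-off Job that asks the Ollama server to pull a model
# via its HTTP API. Assumes the in-cluster Service DNS name "ollama".
apiVersion: batch/v1
kind: Job
metadata:
  name: ollama-pull-llama
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: pull-model
          image: curlimages/curl:latest
          command:
            - sh
            - -c
            - >
              curl -sf http://ollama:11434/api/pull
              -d '{"model": "llama3.2"}'
```

As noted above, a Job like this only covers the pods that exist when it runs; newly scheduled pods would still start without the model unless they share a volume.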
@ganeshnaiknavare2815 · 2 days ago
I want to deploy the Ollama model for free; can you please suggest a platform?
@Tudorabil · 2 months ago
Would a postStart command `ollama pull llama3.2` work for the ollama pod, so you don't have to create the curl pod?
@mathisve · 2 months ago
I haven't tried it out, but I think it might. Not sure how I didn't think of this!
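For anyone who wants to try it, a hedged, untested sketch of what that postStart hook might look like on the Ollama container; the short sleep is a guess to give `ollama serve` time to come up before the pull (the CLI talks to the running server), and the model name simply mirrors the one in the thread:

```yaml
# Hedged sketch: pull the model from a postStart lifecycle hook instead of
# a separate curl pod. Untested, as discussed above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          lifecycle:
            postStart:
              exec:
                command: ["/bin/sh", "-c", "sleep 5 && ollama pull llama3.2"]
```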
@domagalsky · 2 months ago
Why don't you just prebuild a Docker image with the Llama model embedded in it? It saves image spin-up time, reduces complexity, and cuts the number of API calls needed to fetch the model per pod deployment.
@mathisve · 2 months ago
That's a possibility for sure. One downside of this is that depending on the size of the LLM, the images can get very big. In my opinion it's much more pragmatic to mount a PVC to the pods so they only have to download it once.
@domagalsky · 2 months ago
@mathisve Having a shared ReadOnlyMany volume sounds like the best option!
@enzanto · 2 months ago
Is there a reason why you don't use volume mounts? Then there would be no need to download the model each time.
@mathisve · 2 months ago
That would indeed be the ideal scenario: attach an RWX PVC to all Ollama pods so they only have to pull the model once. I didn't include this in the video for simplicity's sake.
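A sketch of that setup, assuming a storage class that supports ReadWriteMany (e.g. NFS or CephFS); the claim name and size are illustrative, and /root/.ollama is the default model directory in the ollama/ollama image:

```yaml
# Hedged sketch: a shared RWX volume so every Ollama replica reuses the
# weights that the first pod (or a pull Job) downloaded.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-models
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
---
# Mount the claim at Ollama's default model directory.
apiVersion: v1
kind: Pod
metadata:
  name: ollama
spec:
  containers:
    - name: ollama
      image: ollama/ollama:latest
      volumeMounts:
        - name: models
          mountPath: /root/.ollama
  volumes:
    - name: models
      persistentVolumeClaim:
        claimName: ollama-models
```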
@enzanto · 2 months ago
@mathisve Good to know, I "feared" there was some other reason I couldn't think of. Going to set it up in Kubernetes myself... one day when I get the time for it.
@MahatmaLevolence · 1 month ago
I'd love something like this; it'd be very helpful for me, but being a cybertard, I wouldn't even know where to start. Have fun, techies!
@unclesam007 · 2 months ago
Meanwhile, here I am unable to deploy even a simple Laravel app on k8s 🤒
@mathisve · 2 months ago
Do you need help with deploying Laravel on Kubernetes?