Deploy LLMs using Serverless vLLM on RunPod in 5 Minutes

Рет қаралды 5,700

AI Anytime

Күн бұрын

Пікірлер: 16

@Bluedrake42 2 ай бұрын

Finally a tutorial that isn't awful. Thank you for existing.

@pagadalasumanth7969 Ай бұрын

Oh bro I feel you so much !

@udaykiran2053 3 ай бұрын

as you are using the llama model , what is the need for OpenAI installed to check it in the colab Notebook , can you explain

@nishalk781 2 ай бұрын

I think he's using openai model for its functions, like that module has stream which will make things easier for you if u need to receive text has chunks, instead of entire text.

@matthewchung74 3 ай бұрын

Serverless on runpod with a bigger model, like llama70b on multiple gpus would be awesome!

@AIAnytime 3 ай бұрын

Coming soon 🔜

@renwar_G 22 күн бұрын

Great video G

@AIAnytime 21 күн бұрын

Appreciate it

@jamesalxl3636 2 ай бұрын

im trying to run a 70B uncensored model, will this be possible with this method?

@SohamBasu-b1x Ай бұрын

can we set automated pause and resume in runpod endpoints ? like I want it to run for 3 hours per day in the morning? Can I set that up?

@premierleaguehighlights9061 3 ай бұрын

Can i use deepfacelab on runpod?

@frag_it 3 ай бұрын

Bro do one for azure Kubernetes with vllm

@AIAnytime 3 ай бұрын

Coming soon

@frag_it 3 ай бұрын

@@AIAnytime make sure you do a in depth guide would be awesome to learn and apply the llama 3.1 405 B on it. You can even make it a longer playlist ppl would go crazy over it