Deploy ANY Open-Source LLM with Ollama on an AWS EC2 + GPU in 10 Min (Llama-3.1, Gemma-2 etc.)

5,235 views

Developers Digest

1 day ago

Comments: 21
@DevelopersDigest 3 months ago
The best way to support this channel? Comment, like, and subscribe!
@hpongpong 3 months ago
Great concise presentation. Thank you so much!
@DevelopersDigest 3 months ago
Thank you! 🙏
@ryanroman6589 3 months ago
This is super valuable. Awesome vid!
@DevelopersDigest 3 months ago
Thank you! 🙏
@rembautimes8808 3 months ago
Thanks, very nice tutorial.
@DevelopersDigest 3 months ago
Thank you
@brunozwietisch 25 days ago
I’m looking to learn how to use Llama. Is there a minimum configuration needed to run the 8B version? Here in Brazil the dollar-to-real exchange rate is 6 to 1, and by the end of the month the budget gets tight for those who want to learn.
@DevelopersDigest 24 days ago
Groq has a free tier for Llama, and so does Cloudflare!
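
For reference, a minimal sketch of calling Groq's free tier through its OpenAI-compatible chat endpoint. The model id and environment variable below are assumptions; check Groq's docs for the current model list:

```python
# Minimal sketch: call Groq's OpenAI-compatible chat completions endpoint.
# Assumes a GROQ_API_KEY env var and the "llama-3.1-8b-instant" model id
# (an assumption -- check Groq's model list for current names).
import os
import requests

resp = requests.post(
    "https://api.groq.com/openai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": "llama-3.1-8b-instant",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```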
@alejandrogallardo1414 3 months ago
For models around 70B, I am getting timeout issues with vanilla Ollama. It works on the first pull/run, but it times out when I need to reload the model. Do you have any recommendations for keeping the same model loaded persistently?
@DevelopersDigest 3 months ago
github.com/ollama/ollama/pull/2146
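
Independent of that PR, Ollama's API accepts a keep_alive parameter that controls how long a model stays resident after a request. A minimal sketch, assuming Ollama is listening on localhost:11434:

```python
# Minimal sketch: pin a model in memory with Ollama's keep_alive parameter.
# keep_alive: -1 keeps the model loaded indefinitely; a duration string
# such as "30m" also works. Assumes Ollama on localhost:11434.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:70b",
        "prompt": "warm-up",
        "stream": False,
        "keep_alive": -1,  # never unload between requests
    },
    timeout=600,
)
print(resp.json()["response"])
```

Setting the OLLAMA_KEEP_ALIVE environment variable on the server applies the same default globally.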
@rehanshaikh2708 1 month ago
How can I use this endpoint in LangChain's ChatOllama?
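
A minimal sketch of pointing LangChain's ChatOllama at a remote Ollama endpoint, assuming the langchain-ollama package; the address below is a placeholder for your instance's public IP:

```python
# Minimal sketch: use a remote Ollama endpoint with LangChain's ChatOllama.
# Assumes the langchain-ollama package; <EC2_PUBLIC_IP> is a placeholder.
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    base_url="http://<EC2_PUBLIC_IP>:11434",  # your instance's address
)
print(llm.invoke("Hello!").content)
```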
@nexuslux 3 months ago
Can you use Open WebUI?
@danielgannage8109 3 months ago
This is very informative! Thanks :) Curious why you used a g4dn.xlarge GPU instance (~$300/month) instead of a t3.medium CPU instance (~$30/month)? I assumed the 8-billion-parameter model was out of reach for regular hardware. What's the max model size that works on the g4dn.xlarge GPU? To put it in perspective, I have a $4K MacBook (16 GB RAM) that can really only run the large (150-million) or medium (100-million-parameter) sized models, which I think means the t3.medium CPU on AWS can only run a 50-million-parameter (small) model.
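
For rough sizing, a back-of-the-envelope sketch (approximations, not measurements): the g4dn.xlarge's single NVIDIA T4 has 16 GB of VRAM, and a 4-bit-quantized model needs roughly half a byte per parameter plus runtime overhead:

```python
# Back-of-the-envelope VRAM estimate for quantized models (approximation).
# A g4dn.xlarge has one NVIDIA T4 with 16 GB of VRAM; Ollama's default
# pulls are typically ~4-bit quantized (~0.5 bytes per parameter).

def vram_gb(params_billion: float, bytes_per_param: float = 0.5,
            overhead_gb: float = 1.5) -> float:
    """Rough VRAM need: weights plus KV-cache/runtime overhead."""
    return params_billion * bytes_per_param + overhead_gb

for size in (8, 13, 70):
    print(f"{size}B at 4-bit: ~{vram_gb(size):.1f} GB (T4 has 16 GB)")
# 8B and 13B fit on a T4; 70B will not fit on a single T4.
```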
@dylanv3044 3 months ago
Maybe a dumb question: how do you turn the streamed data you receive into readable sentences?
@DevelopersDigest 3 months ago
You could accumulate tokens, split on sentence-ending punctuation (., !, ?, etc.), and then send each response after a grouping function like that.
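
A minimal sketch of that approach, assuming an iterable of token strings (for example, from a streaming response):

```python
# Minimal sketch: accumulate streamed tokens and emit whole sentences.
import re

def sentences(token_stream):
    """Yield complete sentences as tokens arrive; buffer the remainder."""
    buffer = ""
    for token in token_stream:
        buffer += token
        parts = re.split(r"(?<=[.!?])\s+", buffer)
        for sentence in parts[:-1]:   # everything but the last is complete
            yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()

# Example with a fake token stream:
for s in sentences(["Hel", "lo there. ", "How are", " you? ", "Fine."]):
    print(s)
```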
@ConAim 2 months ago
Stay away from AWS, it will cost you an arm and a leg in the long run.
@DevelopersDigest 2 months ago
Which vendors do you prefer? 🙂
@BeCodeless-dot-net 3 months ago
Nice explanation
@DevelopersDigest 3 months ago
Thank you!