Thank you so much for taking the time to create this.
@tecnopadre · 19 hours ago
After the launch of Llama 3.2 1B & 3B, this video should skyrocket, and so should Ollama.
@emil8367 · 1 day ago
Thanks Matt! Is there a list of all the env variables, with a description of each, in the Ollama docs?
@toadlguy · 1 day ago
Ha, ha, ha. We understand how YouTube works. We either pay for YT Premium or we watch ads and you get paid based on views. You don't need to announce it is FREE at the beginning of your video. (Thanks for the content, though 😊)
@technovangelist · 1 day ago
If that were true I wouldn't be asked so often whether it will stay free. Lots of people put a teaser on YouTube, then move the rest to a paid platform.
@willTryAgainTmrw · 1 day ago
What does "K_L/M/S" etc. mean for quantized models? Why are the L variants larger than the M ones at the same quantization level?
@emmanuelgoldstein3682 · 1 day ago
I'm subscribed with all notifications turned on but I didn't get this one for some reason... ☹
@DodiInkoTariah · 1 day ago
Thanks Matt for your videos. Please, are you able to do one with instructions for installing Llama 3.2 11B? It would be very helpful for many people, but no pressure.
@sadboi5672 · 1 day ago
There's no 11B model for 3.2, is there? 3.2 only has 1B and 3B variants.
@technovangelist · 1 day ago
When it works I will. But there isn't anything special about it.
@NLPprompter · 1 day ago
YES YES, environment variables! The first time I needed them was for the Obsidian Copilot. ChatGPT, Claude, and Discord taught me about this. I'm really sure this video is needed by many Ollama users.
@MaxJM74 · 1 day ago
Thanks 👍
@SonGoku-pc7jl · 1 day ago
Today Llama 3.2 3B can use tools :) just saying... ;)
@technovangelist · 1 day ago
Llama 3 and 2 all work with tools. Every model works with tools and functions if you use them right.
@SonGoku-pc7jl · 1 day ago
From the Mac code, I understand that ollama pull downloads a model again if it's been updated, and doesn't re-download it otherwise? I don't see any fetch in the code, and ChatGPT made me think otherwise, so I don't know.
@technovangelist · 1 day ago
ollama pull will get the latest version of any model if there is a newer version.
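For anyone who wants to check this themselves, the check is cheap to run: Ollama stores models as layers, so re-pulling only downloads what changed. A sketch (the model name is just an example):

```shell
# Pull a model; if a newer version exists upstream, only the changed
# layers are downloaded. If nothing changed, this is a fast no-op.
ollama pull llama3.2

# List locally installed models with their modification times.
ollama list
```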
@SonGoku-pc7jl · 1 day ago
Thanks! :)
@akamalov · 2 days ago
You rock! Thank you!
@DrexxLaggui · 2 days ago
Most excellent vid, sir! Can you expand on this by showing how to make RAG perform faster, at 25 tokens per second at least, with several GB or thousands of md files uploaded to it, please?
@technovangelist · 2 days ago
25 tokens per second seems pretty slow. I get double that on my 3-year-old Mac.
@AZombieBitMyHand · 2 days ago
The "staring at my soul" ending is the best route possible in a visual novel like this one. 🖤
@ricardofranco4114 · 3 days ago
Can Llama AI tell me how to cheat in video games? 'Cause Google AI won't. Also, can Llama AI draw photos?
@rtpHarry · 3 days ago
OK, so we just make the middle of the document spacer text and then leave the real info at the start and end... 😅 16:11
@Leon-AlexisSauer · 3 days ago
Is it possible to get a graphical UI?
@technovangelist · 3 days ago
Sure, there are a lot of choices on the Ollama repo.
@adamtegen · 3 days ago
I'd LOVE the more integrated JSON mode that OpenAI has. I know you can put Ollama in JSON mode, but that's more of a hint that isn't always followed. It would be a GAME CHANGER! From OpenAI's announcement on August 6th: "We are introducing Structured Outputs in the API: model outputs now reliably adhere to developer-supplied JSON Schemas."
@technovangelist · 3 days ago
What OpenAI added seems to catch up to tools like Ollama. The tools and JSON support in Ollama have been rock solid for a year. Set it up right and it works 100% of the time. Not a suggestion, but 100% reliable.
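For anyone wanting to try this, a common pattern is to request JSON from the model (the Ollama Python client accepts format="json") and then validate the reply yourself, retrying on a bad parse. A minimal sketch of the validation side; the actual Ollama call is shown only as a comment, since it needs a running server and a pulled model:

```python
import json

def parse_json_reply(text, default=None):
    """Validate a model reply that is supposed to be JSON.

    Returns the parsed object, or `default` when the reply is not
    valid JSON (the caller can then retry the generation).
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return default

# With a running Ollama server, `text` would come from something like
# (assumption: the `ollama` Python package is installed):
#   import ollama
#   resp = ollama.chat(model="llama3.2", format="json",
#                      messages=[{"role": "user", "content": "Reply in JSON"}])
#   text = resp["message"]["content"]

good = parse_json_reply('{"city": "Oslo", "temp_c": 7}')
bad = parse_json_reply("Sure! Here is your JSON: {oops", default={})
```

The retry loop around a failed parse is what makes the "100% reliable" setup work in practice: a bad reply costs one extra generation instead of a crash.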
@Leon-AlexisSauer · 3 days ago
Yo, as far as I understand, Ollama is not an application, right? Or is there a way to get it like that? I am new to this.
@technovangelist · 3 days ago
Not sure I understand. Ollama is an application to run AI models.
@lakraMundka · 3 days ago
I love what you have been sharing on this channel. As a developer I really enjoy content like this video, which is not just a tutorial on a tool but deeper knowledge. Also, less code and more control? Yes please!
@technovangelist · 3 days ago
Thanks for the comment and for being a member.
@nigeldogg · 4 days ago
Would love to see a further deep dive on this, including hybrid keyword/semantic search and a reranker for large datasets, applied with an LLM via Ollama. Thanks for the great tutorial as always!
@AricRastley · 4 days ago
Exactly what I was looking for! THANK YOU!
@rtpHarry · 4 days ago
Hate the rip-off price of the internal drives in the MBP. I did manage to convince myself it was a better investment, though, by pricing up the alternative cost of an equivalent Google Drive over its lifetime. When you max it out to 4TB it's about 1200 USD. The closest the AI just came up with is 10TB with Google Drive, which is 49.99 a month. Based on the fact that I have had this for about 5 years now, that would have cost 2500 USD. I had a better comparison before, but ChatGPT can't search its own conversation history and I cannot find it right now. So when I get the next upgrade, which I think will be the M4, I will pay their price and convince myself it's OK using maths magic. 😅 And then I keep it all safe from dropping or being stolen with Backblaze.
@jayd8935 · 4 days ago
Informative as always. Heh. I was just listening, not watching, and after the goodbye I heard your water bottle. Remembering to hydrate can be hard, so thanks for the reminder.
@rtpHarry · 4 days ago
How to make truly random events? Just hook it up to a lava lamp! Google it. Lava lamps are actually being used to keep the internet secure. 😆 Basically there is a wall of lamps and a webcam pointing at it, and the fluctuations are used as the values for the random numbers.
@SonGoku-pc7jl · 4 days ago
Good video! My wish: in the installer, let me select the folder for Ollama models, because on one PC I have little space on C: haha. And one question: what OCR tool do you recommend for Windows? Thanks :) Sorry for my English, I'm Spanish ;)
@technovangelist · 4 days ago
Not sure what you are asking.
@technovangelist · 3 days ago
You can change the folder for Ollama models; there is an environment variable to set.
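For reference, the variable in question is OLLAMA_MODELS. A sketch of pointing it at another drive (the path is an example; on Windows it's set through System Environment Variables rather than with export):

```shell
# Linux/macOS: store models on a bigger disk (example path).
export OLLAMA_MODELS=/mnt/bigdisk/ollama-models

# The server reads the variable at startup, so restart it afterwards.
ollama serve
```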
@Reality_Check_1984 · 4 days ago
They need to add ollama stop to the python API. Could really use me some ollama.stop action in my applications.
@technovangelist · 4 days ago
It's been there since the beginning of the Python lib. Just stop the generator.
@technovangelist · 4 days ago
Sorry, I was thinking about stopping a generation mid-stream. Unloading the model is easy: just set keep alive to 0. Same in Python and JS.
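"Just stop the generator" means breaking out of the streaming loop; once the generator is abandoned, no more tokens are produced. A self-contained sketch, using a stand-in generator in place of a real streaming call like ollama.generate(..., stream=True):

```python
def fake_stream():
    """Stand-in for a streaming call such as ollama.generate(..., stream=True),
    which yields response chunks one at a time."""
    for token in ["The", " quick", " brown", " fox", " jumps"]:
        yield token

collected = []
for chunk in fake_stream():
    collected.append(chunk)
    if len(collected) == 3:  # any stop condition you like
        break  # leaving the loop stops consuming the generator

partial = "".join(collected)  # only the chunks seen before the break
```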
@jgodvliet · 4 days ago
I have an issue at timestamp 8:17 (I am on Windows 11 using the Docker terminal). When I use
ollama_host=0.0.0.0:11434 docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
I get an error: The term 'ollama_host=0.0.0.0:11434' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again. Any idea what went wrong?
@technovangelist · 4 days ago
For windows, try setting it like this: github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-windows
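The error above is PowerShell complaining: the VAR=value command prefix is Bash syntax. In the Docker case the variable can instead be passed into the container with -e, which works the same from any shell (a sketch; flags mirror the command in the question):

```shell
# Pass OLLAMA_HOST into the container instead of prefixing the command.
docker run -d --gpus=all \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama ollama/ollama
```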
@jgodvliet · 4 days ago
Wow thanks Matt! Looking forward to the tutorials!
@MauricioDavid77 · 4 days ago
Video Suggestion: Table-Augmented Generation (TAG). TAG is a unified and general-purpose paradigm for answering natural language questions over databases.
@azharalibhutto1209 · 5 days ago
❤❤❤❤❤❤❤❤❤ Great!
@MeinDeutschkurs · 5 days ago
🎉🎉🎉🎉 Reader-LM is slow, and it needs temperature 0 to work well.
@Pregidth · 5 days ago
Can you please show how to make Pixtral available for Ollama? Thank you!
@technovangelist · 5 days ago
It's not possible right now. It's a different architecture. If it's ever possible, it will be a while.
@marcusk7855 · 5 days ago
Great tutorial. Can you do a non-Mac version of this? I see things like QLoRA but I have no idea even where to start.
@technovangelist · 5 days ago
I plan to. Unfortunately most of the non-Mac tools kinda suck.
@pauledam2174 · 5 days ago
He said that fine-tuning is only for changing how the model responds, but as far as I know it's also for increasing domain expertise.
@switch4636 · 5 days ago
Thank you Matt for this.
@FranzAllanSee · 5 days ago
ollama stop, finally!!! 😅 I no longer have to keep restarting Ollama 😅😂
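For anyone who missed it, the subcommand unloads a running model without restarting the server (the model name is an example):

```shell
# Show which models are currently loaded in memory.
ollama ps

# Unload one of them immediately.
ollama stop llama3.2
```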
@user-sx6nd8zl5k · 5 days ago
I have a question, since you collaborated on the Ollama project: why is Ollama not compatible with lower-end GPUs like the AMD RX 580 or RX 6600, while LM Studio is?
@technovangelist · 5 days ago
It works with AMD just fine. What is the question? Older AMD models are not supported because AMD chose not to support them with the newer drivers.
@Rolandfart · 5 days ago
No more systemctl restart ollama!
@romayojr · 5 days ago
The long awkward pause at the end always gets me.
@AINMEisONE · 5 days ago
I am training the Pixtral model to further enhance creating interfaces. I downloaded it from a server that usually has these. I can say it is pretty good for OCR and for screenshots, to correct errors or read scanned books. Coupling this with Claude Dev or Engineer supercharges the whole process of building real complex apps. Thanks Mistral!
@basantrajshakti4272 · 5 days ago
It is just missing a fine-tuning command with a large custom dataset.
@juliovac2113 · 5 days ago
PIXTRAL?
@technovangelist · 5 days ago
It's only just been released and it's a new architecture. If it can be supported, it might be a while.
@bobdowling6932 · 6 days ago
Would you regard the NER models as cases of fine-tuned models?
@serikazero128 · 6 days ago
What do you think about Ollama or other ways of running LLMs, such as 8B models, on smartphones?
@technovangelist · 6 days ago
That's the main magic feature of Apple Intelligence. It finally makes it doable without sacrificing battery life. At least that's the promise. Every other approach has been terrible.
@serikazero128 · 5 days ago
@technovangelist The only problem I have with that is that I can't run my "custom" model. And that it's Apple, so I can't just write software and put it on my phone, while with Android I can easily make an APK, or even in-development software, and load it on the phone. Thank you for answering, btw :)
@technovangelist · 5 days ago
You can totally write your own software for iOS. It wouldn't be the strong platform it is if you couldn't. You can run custom models today; it's just a terrible experience.
@serikazero128 · 5 days ago
@technovangelist I was under the impression that it needs to be approved into the Apple App Store before I can load it onto an iPhone. Was I mistaken for the past 3 years? ^^'
@technovangelist · 5 days ago
If you are writing it for yourself, then yup, you are mistaken.
@tririfandani1876 · 6 days ago
I'm new here, but I found this content really informative and I like that you're very knowledgeable, so I subscribed.
@ds920 · 6 days ago
My issue is that I need the model loaded always, and I can't find an obvious way of keeping it in memory for longer than 4 minutes :)
@technovangelist · 6 days ago
Just set keep alive to -1.
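The setting can go on an individual request or server-wide. A sketch of both (endpoint and field names per the Ollama API docs; the model name is an example):

```shell
# Per-request: keep_alive of -1 keeps the model loaded indefinitely.
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "hello", "keep_alive": -1}'

# Or server-wide, via an environment variable set before starting the server.
export OLLAMA_KEEP_ALIVE=-1
ollama serve
```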
@abrahamsimonramirez2933 · 6 days ago
I would be interested in using other inference engines with Ollama, but the documentation I've found doesn't have enough detail.
@technovangelist · 6 days ago
Can you explain what you mean by this?
@abrahamsimonramirez2933 · 4 days ago
@technovangelist I mean a model loader or runner other than transformers (for example AirLLM, PowerInfer, llama.mojo, ExLlama, llama.cpp, and other custom ones). Ollama was supposed to have a config somewhere to point at the inference engine, but that's probably some old info? 🤔
@MistikBBQ · 6 days ago
Prompt caching and more support for vision models would be great... In any case, nice update! Thanks.
@technovangelist · 6 days ago
Prompt caching isn't something that would make sense for Ollama. That would be something implemented by whatever tool you or someone else builds that leverages Ollama.