Ollama's Newest Release and Model Breakdown
9:00
Doh! Let's clear up fine tuning
7:14
14 days ago
Fine Tune a model with MLX for Ollama
8:40
AI News for Tuesday August 27, 2024
1:08:46
AI News for Monday August 26, 2024
41:47
AI News for August 23, 2024
46:35
AI News for August 22, 2024
43:59
Optimize Your AI Models
11:43
A month ago
AI News for August 21 2024
35:48
AI News for 2024-08-20
36:24
A month ago
AI News for 2024-08-19
51:38
A month ago
AI News for 2024-08-18
1:04:54
A month ago
AI News for 2024-08-17
30:52
A month ago
AI News for 08/16/2024
1:00:55
A month ago
Comments
@gabemulero3962 17 hours ago
Thank you so much for taking the time to create this.
@tecnopadre 19 hours ago
After the launch of Llama 3.2 1B & 3B, this video should skyrocket, and so should Ollama
@emil8367 1 day ago
Thanks Matt! Is there any list of all env variables, with a description for each, in the Ollama docs?
@toadlguy 1 day ago
Ha, ha, ha. We understand how YouTube works. We either pay for YT Premium or we watch ads and you get paid based on views. You don’t need to announce it is FREE at the beginning of your video. (Thanks for the content, though 😊)
@technovangelist 1 day ago
If that were true I wouldn’t be asked often if it would stay free. Lots put a teaser on YouTube then move the rest to a paid platform.
@willTryAgainTmrw 1 day ago
What does "K_L/M/S" etc mean for quantized models? Why are L larger than M for same quantization?
@emmanuelgoldstein3682 1 day ago
I'm subscribed with all notifications turned on but I didn't get this one for some reason... ☹
@DodiInkoTariah 1 day ago
Thanks Matt for your videos. Please, are you able to do one with instructions for installing Llama 3.2 11B? It would be very helpful for many people, but no pressure.
@sadboi5672 1 day ago
isn't there no 11B model for 3.2? 3.2 only has 1B and 3B variants
@technovangelist 1 day ago
When it works I will. But there isn’t anything special with it.
@NLPprompter 1 day ago
YES YES, environment variables! The first time I needed them was for Obsidian Copilot. It was ChatGPT, Claude, and Discord that taught me about this. I'm really sure this video is undoubtedly needed by many Ollama users.
@MaxJM74 1 day ago
tks 👍
@SonGoku-pc7jl 1 day ago
Today Llama 3.2 3B can use tools :) just saying... ;)
@technovangelist 1 day ago
Llama 3 and 2 all work with tools. Every model works with tools and functions if you use them right.
@SonGoku-pc7jl 1 day ago
With this Mac code, I understand that ollama pull downloads a model again if it's been updated, and if it's not updated, it doesn't download it again? Because I don't see any fetch in the code, and ChatGPT made me think otherwise, I don't know.
@technovangelist 1 day ago
ollama pull will get the latest version of any model if there is a newer version.
@SonGoku-pc7jl 1 day ago
thanks! :)
@akamalov 2 days ago
you rock! Thank you!
@DrexxLaggui 2 days ago
Most excellent vid sir! Can you expand on this by showing how to make RAG perform faster, at 25 tokens per second at least, with several GB or 1000s of md files uploaded to it, please?
@technovangelist 2 days ago
25 tokens per second seems pretty slow. I get double that on my 3 year old Mac.
@AZombieBitMyHand 2 days ago
The "staring at my soul" ending is the best route possible in a visual novel like this one. 🖤
@ricardofranco4114 3 days ago
Can Llama AI tell me how to cheat in video games? 'Cause Google AI won't. Also, can Llama AI draw photos?
@rtpHarry 3 days ago
Ok so we just make the middle of the document spacer text and then leave the real info at the start and end... 😅 16:11
@Leon-AlexisSauer 3 days ago
Is it possible to get a graphical UI?
@technovangelist 3 days ago
sure, there are a lot of choices on the ollama repo
@adamtegen 3 days ago
I'd LOVE the more integrated JSON mode that OpenAI has. I know you can put Ollama in JSON mode, but that's more of a hint that isn't always followed. It would be a GAME CHANGER! From OpenAI's announcement on August 6th: "We are introducing Structured Outputs in the API: model outputs now reliably adhere to developer-supplied JSON Schemas."
@technovangelist 3 days ago
What OpenAI added seems to catch up to tools like Ollama. The tools and JSON support in Ollama have been rock solid for a year. Set it up right and it works 100% of the time. Not a suggestion, but 100% reliable.
@Leon-AlexisSauer 3 days ago
Yo, so far as I understand, Ollama is not an application, right? Or is there a way to get it like that? I am new to this.
@technovangelist 3 days ago
Not sure I understand. Ollama is an application to run ai models
@lakraMundka 3 days ago
I love what you have been sharing on this channel. As a developer I really enjoy the content like in this video which is not just a tutorial on a tool but knowledge that is deeper. Also less code and more control, yes please !
@technovangelist 3 days ago
Thanks for the comment and for being a member
@nigeldogg 4 days ago
Would love to see further deep dive on this including hybrid keyword/semantic search and reranker for large datasets applied with an LLM via Ollama. Thanks for the great tutorial as always!
@AricRastley 4 days ago
Exactly what I was looking for! THANK YOU!
@rtpHarry 4 days ago
Hate the rip-off price of the internal drives in MBPs. I did manage to convince myself it was a better investment, though, by pricing up the alternative cost of an equivalent Google Drive over its lifetime. When you max it out to 4TB it's about 1200 USD. The closest the AI just came up with is 10TB with Google Drive, which is 49.99 a month. Based on the fact that I have had this for about 5 years now, that would have cost 2500 USD. I had a better comparison before, but ChatGPT can't search its own conversation history and I cannot find it right now. So when I get the next upgrade, which I think will be the M4, I will pay their price and convince myself it's ok using maths magic. 😅 And then I keep it all safe from dropping or being stolen with Backblaze.
@jayd8935 4 days ago
Informative as always. Heh. I was just listening, not watching and after the goodbye I heard your water bottle. Remembering to hydrate can be hard, so thanks for the reminder.
@rtpHarry 4 days ago
How to make truly random events? Just hook it up to a lava lamp! Google it. Lava lamps are actually being used to keep the internet secure. 😆 Basically there is a wall of lamps and a webcam pointing at it, and the fluctuations are used as the values for the random numbers.
@SonGoku-pc7jl 4 days ago
Good video. My wish is that in the installer I could select the folder for Ollama models, because on one PC I have very little space on C: haha. And one question: what OCR tool do you recommend for Windows? Thanks :) Sorry for my English, I'm Spanish ;)
@technovangelist 4 days ago
Not sure what you are asking
@technovangelist 3 days ago
You can change the folder for Ollama models. There is an environment variable to set.
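For reference, the variable in question is OLLAMA_MODELS, documented in the Ollama FAQ. A minimal sketch for macOS/Linux, where you launch the server yourself (the path shown is just an example):

```shell
# Point Ollama at a custom models directory, then start the server.
# /data/ollama/models is an example path; any writable directory works.
export OLLAMA_MODELS=/data/ollama/models
ollama serve
```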
@Reality_Check_1984 4 days ago
They need to add ollama stop to the python API. Could really use me some ollama.stop action in my applications.
@technovangelist 4 days ago
It’s been there since the beginning of the python lib. Just stop the generator.
@technovangelist 4 days ago
Sorry. Was thinking about stopping a generation mid stream. To unload the model is easy. Just set keep alive to 0. Same in Python and js.
@jgodvliet 4 days ago
I have an issue at timestamp 8:17 (I am on Windows 11 using the Docker terminal). When I use ollama_host=0.0.0.0:11434 docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama I get an error: "The term 'ollama_host=0.0.0.0:11434' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again." Any idea what went wrong?
@technovangelist 4 days ago
For windows, try setting it like this: github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-windows
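The error above happens because VAR=value command prefixes are a Unix shell feature that PowerShell doesn't understand. A hedged sketch of the Windows-side alternatives, per the linked FAQ (setx persists the variable) and Docker's standard -e flag, since the original command runs Ollama in a container:

```shell
# PowerShell: set OLLAMA_HOST for the current session only
$env:OLLAMA_HOST = "0.0.0.0:11434"

# Or persist it for new terminals (restart Ollama afterwards):
setx OLLAMA_HOST "0.0.0.0:11434"

# For the Docker setup from the comment above, pass the variable
# into the container with -e instead of a shell prefix:
docker run -d --gpus=all -e OLLAMA_HOST=0.0.0.0:11434 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```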
@jgodvliet 4 days ago
Wow thanks Matt! Looking forward to the tutorials!
@MauricioDavid77 4 days ago
Video Suggestion: Table-Augmented Generation (TAG). TAG is a unified and general-purpose paradigm for answering natural language questions over databases.
@azharalibhutto1209 5 days ago
❤❤❤❤❤❤❤❤❤ great
@MeinDeutschkurs 5 days ago
🎉🎉🎉🎉 Reader-LM is slow; it needs temperature 0 to work fine.
@Pregidth 5 days ago
Can you please show how to make Pixtral available for Ollama? Thank you
@technovangelist 5 days ago
It’s not. It’s a different architecture. If it’s ever possible it will be a while.
@marcusk7855 5 days ago
Great tutorial. Can you do a non-Mac version of this? I see things like QLoRA but I have no idea even where to start.
@technovangelist 5 days ago
I plan to. Unfortunately most of the non Mac tools kinda suck.
@pauledam2174 5 days ago
He said that fine-tuning is only for dealing with how the model responds, but it's also for increasing domain expertise, as far as I know.
@switch4636 5 days ago
Thank you Matt for this.
@FranzAllanSee 5 days ago
ollama stop, finally!!! 😅 I no longer have to keep restarting Ollama 😅😂
@user-sx6nd8zl5k 5 days ago
I have a question, because you collaborated with the Ollama project: why is Ollama not compatible with more budget AMD cards like the RX 580 or RX 6600, while the LMStudio project is?
@technovangelist 5 days ago
It works with AMD just fine. What is the question? Older AMD cards are not supported because AMD chose not to support them with the newer drivers.
@Rolandfart 5 days ago
no more systemctl restart ollama
@romayojr 5 days ago
the long awkward pause at the end always gets me
@AINMEisONE 5 days ago
I am training the Pixtral model to further enhance creating interfaces. I downloaded it from a server that usually has these. I can say it is pretty good for OCR and screenshots, to correct errors, or to read scanned books. Coupling this with Claude Dev or Engineer supercharges the whole process of building really complex apps... Thanks Mistral!
@basantrajshakti4272 5 days ago
It is just missing a fine-tuning command with a large custom dataset.
@juliovac2113 5 days ago
PIXTRAL?
@technovangelist 5 days ago
It's just barely been released and it's a new architecture. If it can be supported, it might be a while.
@bobdowling6932 6 days ago
Would you regard the NER models as cases of fine-tuned models?
@serikazero128 6 days ago
what do you think about Ollama or other ways of running LLMs such as 8b models on smartphones?
@technovangelist 6 days ago
That’s the main magic feature of Apple Intelligence. It finally makes it doable without sacrificing battery life. At least that’s the promise. Every other approach has been terrible.
@serikazero128 5 days ago
@@technovangelist The only problem I have with that is that I can't run my "custom" model. And that it's Apple, so I can't just write software and put it on my phone. While with Android I can easily make an APK, or even software still in development, and load it on the phone. Thank you for answering btw :)
@technovangelist 5 days ago
You can totally write your own software for iOS. It wouldn’t be the strong platform it is if you couldn’t. You can run custom models today it’s just a terrible experience.
@serikazero128 5 days ago
@@technovangelist I was under the impression that it needs to be approved into the App Store before I can load it onto an iPhone; was I mistaken for the past 3 years? ^^'
@technovangelist 5 days ago
If you are writing it for yourself then yup you are mistaken
@tririfandani1876 6 days ago
I'm new here, but I can see this content is really informative, and I like that you're very knowledgeable, so I subscribed.
@ds920 6 days ago
My issue is that I need the model loaded always, and I can’t find an obvious way of keeping it in memory for longer than 4 minutes :)
@technovangelist 6 days ago
Just set keep alive to -1
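A minimal sketch of what that looks like against the local REST API (this assumes a server running on the default port; the same keep_alive field exists in the Python and JS client libraries):

```shell
# keep_alive: -1 keeps the model in memory indefinitely; 0 unloads it
# immediately after the response; a duration string like "10m" also works.
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "keep_alive": -1}'
```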
@abrahamsimonramirez2933 6 days ago
I would be interested in using other inference engines with Ollama, but the documentation I've found doesn't have enough detail.
@technovangelist 6 days ago
Can you explain what you mean by this?
@abrahamsimonramirez2933 4 days ago
@@technovangelist The model loader or runner, other than transformers (for example airllm, powerinfer, llama.mojo, exllama, llama.cpp, and other custom ones). Ollama was supposed to have a config somewhere to point at the inference engine, but probably it's some old info? 🤔
@MistikBBQ 6 days ago
Prompt caching and more support for vision models would be great... in any case, nice update! Thanks
@technovangelist
@technovangelist 6 күн бұрын
Prompt caching isn’t something that would make sense for ollama. That would be something implemented by whatever tool you or someone else builds that leverages ollama.