Thank you so much for taking the time to create this.
@tecnopadre · 19 hours ago
After the launch of Llama 3.2 1B & 3B, this video should skyrocket, and so should Ollama.
@emil8367 · 1 day ago
Thanks Matt! Is there a list of all the env variables, with a description of each, in the Ollama docs?
@toadlguy · 1 day ago
Ha, ha, ha. We understand how YouTube works. We either pay for YT Premium or we watch ads and you get paid based on views. You don't need to announce it is FREE at the beginning of your video. (Thanks for the content, though 😊)
@technovangelist · 1 day ago
If that were true I wouldn't be asked so often whether it will stay free. Lots of people put a teaser on YouTube, then move the rest to a paid platform.
@willTryAgainTmrw · 1 day ago
What does "K_L/M/S" etc. mean for quantized models? Why are the L variants larger than the M ones at the same quantization level?
@emmanuelgoldstein3682 · 1 day ago
I'm subscribed with all notifications turned on but I didn't get this one for some reason... ☹
@DodiInkoTariah · 1 day ago
Thanks Matt for your videos. Please, are you able to do one with instructions for installing Llama 3.2 11B? It would be very helpful for many people, but no pressure.
@sadboi5672 · 1 day ago
There's no 11B model for 3.2, is there? 3.2 only has 1B and 3B variants.
@technovangelist · 1 day ago
When it works I will. But there isn't anything special about it.
@NLPprompter · 1 day ago
YES YES, environment variables! The first time I needed them was for the Obsidian Copilot. ChatGPT, Claude, and Discord taught me about this. I'm really sure this video is needed by many Ollama users.
@MaxJM74 · 1 day ago
Thanks 👍
@SonGoku-pc7jl · 1 day ago
Today Llama 3.2 3B can use tools :) just saying... ;)
@technovangelist · 1 day ago
Llama 3 and 2 all work with tools. Every model works with tools and functions if you use them right.
@SonGoku-pc7jl · 1 day ago
From the Mac code, I understand that ollama pull downloads a model again if it's been updated, and doesn't re-download it otherwise? I don't see any fetch in the code, and ChatGPT made me think otherwise, so I don't know.
@technovangelist · 1 day ago
ollama pull will get the latest version of any model if there is a newer version.
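For anyone who wants to check this themselves, the check is cheap to run: Ollama stores models as layers, so re-pulling only downloads what changed. A sketch (the model name is just an example):

```shell
# Pull a model; if a newer version exists upstream, only the changed
# layers are downloaded. If nothing changed, this is a fast no-op.
ollama pull llama3.2

# List locally installed models with their modification times.
ollama list
```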
@SonGoku-pc7jl · 1 day ago
Thanks! :)
@akamalov · 2 days ago
You rock! Thank you!
@DrexxLaggui · 2 days ago
Most excellent vid, sir! Can you expand on this by showing how to make RAG perform faster, at 25 tokens per second at least, with several GB or thousands of md files uploaded to it, please?
@technovangelist · 2 days ago
25 tokens per second seems pretty slow. I get double that on my 3-year-old Mac.
@AZombieBitMyHand · 2 days ago
The "staring at my soul" ending is the best route possible in a visual novel like this one. 🖤
@ricardofranco4114 · 3 days ago
Can Llama AI tell me how to cheat in video games? 'Cause Google AI won't. Also, can Llama AI draw photos?
@rtpHarry · 3 days ago
OK, so we just make the middle of the document spacer text and then leave the real info at the start and end... 😅 16:11
@Leon-AlexisSauer · 3 days ago
Is it possible to get a graphical UI?
@technovangelist · 3 days ago
Sure, there are a lot of choices on the Ollama repo.
@adamtegen · 3 days ago
I'd LOVE the more integrated JSON mode that OpenAI has. I know you can put Ollama in JSON mode, but that's more of a hint that isn't always followed. It would be a GAME CHANGER! From OpenAI's announcement on August 6th: "We are introducing Structured Outputs in the API: model outputs now reliably adhere to developer-supplied JSON Schemas."
@technovangelist · 3 days ago
What OpenAI added seems to catch up to tools like Ollama. The tools and JSON support in Ollama have been rock solid for a year. Set it up right and it works 100% of the time. Not a suggestion, but 100% reliable.
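For anyone wanting to try this, a common pattern is to request JSON from the model (the Ollama Python client accepts format="json") and then validate the reply yourself, retrying on a bad parse. A minimal sketch of the validation side; the actual Ollama call is shown only as a comment, since it needs a running server and a pulled model:

```python
import json

def parse_json_reply(text, default=None):
    """Validate a model reply that is supposed to be JSON.

    Returns the parsed object, or `default` when the reply is not
    valid JSON (the caller can then retry the generation).
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return default

# With a running Ollama server, `text` would come from something like
# (assumption: the `ollama` Python package is installed):
#   import ollama
#   resp = ollama.chat(model="llama3.2", format="json",
#                      messages=[{"role": "user", "content": "Reply in JSON"}])
#   text = resp["message"]["content"]

good = parse_json_reply('{"city": "Oslo", "temp_c": 7}')
bad = parse_json_reply("Sure! Here is your JSON: {oops", default={})
```

The retry loop around a failed parse is what makes the "100% reliable" setup work in practice: a bad reply costs one extra generation instead of a crash.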
@Leon-AlexisSauer · 3 days ago
Yo, as far as I understand, Ollama is not an application, right? Or is there a way to get it like that? I am new to this.
@technovangelist · 3 days ago
Not sure I understand. Ollama is an application to run AI models.
@lakraMundka · 3 days ago
I love what you have been sharing on this channel. As a developer I really enjoy content like this video, which is not just a tutorial on a tool but deeper knowledge. Also, less code and more control? Yes please!
@technovangelist · 3 days ago
Thanks for the comment and for being a member.
@nigeldogg · 4 days ago
Would love to see a further deep dive on this, including hybrid keyword/semantic search and a reranker for large datasets, applied with an LLM via Ollama. Thanks for the great tutorial as always!
@AricRastley · 4 days ago
Exactly what I was looking for! THANK YOU!
@rtpHarry · 4 days ago
Hate the rip-off price of the internal drives in the MBP. I did manage to convince myself it was a better investment, though, by pricing up the alternative cost of an equivalent Google Drive over its lifetime. When you max it out to 4TB it's about 1200 USD. The closest the AI just came up with is 10TB with Google Drive, which is 49.99 a month. Based on the fact that I have had this for about 5 years now, that would have cost 2500 USD. I had a better comparison before, but ChatGPT can't search its own conversation history and I cannot find it right now. So when I get the next upgrade, which I think will be the M4, I will pay their price and convince myself it's OK using maths magic. 😅 And then I keep it all safe from dropping or being stolen with Backblaze.
@jayd8935 · 4 days ago
Informative as always. Heh. I was just listening, not watching, and after the goodbye I heard your water bottle. Remembering to hydrate can be hard, so thanks for the reminder.
@rtpHarry · 4 days ago
How to make truly random events? Just hook it up to a lava lamp! Google it. Lava lamps are actually being used to keep the internet secure. 😆 Basically there is a wall of lamps and a webcam pointing at it, and the fluctuations are used as the values for the random numbers.
@SonGoku-pc7jl · 4 days ago
Good video! My wish: in the installer, let me select the folder for Ollama models, because on one PC I have little space on C: haha. And one question: what OCR tool do you recommend for Windows? Thanks :) Sorry for my English, I'm Spanish ;)
@technovangelist · 4 days ago
Not sure what you are asking.
@technovangelist · 3 days ago
You can change the folder for Ollama models; there is an environment variable to set.
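For reference, the variable in question is OLLAMA_MODELS. A sketch of pointing it at another drive (the path is an example; on Windows it's set through System Environment Variables rather than with export):

```shell
# Linux/macOS: store models on a bigger disk (example path).
export OLLAMA_MODELS=/mnt/bigdisk/ollama-models

# The server reads the variable at startup, so restart it afterwards.
ollama serve
```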
@Reality_Check_1984 · 4 days ago
They need to add ollama stop to the python API. Could really use me some ollama.stop action in my applications.
@technovangelist · 4 days ago
It's been there since the beginning of the Python lib. Just stop the generator.
@technovangelist · 4 days ago
Sorry, I was thinking about stopping a generation mid-stream. Unloading the model is easy: just set keep alive to 0. Same in Python and JS.
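"Just stop the generator" means breaking out of the streaming loop; once the generator is abandoned, no more tokens are produced. A self-contained sketch, using a stand-in generator in place of a real streaming call like ollama.generate(..., stream=True):

```python
def fake_stream():
    """Stand-in for a streaming call such as ollama.generate(..., stream=True),
    which yields response chunks one at a time."""
    for token in ["The", " quick", " brown", " fox", " jumps"]:
        yield token

collected = []
for chunk in fake_stream():
    collected.append(chunk)
    if len(collected) == 3:  # any stop condition you like
        break  # leaving the loop stops consuming the generator

partial = "".join(collected)  # only the chunks seen before the break
```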
@jgodvliet · 4 days ago
I have an issue at timestamp 8:17 (I am on Windows 11 using the Docker terminal). When I use
ollama_host=0.0.0.0:11434 docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
I get an error: The term 'ollama_host=0.0.0.0:11434' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again. Any idea what went wrong?
@technovangelist · 4 days ago
For windows, try setting it like this: github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-windows
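The error above is PowerShell complaining: the VAR=value command prefix is Bash syntax. In the Docker case the variable can instead be passed into the container with -e, which works the same from any shell (a sketch; flags mirror the command in the question):

```shell
# Pass OLLAMA_HOST into the container instead of prefixing the command.
docker run -d --gpus=all \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama ollama/ollama
```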
@jgodvliet · 4 days ago
Wow thanks Matt! Looking forward to the tutorials!
@MauricioDavid77 · 4 days ago
Video Suggestion: Table-Augmented Generation (TAG). TAG is a unified and general-purpose paradigm for answering natural language questions over databases.
@azharalibhutto1209 · 5 days ago
❤❤❤❤❤❤❤❤❤ Great!
@MeinDeutschkurs · 5 days ago
🎉🎉🎉🎉 Reader-LM is slow, and it needs temperature 0 to work well.
@Pregidth · 5 days ago
Can you please show how to make Pixtral available for Ollama? Thank you!
@technovangelist · 5 days ago
It's not possible right now. It's a different architecture. If it's ever possible, it will be a while.
@marcusk7855 · 5 days ago
Great tutorial. Can you do a non-Mac version of this? I see things like QLoRA but I have no idea even where to start.
@technovangelist · 5 days ago
I plan to. Unfortunately most of the non-Mac tools kinda suck.
@pauledam2174 · 5 days ago
He said that fine-tuning is only for changing how the model responds, but as far as I know it's also for increasing domain expertise.
@switch4636 · 5 days ago
Thank you Matt for this.
@FranzAllanSee · 5 days ago
ollama stop, finally!!! 😅 I no longer have to keep restarting Ollama 😅😂
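For anyone who missed it, the subcommand unloads a running model without restarting the server (the model name is an example):

```shell
# Show which models are currently loaded in memory.
ollama ps

# Unload one of them immediately.
ollama stop llama3.2
```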
@user-sx6nd8zl5k · 5 days ago
I have a question, since you collaborated on the Ollama project: why is Ollama not compatible with lower-end GPUs like the AMD RX 580 or RX 6600, while LM Studio is?
@technovangelist · 5 days ago
It works with AMD just fine. What is the question? Older AMD models are not supported because AMD chose not to support them with the newer drivers.
@Rolandfart · 5 days ago
No more systemctl restart ollama!
@romayojr · 5 days ago
The long awkward pause at the end always gets me.
@AINMEisONE · 5 days ago
I am training the Pixtral model to further enhance creating interfaces. I downloaded it from a server that usually has these. I can say it is pretty good for OCR and for screenshots, to correct errors or read scanned books. Coupling this with Claude Dev or Engineer supercharges the whole process of building real complex apps. Thanks Mistral!
@basantrajshakti4272 · 5 days ago
It is just missing a fine-tuning command with a large custom dataset.
@juliovac2113 · 5 days ago
PIXTRAL?
@technovangelist · 5 days ago
It's only just been released and it's a new architecture. If it can be supported, it might be a while.
@bobdowling6932 · 6 days ago
Would you regard the NER models as cases of fine-tuned models?
@serikazero128 · 6 days ago
What do you think about Ollama or other ways of running LLMs, such as 8B models, on smartphones?
@technovangelist · 6 days ago
That's the main magic feature of Apple Intelligence. It finally makes it doable without sacrificing battery life. At least that's the promise. Every other approach has been terrible.
@serikazero128 · 5 days ago
@technovangelist The only problem I have with that is that I can't run my "custom" model. And that it's Apple, so I can't just write software and put it on my phone, while with Android I can easily make an APK, or even in-development software, and load it on the phone. Thank you for answering, btw :)
@technovangelist · 5 days ago
You can totally write your own software for iOS. It wouldn't be the strong platform it is if you couldn't. You can run custom models today; it's just a terrible experience.
@serikazero128 · 5 days ago
@technovangelist I was under the impression that it needs to be approved into the Apple App Store before I can load it onto an iPhone. Was I mistaken for the past 3 years? ^^'
@technovangelist · 5 days ago
If you are writing it for yourself, then yup, you are mistaken.
@tririfandani1876 · 6 days ago
I'm new here, but I found this content really informative and I like that you're very knowledgeable, so I subscribed.
@ds920 · 6 days ago
My issue is that I need the model loaded always, and I can't find an obvious way of keeping it in memory for longer than 4 minutes :)
@technovangelist · 6 days ago
Just set keep alive to -1.
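The setting can go on an individual request or server-wide. A sketch of both (endpoint and field names per the Ollama API docs; the model name is an example):

```shell
# Per-request: keep_alive of -1 keeps the model loaded indefinitely.
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "hello", "keep_alive": -1}'

# Or server-wide, via an environment variable set before starting the server.
export OLLAMA_KEEP_ALIVE=-1
ollama serve
```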
@abrahamsimonramirez2933 · 6 days ago
I would be interested in using other inference engines with Ollama, but the documentation I've found doesn't have enough detail.
@technovangelist · 6 days ago
Can you explain what you mean by this?
@abrahamsimonramirez2933 · 4 days ago
@technovangelist I mean a model loader or runner other than transformers (for example AirLLM, PowerInfer, llama.mojo, ExLlama, llama.cpp, and other custom ones). Ollama was supposed to have a config somewhere to point at the inference engine, but that's probably some old info? 🤔
@MistikBBQ · 6 days ago
Prompt caching and more support for vision models would be great... In any case, nice update! Thanks.
@technovangelist · 6 days ago
Prompt caching isn't something that would make sense for Ollama. That would be something implemented by whatever tool you or someone else builds that leverages Ollama.