Llama 3.2 3b Review: Self-Hosted AI Testing on Ollama - Open Source LLM Review

9,012 views

Digital Spaceport

1 day ago

Comments: 46
@DigitalSpaceport 21 days ago
AI Hardware Writeup digitalspaceport.com/homelab-ai-server-rig-tips-tricks-gotchas-and-takeaways
@enermaxstephens1051 1 month ago
There are things like GPT4All that let you run on the CPU and system RAM instead of a GPU, for people who don't have a big, bad video card.
@jasonabc 2 months ago
I did one test: uploaded a PDF and asked it to summarize the document. It spit out gibberish that wasn't even on the same topic as the paper. So obviously, how could I begin to trust anything from this if it fails something so simple?
@DigitalSpaceport 2 months ago
It does seem to be pretty bad, frankly.
@nathantruitt2740 1 month ago
Honestly, with the arbitrary number arrays it took an interesting approach to an otherwise confusing question. A is 65 in ASCII decimal format :)
@shuntera 2 months ago
What are you running in the PowerShell window in the top right that shows your GPU status?
@DigitalSpaceport 2 months ago
I'm actually SSH'd into the host computer (guide here: kzbin.info/www/bejne/ip6xhHehn6mHhdU) and running nvtop.
@marhensa 17 days ago
Why is there no Llama 3.2 8b?
@vfxvision723 1 month ago
Thanks for the video. What do you use to run AI models locally?
@DigitalSpaceport 1 month ago
The software I use? I'm not clear on what you're asking, sorry.
@Ikhsaints 1 month ago
@DigitalSpaceport Maybe he means something like LM Studio.
@IlllIlllIlllIlll 2 months ago
16:00 Synthetic as in synthetic data?
@DigitalSpaceport 2 months ago
Synthetic benchmarks like MMLU.
@ugurinanc5177 2 months ago
How can you quantize that model? And can we fine-tune a model we downloaded from Ollama?
@DigitalSpaceport 2 months ago
This is a good jumping-off point for fine-tuning, and it answers your questions. In short: yes, you can, and the -q flag in ollama will create the specific quant you want. The Hugging Face fine-tune adapter framework and safetensors adapter exist for Llama 3.1, so I would assume they do for 3.2 as well. That's a good starting point for you. github.com/ollama/ollama/blob/main/docs/import.md
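As a rough sketch of the quantization step mentioned above (flag and general syntax per the linked Ollama import docs; the model path, quant level, and output name here are placeholders, not from the video):

```shell
# Modelfile pointing at a full-precision model you've imported or fine-tuned
# (path is a placeholder):
#   FROM /path/to/llama-3.2-3b-fp16
#
# Create a quantized variant; --quantize (-q) picks the quant level.
ollama create --quantize q4_K_M llama3.2-3b-q4 -f Modelfile
```

This only quantizes an existing full-precision model; the fine-tuning itself happens upstream (e.g. via Hugging Face adapters) before the import step.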
@omarnahdi3380 1 month ago
Hey, I don't get Markdown formatting of the code in the output. Can you explain how you got it?
@DigitalSpaceport 1 month ago
Llama 3.2 seems to have an issue with creating valid Markdown, judging from repeated attempts. You should try Qwen 2.5 or another model. I'm not sure how to fix bad outputs.
@ElementX32 2 months ago
Do you think I can run this on the desktop I just bought, a Digital Storm Velox with one 4090 GPU, 128 GB of memory, and an AMD Ryzen 9 9950X (16-core, 5.7 GHz turbo, Zen 5)? I know multiple GPUs are preferable, but that's beyond my budget at the moment.
@BrennanMetzelaar 2 months ago
You should be able to run 3b easily. There's a website that shows recommended specs for Llama 3.2, and you surpass them easily (for 3b). Type something like "Llama 3.2 requirements" into a search engine. I thought it was Meta's website, but it looks like it's an enthusiast site. I have a Ryzen 5800X and an RTX 3080, which is worse than your setup, and it runs just fine.
@DigitalSpaceport 2 months ago
Yes, you can run this on Windows. Windows does impact performance negatively a bit, but you should easily be able to run a 22b or smaller model on that card at a decent quant level. I wouldn't go lower than 4-bit myself.
@hookflash699 2 months ago
Nice vid! What UI are you using?
@jakohema8429 2 months ago
@hookflash699 Open WebUI
@thepulpfloyd 2 months ago
He's using Open WebUI 👍🏻
@DavidVincentSSM 2 months ago
I agree that the benchmark results don't seem to match real-world performance. Maybe there's something everyone else is missing?
@DigitalSpaceport 2 months ago
@DavidVincentSSM I'm not trying to LARP as a pro or anything, but I am interested in what makes for a good product. I'm increasingly thinking that benchmarks don't make for a good product.
@zurgmuckerberg 2 months ago
Will you test the 11b one?
@DigitalSpaceport 2 months ago
Okay.
@junkzencka 2 months ago
The model's answer: "Here's how some letters relate to numbers using ASCII: M = 77, S = 83, Z = 90." The answer is correct when using the ASCII table. It even tells you what it's doing! Look up the ASCII table.
@fork407 1 month ago
No, because it was given A=0 in the context.
@michaelwpannekoek 2 months ago
What *top program are you running under WSL2?
@DigitalSpaceport 2 months ago
htop and nvtop, though I'm not sure nvtop runs under WSL2.
@DeepThinker193 2 months ago
This is the result of AI inbreeding, a.k.a. training on synthetic data. I have a prompt that gets counting, etc., consistently accurate on Llama 3.1 8b. However, the 3.2 models get things wrong all the time.
@DigitalSpaceport 2 months ago
Oh, that's a great term! AI inbreeding 😅
@mayankmaurya8631 2 months ago
Are you sure it wasn't because of 3b vs 8b?
@DeepThinker193 2 months ago
@mayankmaurya8631 Nope, I tried my prompts on the 3.2 11b and 3.2 90b as well. They're just inferior and keep getting things wrong. I get consistently correct responses from Llama 3.1 8b using my special prompts.
@bbertram2 2 months ago
It got it right for me... (or so I thought, lol). "If A is equal to the number 0, what is the number of M, S and Z?" llama3.2:latest: "Based on the standard ordering of the alphabet, where A is indeed equal to 0: M = 13, S = 19, Z = 26. Let me know if you have any further questions!"
@DigitalSpaceport 2 months ago
If A=0, then B=1... M=12... S=18... Z=25.
@bbertram2 2 months ago
@DigitalSpaceport Hahahaha... yeah. Oh well.
@DigitalSpaceport 2 months ago
I was surprised that a few other models did get this right, and I've noticed it also tracks closely with the letter-counting question.
@bbertram2 2 months ago
I went back and asked it again; same answer. Tried 3.1, and it couldn't get it either. However, when I gave it one clue, A=0 and B=1, boom, it got it. Probably too easy of a clue, but I'm surprised it couldn't answer without it. I asked Claude and ChatGPT; they couldn't get it either... very odd. Good question!
@DigitalSpaceport 2 months ago
Qwen 2.5 here - kzbin.info/www/bejne/moDVmHykZ9-nrLssi=mb33EAbMjXk55YC3&t=555
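The A=0 mapping discussed in this thread is easy to check mechanically; a minimal Python sketch (the helper name is mine):

```python
def letter_index(ch: str) -> int:
    """Position of a letter in the alphabet with A=0 (not its ASCII code)."""
    return ord(ch.upper()) - ord("A")

for ch in "MSZ":
    print(ch, "=", letter_index(ch))
# M = 12, S = 18, Z = 25
```

Note the models quoted above slipped into either 1-based positions (M=13) or raw ASCII codes (M=77); the A=0 instruction in the prompt rules out both.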
@_s.i.s.u. 2 months ago
You're prompting the model wrong. The "strawberry" tests fail because of the given model's tokenization. Prompt it as if you wanted to place the sentence into an array, then ask for the third word's second letter. It won't fail.
@DigitalSpaceport 2 months ago
@_s.i.s.u. Other models I've tested, including Qwen 2.5, can and do nail that exact question, copy-pasted. If a question has to be asked in a specific way to elicit a correct response, that's a failure.
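The array-style workaround described in this thread amounts to doing the indexing outside the model; in plain Python (the example sentence is mine, not from the video):

```python
# Split the sentence into a word array, then take the third word's second
# letter, the kind of question tokenization-sensitive prompts tend to fumble.
sentence = "the quick brown fox jumps over the lazy dog"
words = sentence.split()
third_word = words[2]          # index 2 = third word -> "brown"
second_letter = third_word[1]  # -> "r"

# The classic letter-counting check from the same family of tests:
r_count = "strawberry".count("r")  # -> 3
print(third_word, second_letter, r_count)
```

The point of the tokenization argument is that a model sees subword tokens, not characters, so character-level questions are hard unless the prompt forces an explicit decomposition like this.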
@mcunumberone613 2 months ago
Is there any possibility of earning money with this model?