AI Hardware Writeup digitalspaceport.com/homelab-ai-server-rig-tips-tricks-gotchas-and-takeaways
@enermaxstephens1051 1 month ago
There are things like GPT4All, which lets you run models on the CPU and system RAM instead of a GPU, for people who don't have a big bad video card.
@jasonabc 2 months ago
I did one test: uploaded a PDF and asked it to summarize the document. It spit out gibberish that wasn't even on the same topic as the paper. So obviously, how could I begin to trust anything from this if it fails something that simple?
@DigitalSpaceport 2 months ago
It does seem to be pretty bad, frankly.
@nathantruitt2740 1 month ago
Honestly, with arbitrary number arrays it took an interesting approach to an otherwise confusing question. 'A' is 65 in ASCII decimal format :)
@shuntera 2 months ago
What are you running in the PowerShell window in the top right that shows your GPU status?
@DigitalSpaceport 2 months ago
I'm actually SSH'd into the host computer (guide here: kzbin.info/www/bejne/ip6xhHehn6mHhdU) and running nvtop.
@marhensa 17 days ago
Why is there no Llama 3.2 8B?
@vfxvision723 1 month ago
Thanks for the video. What do you use to run AI models locally?
@DigitalSpaceport 1 month ago
The software I use? I'm not clear on what you're asking, sorry.
@Ikhsaints 1 month ago
@@DigitalSpaceport Maybe he means something like LM Studio.
@IlllIlllIlllIlll 2 months ago
16:00 Synthetic as in synthetic data?
@DigitalSpaceport 2 months ago
Synthetic benchmarks like MMLU.
@ugurinanc5177 2 months ago
How can you quantize that model? And can we fine-tune a model downloaded from Ollama?
@DigitalSpaceport 2 months ago
This is a good jumping-off point for fine-tuning, and it answers your questions. In short, you can do both: the -q flag in ollama can create the specific quant you want, and Hugging Face's fine-tune adapter framework with safetensors adapters exists for Llama 3.1, so I would assume it does for 3.2 as well. A good starting point: github.com/ollama/ollama/blob/main/docs/import.md
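For reference, the import doc linked above describes a Modelfile-based flow for bringing a fine-tuned adapter into Ollama. A minimal sketch might look like the following; both paths are placeholders, and the exact directives should be checked against the current Ollama docs:

```
# Hypothetical Modelfile: import a base model plus a fine-tuned LoRA adapter.
# Both paths below are placeholders, not real files.
FROM /path/to/base/safetensors
ADAPTER /path/to/lora/adapter
```

Then something like `ollama create my-model -f Modelfile` builds the model, and passing the -q flag mentioned above (e.g. `-q Q4_K_M`) requests a specific quantization; flag spelling may vary across Ollama versions.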
@omarnahdi3380 1 month ago
Hey, I don't get Markdown formatting of the code in the output. Can you explain how you got it?
@DigitalSpaceport 1 month ago
Llama 3.2 seems to have trouble producing valid Markdown; I've seen it over repeated attempts. Try Qwen 2.5 or another model. I'm not sure how to fix bad outputs otherwise.
@ElementX32 2 months ago
Do you think I can run this on the desktop I just bought, a Digital Storm Velox with one 4090 GPU, 128 GB of memory, and an AMD Ryzen 9 9950X (16-core, 5.7 GHz turbo, Zen 5)? I know multiple GPUs are preferable, but that's beyond my budget at the moment.
@BrennanMetzelaar 2 months ago
You should be able to run 3B easily. There's a website that shows recommended specs for Llama 3.2, and you surpass them easily (for 3B). Search for something like 'Llama 3.2 requirements'. I thought it was Meta's website, but it looks like an enthusiast site. I have a Ryzen 5800X and an RTX 3080, a weaker setup than yours, and it runs just fine.
@DigitalSpaceport 2 months ago
Yes, you can run this on Windows. Windows does hurt performance a bit, but you should easily be able to run a 22B or smaller model on that card at a decent quant level. I wouldn't go below 4-bit myself.
@hookflash699 2 months ago
Nice vid! What UI are you using?
@jakohema8429 2 months ago
@@hookflash699 Open WebUI
@thepulpfloyd 2 months ago
He's using Open WebUI 👍🏻
@DavidVincentSSM 2 months ago
I agree that the results don't seem to match the benchmarks in real-world performance. Maybe it's something everyone else is missing?
@DigitalSpaceport 2 months ago
@@DavidVincentSSM I'm not trying to LARP as a pro or anything, but I am interested in what makes for a good product, and I'm increasingly convinced that benchmarks don't make one.
@zurgmuckerberg 2 months ago
Will you test the 11B one?
@DigitalSpaceport 2 months ago
Okay
@junkzencka 2 months ago
The model's answer: "Here's how some letters relate to numbers using ASCII: M = 77, S = 83, Z = 90." The answer is correct when using the ASCII table, and it even tells you what it's doing! Look up the ASCII table.
@fork407 1 month ago
No, because it was given A=0 in the context.
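For reference, the ASCII reading the model gave can be checked in a couple of lines of Python (my illustration, not from the video):

```python
# Uppercase ASCII codes start at A = 65, so M, S, Z land at 77, 83, 90.
for ch in "MSZ":
    print(ch, "=", ord(ch))
# M = 77, S = 83, Z = 90
```

Both commenters are right about the arithmetic; the disagreement is only over which mapping the prompt asked for.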
@michaelwpannekoek 2 months ago
What *top program are you running under WSL2?
@DigitalSpaceport 2 months ago
htop and nvtop, though I'm not sure nvtop runs under WSL2.
@DeepThinker193 2 months ago
This is the result of AI inbreeding, a.k.a. training on synthetic data. I have a prompt that gets counting and the like consistently accurate on Llama 3.1 8B. On the 3.2 models, however, they get things wrong all the time.
@DigitalSpaceport 2 months ago
Oh, that's a great term! AI inbreeding 😅
@mayankmaurya8631 2 months ago
Are you sure it wasn't because of 3B vs 8B?
@DeepThinker193 2 months ago
@@mayankmaurya8631 Nope, I tried my prompts on the 3.2 11B and 3.2 90B as well. They're just inferior and keep getting things wrong. I get consistently correct responses from Llama 3.1 8B using my special prompts.
@bbertram2 2 months ago
It got it right for me... (or so I thought, lol.) I asked: "If A is equal to number 0, what is the number of M, S and z?" llama3.2:latest answered: "Based on the standard ordering of the alphabet, where A is indeed equal to 0: M = 13, S = 19, Z = 26. Let me know if you have any further questions!"
@DigitalSpaceport 2 months ago
If A=0, then B=1 ... M=12 ... S=18 ... z=25.
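The A=0 mapping in the reply above is just each letter's zero-based position in the alphabet, which is easy to sanity-check in Python (my sketch, not from the video):

```python
def letter_index(ch: str) -> int:
    """Zero-based alphabet position: A=0, B=1, ..., Z=25 (case-insensitive)."""
    return ord(ch.upper()) - ord("A")

for ch in "MSz":
    print(ch, "=", letter_index(ch))
# M = 12, S = 18, z = 25
```

The model's answer of 13/19/26 is the one-based position, i.e. it silently treated A as 1 despite being told A=0.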
@bbertram2 2 months ago
@@DigitalSpaceport Hahahaha... yeah. Oh well.
@DigitalSpaceport 2 months ago
I was surprised that a few other models did get this right, and I've noticed it tracks closely with the letter-counting question too.
@bbertram2 2 months ago
I went back and asked it again; same answer. Tried 3.1, which couldn't get it either. But when I gave it one clue, A=0 and B=1, boom, it got it. Probably too easy a clue, but I'm surprised it couldn't answer without it. I asked Claude and ChatGPT, and they couldn't get it either. Very odd. Good question!
@DigitalSpaceport 2 months ago
Qwen 2.5 gets it here: kzbin.info/www/bejne/moDVmHykZ9-nrLssi=mb33EAbMjXk55YC3&t=555
@_s.i.s.u. 2 months ago
You're prompting the model wrong. The "strawberry" tests fail because of the tokenization method of the given model. Prompt it as if you wanted to place the sentence into an array, then ask for the second letter of the third word. It won't fail.
@DigitalSpaceport 2 months ago
@@_s.i.s.u. Other models I've tested, including Qwen 2.5, can and do nail that exact question, copy-pasted. If a question has to be asked in a specific way to elicit a correct response, that's a failure.
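The array-style workaround described above (split the sentence first, then index) amounts to the following; the example sentence is made up for illustration:

```python
# Split the sentence into a word array first, then index into it,
# instead of asking the model to inspect characters across tokens.
sentence = "the quick brown fox jumps"  # made-up example
words = sentence.split()
print(words[2])      # third word -> "brown"
print(words[2][1])   # its second letter -> "r"

# The classic letter-counting test the thread alludes to:
print("strawberry".count("r"))  # -> 3
```

The idea is that explicit decomposition sidesteps tokenization, which groups characters into multi-character tokens the model can't easily count across.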
@mcunumberone613 2 months ago
Is there any possibility of earning money with this model?