Llama-3.3: The BEST Opensource LLM EVER! Beats GPT-4o! (Fully Tested)

  11,381 views

WorldofAI

1 day ago

Comments: 53
@intheworldofai 1 month ago
Want to HIRE us to implement AI into your Business or Workflow? Fill out this work form: td730kenue7.typeform.com/to/WndMD5l7 💗 Thank you so much for watching guys! I would highly appreciate it if you subscribe (turn on notification bell), like, and comment what else you want to see! 📆 Book a 1-On-1 Consulting Call With Me: calendly.com/worldzofai/ai-consulting-call-1 🔥 Become a Patron (Private Discord): patreon.com/WorldofAi 🧠 Follow me on Twitter: twitter.com/intheworldofai Love y'all and have an amazing day fellas. Thank you so much guys! Love y'all!
@lancemarchetti8673 1 month ago
I used 3.3 in glhf last night to build a qrcode storage concept. After 4 hours I was so excited because the app actually worked. 3.3 was particularly good at figuring out why my qrcode outputs from the design grid were not passing qrcode scanning tests. It turned out that there was a faint grey 1px outline bordering each cell that was exported to the respective image format. I've noticed that keeping explanations simple in the prompting is key. Also, just paste issues from the console without explaining. The LLM is pretty sharp at figuring out the issue. Overall... I'm impressed.
@AaronBlox-h2t 1 month ago
a qrcode storage concept?
@intheworldofai 1 month ago
[Must Watch]: Qwen-2.5: The BEST Opensource LLM EVER! (Beats Llama 3.1-405B + On Par With GPT-4o): kzbin.info/www/bejne/r5WTnJp6rNCZsJIsi=Uh2eCpIWYpcY54Hq DeepSeek-v2.5: BEST Opensource LLM! (Beats Claude, GPT-4o, & Gemini) - Full Test: kzbin.info/www/bejne/o6fTnI1nrqusbdEsi=NR9ChO50-HKJW9Cb Bolt.New + Ollama: AI Coding Agent BEATS v0, Cursor, Bolt.New, & Cline! - 100% Local + FREE!: kzbin.info/www/bejne/kKDSoJ2Mab93g9k
@intheworldofai 1 month ago
Warp: AI Terminal - FREE AI Coding Agent BEATS Cursor, V0 & Bolt! (Generate Full-Stack Apps!): kzbin.info/www/bejne/q4PUpnhme8-pfM0
@intheworldofai 1 month ago
Gemini 2.0 Flash: BEST LLM Ever! Beats Claude 3.5 Sonnet + o1! (Fully Tested): kzbin.info/www/bejne/onanYayCp7qYoq8
@jackinsights 1 month ago
Great analysis mate.
@intheworldofai 1 month ago
Thank you so much!
@MatthewSanders-l7k 1 month ago
Love the enthusiasm for Llama 3.3! I'm definitely trying it out; cheaper and better than GPT-4 Omni is a win-win for me. Huge props to the Meta AI team!
@jeffwads 1 month ago
LM Studio has their own version of the model. Tested the 8bit and it is fantastic. Refactored a large 1200 line class into 650 lines including notes. Only a few minor adjustments and it worked beautifully.
@SahilP2648 1 month ago
Squashing lines of code doesn't mean it's more performant or even cleaner. I am not sure why you mentioned only lines of code and nothing else.
@HansKonrad-ln1cg 1 month ago
i have a 1 gallon jug, a 10 gallon jug, a 170 gallon jug and a 200 gallon jug. i want to measure exactly 173 gallons. how do i do that? all the models i have tested have failed this. the solution is obviously to pour a full 170 gallon jug into the 200 gallon one and then add a full 1 gallon jug three times to that. all the models either overthink it hopelessly or dont even add the right amount or want to fill jugs more than they can hold, etc. pretty sobering experience in case you got too excited about agi and stuff.
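The pouring sequence described in this comment can be checked mechanically. Below is a minimal Python sketch that simulates it; the jug labels and the `fill` / `pour` helpers are hypothetical names introduced only for illustration, and the only rule assumed is that a pour stops when the source is empty or the destination is full.

```python
# Jug capacities in gallons, taken from the puzzle statement.
CAPACITY = {"j1": 1, "j10": 10, "j170": 170, "j200": 200}


def fill(state, jug):
    """Return a new state with `jug` filled to its capacity."""
    new_state = dict(state)
    new_state[jug] = CAPACITY[jug]
    return new_state


def pour(state, src, dst):
    """Pour from `src` into `dst` until `src` is empty or `dst` is full."""
    amount = min(state[src], CAPACITY[dst] - state[dst])
    new_state = dict(state)
    new_state[src] -= amount
    new_state[dst] += amount
    return new_state


def solve():
    """Replay the commenter's solution: 170 gallons, then 1 gallon three times."""
    state = {jug: 0 for jug in CAPACITY}
    state = fill(state, "j170")
    state = pour(state, "j170", "j200")
    for _ in range(3):
        state = fill(state, "j1")
        state = pour(state, "j1", "j200")
    return state


if __name__ == "__main__":
    print(solve())
```

The 10 gallon jug is never touched, which is part of what makes the puzzle a useful distractor test: the arithmetic is just 170 + 1 + 1 + 1 = 173.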
@alinayossimouse 1 month ago
how is that butterfly a pass? Maybe you should ask Llama 3.3 how many wings a butterfly has
@pr0d1gyvisions74 1 month ago
That butterfly looked like Mr Hankey with wings. 🤣🤣
@intheworldofai 1 month ago
HAHAHA south park ref
@AlexJohnson-g4n 1 month ago
Llama 3.3 sounds like a beast! Can't wait to try it out and see how it compares to GPT-4 Omni. Props to the Meta AI team for making AI more accessible.
@julienarpin5745 1 month ago
It performs no better than Llama 3.1 on my private benchmark of verbal intelligence based on years of items encountered in the IQ testing community. My benchmark has matched prevailing benchmark leaderboards for years but is now diverging, indicating that these LLMs are gaining skillsets but not true intelligence, which is the ability to perform work on arbitrarily distant symbols in concept space, relations modelling, or sensitivity to analogies.
@marcus-b4x3h 1 month ago
Make a video ❤
@8python8 1 month ago
They are no more than freaking parrots. If something is in their training set, you get a reasonable answer; otherwise it's either BS or FOS, depending on how deep it hallucinates. No creativity whatsoever, just parroting its training set like a 5 yo with a huge SSD.
@julienarpin5745 1 month ago
@pythonate2906 Not exactly true since cranking the temperature up changes some function that enables novelty, which is arguably creativity as a whole. It's all about how you traverse the latent space with prompting and fine-tuning
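For readers unfamiliar with the temperature setting mentioned in this reply: in most samplers it divides the model's logits before the softmax, so higher values flatten the next-token distribution and lower values sharpen it. A minimal sketch with made-up logits (the function name and the numbers are illustrative assumptions, not taken from any particular library):

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(example_logits, t)
        print(t, [round(p, 3) for p in probs])
```

At higher temperature the least likely token gets a larger share of the probability mass, which is where the extra variety in the output comes from. Whether that counts as creativity is the question the thread is arguing about.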
@threepe0 1 month ago
@8python8 You are no more than a parrot. How you use agents and teams is limited only by your imagination.
@8python8 1 month ago
@threepe0 I guess stating the truth is disturbing. Not just me saying these things. But I wouldn't waste my time arguing. If something is NOT in the training set, parrots won't come up with it. I am testing NEW things, never been tested before, unlike your trivial daily BS.
@xierxu 23 hours ago
Are you running it locally?
@ABUNDANCEandBEYONDATHLETE 1 month ago
Idk if I downloaded a bad version, but my liquid-cooled 4090, 3970X w/256GB RAM is only pumping out like 0.5-2 tokens a sec. That's not functional to even benchmark if it's too slow. The qwen2.5 32B coder is like 30-60 tokens a sec.
@lmaoanatorhilarious2582 1 month ago
Does this have reasoning
@intheworldofai 1 month ago
Yup. Logical, analogical, and common sense reasoning
@tomgreen8246 1 month ago
Sort of. Not in the o1 Preview sense... let alone o1 Pro (oh. my. God). Yes I dropped the money for it.
@intheworldofai 1 month ago
@tomgreen8246 wow let me know how the tests go for the o1 pro!
@lmaoanatorhilarious2582 1 month ago
@tomgreen8246 yeah I want that o1 reasoning. How do you like o1 pro? How does it differ from o1 preview?
@UCs6ktlulE5BEeb3vBBOu6DQ 1 month ago
QWEN2.5 72B and QwQ both ace the train question (tried it while watching this)
@intheworldofai 1 month ago
QwQ and the new DeepSeek reasoning model are quite impressive. Great alternatives to o1.
@UCs6ktlulE5BEeb3vBBOu6DQ 1 month ago
@intheworldofai QwQ is better at coding, 72B is better at creating the code. If one could, he'd get 72B to create the program structure and tell QwQ to code it.
@ABUNDANCEandBEYONDATHLETE 1 month ago
What LM Studio settings should I use for this model? The Q4_K_S is so slow on my machine. Loading 131072 context is laughably slow. 4090 with 256GB and 3970X 32-core AMD. Other models have no issues.
@intheworldofai 1 month ago
@ABUNDANCEandBEYONDATHLETE Try decreasing the batch size to a smaller value. Plus you can also set model parallelism to False.
@UCs6ktlulE5BEeb3vBBOu6DQ 1 month ago
@ABUNDANCEandBEYONDATHLETE You are running a model that is slow on a GPU with VRAM speed under 600 GB/s, on a system that offloads too much of the load onto memory that operates under 200 GB/s. You are paying the poverty tax with your time. 72B at Q5 runs at 5-6 tokens/s for me. It's slow, but at least I do not offload anything. Get yourself some other GPUs and a few PCI-e risers and increase your total VRAM. That system of yours is overkill for a server, but it's almost useless for LLMs because of memory bandwidth. Sorry.
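The bandwidth argument in this reply can be turned into a rough back-of-the-envelope estimate. The sketch below assumes a dense model whose weights are all read once per generated token, so time per token is roughly the bytes held in VRAM divided by VRAM bandwidth plus the bytes held in system RAM divided by RAM bandwidth. The 40 GB model size (roughly a 70B model at about 4 bits per weight) and the 24 GB of VRAM are assumptions added here; the 600 GB/s and 200 GB/s figures are the commenter's. The function name is hypothetical.

```python
def estimate_tokens_per_second(model_gb, vram_gb, gpu_bw_gbs, ram_bw_gbs):
    """Optimistic decode-speed ceiling for a model split across VRAM and RAM.

    Ignores compute time, PCIe transfers, and KV-cache reads, so real
    throughput will be lower than the number returned here.
    """
    on_gpu_gb = min(model_gb, vram_gb)   # portion of weights that fits in VRAM
    on_cpu_gb = model_gb - on_gpu_gb     # remainder offloaded to system RAM
    seconds_per_token = on_gpu_gb / gpu_bw_gbs + on_cpu_gb / ram_bw_gbs
    return 1.0 / seconds_per_token


if __name__ == "__main__":
    # Assumed 40 GB model on one 24 GB card, the rest offloaded to RAM.
    print(estimate_tokens_per_second(40, 24, 600, 200))
    # Same model with enough total VRAM to avoid offloading.
    print(estimate_tokens_per_second(40, 48, 600, 200))
```

Under these assumed numbers the split setup tops out at about 8 tokens/s and the all-VRAM setup at about 15 tokens/s. Both are ceilings rather than predictions, and a 131072-token context adds KV-cache traffic that this sketch leaves out, which is consistent with the much lower speeds reported above.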
@HikaruAkitsuki 1 month ago
They basically downsized 405B to 72B?
@intheworldofai 1 month ago
Yes, it's a smaller model, but it's still incredibly powerful and efficient.
@TheVisionaryX 1 month ago
@relexelumna5360 29 days ago
Knowledge limited to Dec 2023 🥴
@prashantbhardwaj6322 1 month ago
GPT 4o-MINI is $0.15/M tokens, how is this 25 times cheaper?
@intheworldofai 1 month ago
I said gpt-4o. I never mentioned gpt-4o mini
@prashantbhardwaj6322 1 month ago
@intheworldofai at 00:30 mark, maybe I heard it wrong.
@intheworldofai 1 month ago
@prashantbhardwaj6322 sorry brotha I think you heard wrong. I said “25x cheaper than gpt-4o” not gpt-4o mini. They're two different models.
@words13100 18 days ago
that's not a butterfly. 👎 fail.