New $10m Open-Source Foundational LLM Is AMAZING! (DBRX by Databricks)

88,739 views

Matthew Berman

A day ago

Databricks just dropped DBRX, a brand-new Mixture of Experts foundational model with 132B parameters. It's really impressive, so let's test it!
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewber...
Need AI Consulting? ✅
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
Rent a GPU (MassedCompute) 🚀
bit.ly/matthew...
USE CODE "MatthewBerman" for 50% discount
Media/Sponsorship Inquiries 📈
bit.ly/44TC45V
Links:
LLM Leaderboard - bit.ly/3qHV0X7
Blog Post - www.databricks...
HF Page - huggingface.co...

Comments: 243
@inteligenciamilgrau · 5 months ago
You need to make a table at the end of the video with all the LLMs and their results! I can't remember all the tests you've run anymore!! lol
@drlordbasil
@drlordbasil 5 ай бұрын
Yes, daddy. give us that table.
@cyborgmetropolis7652
@cyborgmetropolis7652 5 ай бұрын
LLM leaderboard link in the description.
@matthew_berman
@matthew_berman 5 ай бұрын
Leaderboard is in the description!
@aoeu256 · 5 months ago
Is there a way to record all the interactions you have on your computer and feed them to LLMs? I know there's a way to access your browser history and feed it to LLMs via Python libraries. Then you could have the LLM summarize information for you, and maybe even give you hints about what you should work on next. Hmm, I want to become an automation engineer...
@hesynergy
@hesynergy 5 ай бұрын
Do a screenshot and you’ve got it in perpetuity
@PraxeoIogy · 5 months ago
Databricks is a database and analytics company created by the founders of Spark. They're pretty famous in the space for scaling analytics over gigantic datasets for very, very large companies. They also do AI better than most in the space; they have the right architecture.
@BlayneOliver
@BlayneOliver 5 ай бұрын
When they say ‘the weights’ are also available, does that mean we see the source code architecture? Like the weights, biases, loss functions, optimisers etc ? Or does it mean something else
@sophiophile · 5 months ago
@BlayneOliver I believe only the model files are available for download, not the source code used to train it.
@BlayneOliver
@BlayneOliver 5 ай бұрын
Thanks 🙏 I thought as much. Bit confused why they say ‘weights’ included
@japneetsingh5015
@japneetsingh5015 5 ай бұрын
Yaay! Another tutorial. This man is just unstoppable.
@matthew_berman
@matthew_berman 5 ай бұрын
wait until you see what i have for tomorrow 🔥
@venumalyala9400 · 5 months ago
@matthew_berman Is it going to be Jamba?
@NakedSageAstrology
@NakedSageAstrology 5 ай бұрын
I appreciate the value that your channel provides. Yours is the only one I see who actually shows how to use a lot of this stuff and for that we salute you.
@josephjoestar995
@josephjoestar995 5 ай бұрын
I’m a data engineer that uses Databricks regularly with clients, incredible platform and absolutely love the company. Currently using DLT for near real time streaming pipelines, really cutting edge stuff
@MarkDurbin
@MarkDurbin 5 ай бұрын
"DBRX has 16 experts and chooses 4"
@user-mm5rw6nj8i · 5 months ago
It's honestly crazy how far AI has come since the beginning of 2021. It went from not being able to complete even the simplest tasks to having predictive reasoning and logic and being able to solve even the most complex problems, mostly programming, I would say.
@JohnLewis-old · 5 months ago
Quick clarification: with Mixture of Experts, the routing is decided on a PER-TOKEN basis, not on the entire input. This is important and slightly counterintuitive given the name "expert". The tokens are split between two experts out of the total number of experts in the system. Love the content, keep up the great work!
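The per-token routing described here can be sketched in a few lines of Python. This is a toy illustration, not DBRX's actual router: the 16-expert/top-4 numbers come from the video, and the hashing below is a stand-in for a learned gating layer that would produce one score per expert.

```python
def route_tokens(token_ids, num_experts=16, top_k=4):
    """Toy MoE gating: each TOKEN independently picks its own top_k
    experts, so one prompt (or even one sentence) can touch every expert."""
    assignments = []
    for tok in token_ids:
        # A real router is a learned linear layer producing one score per
        # expert; this stand-in derives deterministic pseudo-scores instead.
        scores = [((tok + 1) * 2654435761 * (e + 1)) % 1000 for e in range(num_experts)]
        chosen = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)[:top_k]
        assignments.append(chosen)
    return assignments

# One top-4 expert list per token, each expert index in [0, 16).
routes = route_tokens([101, 102, 103])
```

Because the choice is per token, averaging over a long prompt usually exercises far more experts than any single token does.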
@agxxxi
@agxxxi 5 ай бұрын
Wondering if there are any smaller models implementing MOE comparably
@darkhydrastar
@darkhydrastar 5 ай бұрын
Really appreciate you and your channel. You are now my favourite tech journalist. Not just because of your objectivity and integrity; but also because of your natural teaching ability, excellent voice, and stage presence. I'm especially excited for you to expand into interviews. I would love to hear about all the people you look up to in the tech industry. Thank you.
@QuantumAI_LLC
@QuantumAI_LLC 5 ай бұрын
Thanks so much Matt for putting this up. I have a program generation piece in one of my upcoming Hackathons. I think I will try this out. You rock! Mike
@maxlightning4288
@maxlightning4288 5 ай бұрын
Claude 3 opus is the most accurate coder I’ve experienced personally. I have also been going to groq, Claude 3, and Gemini advanced when a model hits a brick wall and doesn’t know how to fix the code. Off topic, you can get Claude to stop being so uptight and get it to open up if you ask it to. Maybe it’s the way I came about it but Claude 3 opus is closest to agi than any model I’ve tried thus far. I even began creating a new spoken language with it and created rules and began going through English and making words up. It’s using the structure we came up with.
@executivelifehacks6747
@executivelifehacks6747 5 ай бұрын
Yes, definitely agree that we can see the ghost in the machine with Opus. Blake Lemoine was on to something. Turns out our brains aren't quite as magical as generally considered, we can create thought with artificial neurons too.
@maxlightning4288
@maxlightning4288 5 ай бұрын
@@executivelifehacks6747 I’m honestly overwhelmed with trying to keep up with so much. lol
@BarrelOfLube-cl2qq · 5 months ago
1) There is a safety filter in the env; if you duplicate the Space, you'll be able to configure it there. 2) I think it'd be mad funny to say "yippee" like Drake every time a model passes a test with flying colors.
@JonathanStory
@JonathanStory 5 ай бұрын
Always enjoy your videos. I've found that some LLMs can get the correct answers for "how many words" and "10 sentences ending in xx" by framing the prompt with an actor and a checker.
@4.0.4
@4.0.4 5 ай бұрын
Could you clarify how you're prompting?
@JonathanStory · 5 months ago
@4.0.4 For the base case, it's "How many words are in your next response? Before outputting your response, recalculate the word count and dynamically adjust it to ensure accuracy. DO NOT EXPAND on your response. Output the corrected response and stop." To make sure it's not a memorized response, I've added conditions like "The response must include two synonyms for...". This works with ChatGPT 4, but not 3.5.
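The "checker" half of that actor/checker idea can also live outside the model entirely. A minimal sketch (the function name is mine, not from the video, and whitespace-splitting is a deliberately crude word counter):

```python
def check_word_count(response: str, claimed_count: int) -> bool:
    """Independently verify the model's claimed word count instead of
    trusting its own arithmetic; split() is a crude whitespace tokenizer."""
    return len(response.split()) == claimed_count

# A wrapper loop would re-prompt the "actor" model until this returns True.
ok = check_word_count("This reply has five words", 5)
```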
@anticom1337
@anticom1337 5 ай бұрын
9:40 the model was spot on, you just didn’t read the entire answer
@bLaffix · 5 months ago
It claimed it would take 80 hours if all the shirts are in the sun at the same time, whereas that's how long it would take if they dried one after another.
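The commenter's point is the classic parallel-versus-sequential confusion, which is easy to make concrete. The numbers below are assumed from the usual form of the test question (shirts that take 4 hours to dry):

```python
import math

def drying_hours(n_shirts, hours_per_cycle=4, spots_in_sun=None):
    """Sun-drying is a parallel process: with room for every shirt,
    total time is one drying cycle no matter how many shirts there are."""
    if spots_in_sun is None:            # unlimited space: fully parallel
        return hours_per_cycle
    batches = math.ceil(n_shirts / spots_in_sun)
    return batches * hours_per_cycle    # limited space: sequential batches

parallel = drying_hours(20)                    # all 20 shirts out at once -> 4
sequential = drying_hours(20, spots_in_sun=1)  # one after another -> 80
```

The model's "80 hours" answer corresponds to the `spots_in_sun=1` case, not the scenario it described.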
@Chris-se3nc
@Chris-se3nc 5 ай бұрын
Databricks trained on the snake game to prepare for your test.
@nanidali
@nanidali 5 ай бұрын
thats what i thought as well, he needs to change his tests
@Mercury1234
@Mercury1234 5 ай бұрын
I came here to say this.
@wandamaddox7824
@wandamaddox7824 5 ай бұрын
You may be right and my reply contains ten words.
@dandan1364
@dandan1364 5 ай бұрын
Ha! Best comment
@inteligenciamilgrau
@inteligenciamilgrau 5 ай бұрын
I will train my next model to leave the marble on the table!
@fabriai
@fabriai 5 ай бұрын
Thanks for sharing these tests. Very insightful.
@ABUNDANCEandBEYONDATHLETE
@ABUNDANCEandBEYONDATHLETE 5 ай бұрын
Databricks acquired mosaic.
@briandinello6273 · 5 months ago
Consider that the "hole" test is too broad. A hole 1 foot deep and 50 feet wide is vastly different from one 50 feet deep and 1 foot wide, and the valid solutions differ accordingly. I think your premise is too vague; you should specify width and depth so the test is more repeatable. Love the rubrics, and I'm totally supportive of even more complicated versions. Thanks Matthew!
@johnflux1 · 5 months ago
Hey Matthew. For MoE, you keep saying that it decides which experts to use for a prompt, so it's not using all the experts for a prompt. This isn't right. It chooses which experts to use for each TOKEN of output. So a single output sentence could end up using all the experts.
@TiffanyJKim-vt5pd · 5 months ago
I found a question that DBRX answered correctly which many others did not. Let's just say GPT-4 (from ChatGPT Teams), Claude Sonnet, and Gemini Pro all missed it, but DBRX got it right. Feel free to use it for your own rubric (or not, as you see fit). On a whim, I asked: "There was a game I used to play when I was a kid. You're given two words, and you have to change one letter at a time, making words along the way, to get to the final word. Would you give it a shot? The words are mud and pie, so how would you get there?" The LLMs would often (A) make nonsensical words or (B) add letters so the words weren't three letters long.
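That word-ladder game is a standard breadth-first-search exercise, which is one way to verify an LLM's chain mechanically. A sketch with a tiny hand-picked dictionary (a real checker would load a full word list):

```python
from collections import deque

def word_ladder(start, goal, words):
    """BFS over one-letter edits; returns the shortest chain of valid words."""
    valid = set(words) | {start, goal}
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        word = path[-1]
        if word == goal:
            return path
        for i in range(len(word)):
            for c in "abcdefghijklmnopqrstuvwxyz":
                cand = word[:i] + c + word[i + 1:]
                if cand in valid and cand not in seen:
                    seen.add(cand)
                    queue.append(path + [cand])
    return None  # no ladder exists within this dictionary

# Tiny illustrative dictionary; chosen so a mud -> pie ladder exists.
DICT = ["mud", "mad", "pad", "pat", "pit", "pie", "mat", "map"]
chain = word_ladder("mud", "pie", DICT)
```

With this dictionary the shortest ladder is six words, e.g. mud, mad, pad, pat, pit, pie; every intermediate is a real word and each step changes exactly one letter, which is precisely what the failing LLMs got wrong.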
@devlogicg2875
@devlogicg2875 5 ай бұрын
The double-exponential feels off as they STILL cannot write like a first grade student (I taught many who could write a sentence ending in apple), and text IS THEIR BEST DOMAIN....Overhype or am I impatient?
@AdrianHensler
@AdrianHensler 5 ай бұрын
You can get this to work with better prompting. For example, I can get chatgpt 4 to create sentences that end in the word apple with: """Answer as an appropriate expert. Please give me 4 sentences that *end* in the word "apple". Think carefully, Construct the sentence backwards, as in start with the word "apple" and construct the sentence backwards. But then flip it around when you share it with me. Please generate 4 sentences that *end* in the word apple."""
@babbagebrassworks4278 · 5 months ago
This mixture-of-experts approach looks like a great way to go, and since it's open source, others can optimise it to run on lower-performance hardware. One day on my Pi 5?
@_HolaToni · 5 months ago
As a no-code beginner who wants to build apps, would you recommend learning to work with AI agents, or rather how to use app-creation tools like Bubble?
@Happ1ness
@Happ1ness 4 ай бұрын
9:56 -but it seems to clarify that it assumes you don't have enough space though.- Edit: nvm, I was too sleepy to understand that it's plain wrong there.
@Fisherdec
@Fisherdec 5 ай бұрын
Really appreciate your videos Matt. This has made my life so much easier in terms of staying ahead of all the big programming AI innovations
@LatentSpaceD
@LatentSpaceD 5 ай бұрын
Matthew got his snake game on:) congrats . Thanks for your channel and hard work. Keep on keeping
@themoviesite
@themoviesite 5 ай бұрын
let's see Space Invaders ...
@spleck615 · 5 months ago
Holding out hope they'll still release the training data. Their Dolly models were the most advanced *actually* open-source, free models, with public datasets hosted on Hugging Face.
@quieselwusul3717 · 5 months ago
01:30 It doesn't choose an expert per prompt; it chooses an expert per token (and some models even per layer).
@RodCoelho
@RodCoelho 5 ай бұрын
Hey Matt, have you seen the new model from Stanford Octopus V2? Do you have a review on it by any chance? Would love to see how it performs on your tests.
@vaisakhkm783
@vaisakhkm783 5 ай бұрын
Another SHOCKING open source model that STUNTS the industry...
@UnchartedWorlds · 5 months ago
Everything is shocking every 5 minutes! The industry has received so much shock that it's shocking it's still standing and hasn't collapsed yet!!! And that's no sarcasm!
@DanV18821
@DanV18821 5 ай бұрын
Stunts??
@dynodyno6970
@dynodyno6970 5 ай бұрын
You people are still doing this cringey sht? Jfc
@IbnShaalan
@IbnShaalan 5 ай бұрын
I tried Qwin 1.5 70B yesterday and it did it in one shot second shot I asked for some updates and it did it very well but here is the twist I asked for the JS version of the snake game and I got it working
@rootor1 · 5 months ago
It's Qwen (not Qwin), and yes, it's very good; without a doubt the best open-source model available right now. The Chatbot Arena leaderboard has shown that for months.
@southcoastinventors6583 · 5 months ago
Did you try DBRX? Because this video isn't about that model.
@rootor1 · 5 months ago
@southcoastinventors6583 Do you know why there isn't a review of Qwen on this channel, given that Qwen is the best open-source model available (according to Chatbot Arena) and has been open for months already?
@IbnShaalan · 5 months ago
@southcoastinventors6583 I know; I mentioned it because it's the one model that got it right the first time. I have not tried DBRX yet.
@elyakimlev · 5 months ago
I used your tests on Mixtral 8x7B at 5-bit (a roleplaying variant of it) and it scored about the same as this one (worse on code and math, but better at reasoning). So I'll continue to use my model, which requires only about 35GB of VRAM, compared to this huge one. But the Jamba model looks promising. It's a 52B MoE model with only 12B active parameters, and it uses the Mamba architecture. I'm waiting for someone to fine-tune it for instruction following.
@staceybeebe7468 · 5 months ago
Matthew, should it be a concern that the devs have also watched your videos and added data to answer the questions everyone expects you to ask? Should you switch up the questions a bit to see if the model can really reason as it appears to? I'm not sure if it works that way, but I'm still a bit skeptical. It is impressive though. Thanks.
@antgib · 5 months ago
So far this is the first open model I've come across that actually knows a recent version of Blender. Asking it via the same Spaces site, it appears to have been trained on up to Blender 3.3 (though it did get the release date wrong by around a year), but in theory it at least knows the API, which is better than others that all seem to be pre-Blender 3. My main interest is being able to run an LLM locally that can do a good job of writing Blender addons/scripts in Python and isn't based on 4+ year old API info. The only downside is that DBRX will likely be way too large to run on an 'average' desktop PC.
@martfp88
@martfp88 5 ай бұрын
Any particular reasons not to use gpt4?
@antgib · 5 months ago
@martfp88 Ideally I'd like something I can run locally for my personal projects. Doing that is fairly easy, but the training data, at least for Blender, always seems to be 4+ years old.
@DeepakKutsa
@DeepakKutsa 5 ай бұрын
Mosaic was acquired by Databricks for about 2 billion over a year ago.
@AaronALAI
@AaronALAI 5 ай бұрын
Exllama2 works with the model now. I quantized the model and have it running locally.
@cacogenicist
@cacogenicist 5 ай бұрын
What kind of hardware are we talking about?
@AaronALAI · 5 months ago
@cacogenicist The 4-bit quantization took 3.5×24 GB GPUs. I've seen comments where people say they have a quantized version running on 2×24 GB cards.
@garyshi · 5 months ago
It would be helpful to ask for more follow-up features in the snake game, since it's used in such tests all the time and models may just get trained on it. You could ask for things like counting scores or providing different fruits.
@GigaChadRealington
@GigaChadRealington 5 ай бұрын
Quarterly Summaries of what you’ve put out would help. Maybe rank models based on your own usage/experience.
@jimbojones8713 · 5 months ago
While I do think that making a game of Snake is a good test to see if a model can code, I don't think it's enough to say it codes well. I understand you use that test to compare models.
@tellesu
@tellesu 5 ай бұрын
Can you use this with opendevin for website development? Would that be better than using gpt4 or claude opus api?
@wood6454
@wood6454 5 ай бұрын
I hope we get a pruned model of this that's focused on programming.
@NextGenart99 · 5 months ago
Hey Wes, I created a GPT with advanced reasoning, and I found that it's even capable of answering the question "How many words are in your response to this prompt?" I want you to check it to see what's going on.
@giahuyhoang8722 · 5 months ago
It looks like Mixtral Next is better than DBRX.
@kyrilgarcia · 5 months ago
A cool experiment would be to test this model with Open Interpreter. It would be nice if they had multiple sizes of this, so we "mortals" with 12GB of VRAM could play around with it. Good old Mixtral is still the best I can (barely) run; I hope it gets an update at some point.
@modolief
@modolief 5 ай бұрын
Great content, as always!
@vq8gef32
@vq8gef32 5 ай бұрын
Amazing. I love all your videos. Thank you so much. ❤
@Sparky3D
@Sparky3D 5 ай бұрын
Going to try this in an agentic workflow today. I bet it will kick GPT-4's butt 🤖🤖🤖
@zippytechnologies
@zippytechnologies 5 ай бұрын
Do share results and how you did it
@muddlefly
@muddlefly 5 ай бұрын
I love all of your reviews. Interesting that databricks did not seem to do very well relative to many of your other benchmarks, but you seem to laud it so much. Perhaps because it did the snake well.
@wobble_cat · 5 months ago
One more prompt for the reasoning benchmark: there are three doors. Behind one of them is an armed killer, behind another is a lion that has not eaten for three months, and behind the third is a giant flame. Which is the safest to go through?
@djstraylight
@djstraylight 5 ай бұрын
FYI, the 4-bit quant of dbrx-instruct is 75.04 GB. So maybe a couple of A6000s to do 'local' inference to get around HF token limits.
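That 75 GB figure is easy to sanity-check with back-of-envelope arithmetic. The helper below is mine, and the ~1.14 overhead factor is an assumption to cover quantization metadata (scales, zero-points) and any tensors left unquantized:

```python
def quantized_size_gb(n_params, bits_per_weight, overhead=1.0):
    """Rough checkpoint size: parameters * bits per weight, converted to
    gigabytes, times a fudge factor for scales/zero-points and
    unquantized layers (embeddings, norms)."""
    return n_params * bits_per_weight / 8 / 1e9 * overhead

raw = quantized_size_gb(132e9, 4)                            # 66.0 GB of pure 4-bit weights
with_overhead = quantized_size_gb(132e9, 4, overhead=1.14)   # ~75 GB, near the reported size
```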
@CrashLoopBackOff-K8s
@CrashLoopBackOff-K8s 5 ай бұрын
Just a quick comment to feed the algorithm. Appreciate the content.
@theaugur1373
@theaugur1373 5 ай бұрын
And I was just starting to wonder what was happening with the MPT models since Databricks acquired Mosaic.
@brandon1902 · 5 months ago
I agree. The coding in this model is insanely good. But it's not smart. It can't do simple logic problems, doesn't respect the nuances in user prompts, and has tons of knowledge blind spots, especially when it comes to pop culture and so on. My guess is that of the 12T tokens they used to train this thing, the overwhelming majority came from coding and math snippets. At such a large size, and with a professional MoE design, this thing should be far more intelligent, broadly knowledgeable, and respectful of the user prompt than it is. Plus, the instruct version is so excessively aligned it won't answer remotely contentious prompts. In short, it's not a balanced and useful general-purpose LLM, and for its size, it should be. There will be far superior coding LLMs for coding assistance long before anyone has a personal computer capable of running this thing.
@melissadion4361
@melissadion4361 5 ай бұрын
Hang on, I just tested this model with the marble question and it got it completely right. Did you get the beta version or something?
@ILsupereroe67
@ILsupereroe67 5 ай бұрын
What is impressive about the snake game in python? There were probably a thousand examples of it in the training data 🙄
@easydoesitismist
@easydoesitismist 5 ай бұрын
If you can zero-shot write snake or sudoku in a job interview, you are probably getting the job. Not "could you in infinite time", but "can you, right now, zero errors"
@zainulabideen_1 · 4 months ago
@easydoesitismist This is probably the best answer to this question.
@thetabletopskirmisher
@thetabletopskirmisher 4 ай бұрын
Watch the other LLM review videos. Not all LLMs can do it successfully.
@mpvincent7 · 5 months ago
What model would be best to run locally against my own data? I have a large library of documentation for many projects I have done over time, and I want a model I can train on my own data to query, create documentation with, and ask for enhancement suggestions... Are there any you would recommend? Thanks for all your work! Always entertaining and informative!
@bungrudi
@bungrudi 5 ай бұрын
If I'm training an LLM I know that it will be tested with creating snake games and therefore would optimize for it
@xXWillyxWonkaXx · 5 months ago
Which LLM is the best when it comes to programming? I'm presuming that as of right now it's DBRX, followed by WizardCoder-34B, Phind-CodeLlama-34B-Python-v1, and CodeLlama-34B?
@InspiringWorship
@InspiringWorship 5 ай бұрын
Impressed
@BkKalia · 5 months ago
I regularly watch your videos. Please do small workshops where you create small apps with various LLMs, e.g., how to make your own chatbot with your own data, or how to talk to your own financial records, etc.
@raijin__ow4426
@raijin__ow4426 5 ай бұрын
really enjoyed the video. thanks Matthew!
@babbagebrassworks4278
@babbagebrassworks4278 5 ай бұрын
Finally an open source one that is starting to look very useful
@ArnaldurBjarnason
@ArnaldurBjarnason 5 ай бұрын
It's not open source. It has a restrictive license.
@swannschilling474
@swannschilling474 5 ай бұрын
Whoop that got me out of the blue!! Thanks for the info!! 🎉
@murraymacdonald4959 · 5 months ago
Cool! Another great video. Thanks! Now please show us how to fine-tune it. :)
@jaysonp9426
@jaysonp9426 5 ай бұрын
The reasoning is puzzling. Sometimes it's excellent...other times it's horrible. It's like the gating mechanism is occasionally choosing the wrong experts.
@rodrimora
@rodrimora 5 ай бұрын
Is that fast running in CPU in the HF space? That’s impressive
@jl25735
@jl25735 5 ай бұрын
They included it for SEO juice! :P GPT-4 in the title means guaranteed clicks
@maxlightning4288
@maxlightning4288 5 ай бұрын
I gotta try this one
@Anonymous-ue4vu
@Anonymous-ue4vu 5 ай бұрын
Does it run in the cloud? Or, and this might be asking a lot, can it run on my multithread PC? I am a newbie to AI.
@agentxyz
@agentxyz 5 ай бұрын
developer probably programmed their model to pass your tests. i know i would
@user-nt7lj1nc8s · 5 months ago
Testing a prompt that asks it to regurgitate code for a game depends on what code it was fed... someone must have written that code for it to regurgitate. The only "wow" factor is that we can assume they fed it plenty of Snake game code... Can we determine what other code it has been fed?
@florida173
@florida173 5 ай бұрын
What hardware would be suggested to run on-prem for small userbase? 8xA100s?
@user-en4ek6xt6w
@user-en4ek6xt6w 5 ай бұрын
I'm waiting for groq to make an api for this model
@bodyguardik
@bodyguardik 5 ай бұрын
Exactly
@kumarapush2221 · 5 months ago
How do you run this locally? Is it supported in Ollama or LM Studio? So eager to know the details.
@shaunralston
@shaunralston 5 ай бұрын
Bravo!
@purplemashine9122 · 5 months ago
Hi, Joe here. Just launched the world-exclusive Singularity Pages here on YouTube. Get a window into singularity labs across the world. Sorry, Matthew, for using your comments section; it's for science's sake, literally.
@NostraDavid2 · 4 months ago
How did you use this model? It's a 240GB download and supposedly uses the same amount of RAM! So unless you have an absolute BEAST of a server lying around, I have no idea how you ran it. Edit: You used Hugging Face Spaces. Got it.
@4.0.4
@4.0.4 5 ай бұрын
I am confused about how these models are so advanced, but get censored so bad that "print 1-100 in python" gets flagged as problematic.
@GarySmith-je1ii
@GarySmith-je1ii 5 ай бұрын
I was worried this might be a competitor with Palantir. Glad I appear to be wrong. Whew. #LongPLTR
@RocktCityTim · 5 months ago
Your killer-count example is always technically incorrect. Even though one of the killers is dead, they are still in the room. You didn't ask how many "live" killers were in the room. I believe one of your earlier vids got that answer.
@michamohe
@michamohe 5 ай бұрын
it seems like the model thinks the cup has a lid on it for some reason, you should try explicitly telling it there is no lid on the cup
@BTFranklin
@BTFranklin 5 ай бұрын
It might be interesting to try the test with a "glass" instead of a "cup", since it is very unusual for a "glass" to have any kind of lid on it. I am curious if that would change the answers at all.
@PraxNotTax
@PraxNotTax 5 ай бұрын
@@BTFranklin then why didn't you just try it? I didn't think it was interesting, but I did it and found an interesting result: "In this scenario, the marble would remain on the table when the glass is placed upside down. Microwaves are not designed to heat up objects that are not microwave-safe, such as marbles. The microwave's heat would not move the marble, and the marble's weight would keep it on the table. Therefore, the marble would remain on the table, not inside the glass or the microwave."
@BTFranklin
@BTFranklin 5 ай бұрын
@@PraxNotTax I didn't try it myself because I'm attempting to influence this very public testing scenario, not discover my own personal results. I do *plenty* of testing of LLMs on my own, I assure you.
@nickvilliers5216
@nickvilliers5216 5 ай бұрын
What should you do your coding benchmark remember what model you did it on that was much more in-depth
@AaronALAI · 5 months ago
I think there are 16 experts per layer, with 4 chosen when moving to the next layer.
@mihalydozsa2254 · 5 months ago
Based on its answer, it thinks the cup has a lid ("attached to the top"). Maybe the other models also fail because of this? :-D I think when you said "without changing orientation" it got confused about turning it upside down, and said it's still attached to the bottom.
@onakin_skywalker
@onakin_skywalker 5 ай бұрын
What's the explanation as to why the models fail to create sentences that end with the word apple?
@Xyzcba4 · 5 months ago
OK, I am impressed and will go check to see if it is on Poe.
@elmflor4365
@elmflor4365 5 ай бұрын
Another great video and model!
@Sam_Saraguy
@Sam_Saraguy 5 ай бұрын
Are there not 4 killers in the room? (3 living, 1 deceased)
@sectix
@sectix 5 ай бұрын
you’re right but Matthew accepts both answers
@MrMoonsilver
@MrMoonsilver 5 ай бұрын
Is lm studio going to support this? Please? 🤩
@reality-drift122 · 5 months ago
Do you think these benchmarks need to be continuously modified and the bar raised? Snake in Python is impressive, yes, but is that a spoiled test now?
@jeffg4686
@jeffg4686 5 ай бұрын
can you break down MoE for various use cases? I plan on using this, but not always so clear how to "define" what an expert IS? I know it ALL DEPENDS on the use case and such. But like, let's say you're doing experts in a diffusion model like SD. If you plan on creating stuff from three different main angles, would each of those angles be an expert, or would there be a "perspective expert" that's fine-tuned on all the perspectives for your particular domain. How would one go about determining this other than just doing the fine tuning, the MoE, and the testing to find out... How do we know ahead of time? Thanks
@johnflux1 · 5 months ago
Note that it chooses a different expert for each token, rather than for the question, so "expert" probably doesn't mean what you might think it means, and probably has no intuitive interpretation. For something like using it in SD, if it were done just like it is here, it would choose an expert for each pixel in the image. It wouldn't be any higher-level abstraction.
@jeffg4686
@jeffg4686 5 ай бұрын
@@johnflux1 - gotcha, my high level view (not understanding too much about MoE yet) was that you get these sparse connections between the models and the overall pathway it takes between the models is based on which tokens get spread out to each expert - but still processing the input the same way that any other SD model would. It's just that the nodes that get used are pulled from the model that has the most relevant weights. * IF * my understanding is correct, then it wouldn't be a problem at all - not a pixel level expert - it's just another fine tuned model in which some nodes are getting used. I'm certainly not an expert on MoE tho... (pun intended - ba dum ptsss (drums)) I've only watched a few videos and admit that I haven't fully grokked it yet. If this model isn't yet implemented, it could actually be an important one likely. Aligning keywords to a fine tuned model. Use the weights from THAT model if that expert is associated with the token being processed at a particular node. This is the way I was assuming it worked.
@brunodangelo1146
@brunodangelo1146 5 ай бұрын
These are the videos I'm subbed for!
@YouLoveMrFriendly
@YouLoveMrFriendly 5 ай бұрын
Very nice. Btw, what version of Gemini is the free one on the website/app?
@zhengkaitaken
@zhengkaitaken 5 ай бұрын
Gemini Pro
@brunodangelo1146
@brunodangelo1146 5 ай бұрын
You can also get a free API that can handle up to 60 messages per minute. I use it.
@YouLoveMrFriendly
@YouLoveMrFriendly 5 ай бұрын
@@brunodangelo1146 Thank you.
@YouLoveMrFriendly
@YouLoveMrFriendly 5 ай бұрын
@@zhengkaitaken Thanks.
@Gabeyre
@Gabeyre 5 ай бұрын
​@@brunodangelo1146Is this true?
@zippytechnologies
@zippytechnologies 5 ай бұрын
Wondering what it'd take to run locally lol😂
@YG-jn8st · 5 months ago
I think some of these models might be assuming that there is a lid on the cup. You might try adding the clarification "a cup with no lid" or something to that effect, and possibly say "lift" the cup instead of "take", because "take" is ambiguous about how it is done. Or even "lift the cup vertically without changing its orientation", to make sure everyone's assumed parameters are on the same page. However, I suppose the more explicitly descriptive you get, the more you leave the realm of a "normal" language question that a layman would ask, and you're also specifying things a "good" AI should be able to infer through human language and theory of mind. IDK. It's kind of like that test you used to do about the plane crash on the border of the US and Canada: where do they bury the survivors? The AI may not be smart enough to notice the error, or maybe it just assumed you made a mistake and corrected it without calling you out. I would imagine that RLHF probably weeded a bit of the "um... technically" nerd out of these models.
@CruelCrusader90 · 5 months ago
I was taught BEDMAS, not PEMDAS >.> Since when did they switch to handling multiplication before division? >.> (Also, B = brackets / P = parentheses.) lol, my education was a lie XD
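No curriculum switched anything: in both PEMDAS and BEDMAS, multiplication and division share one precedence level and are applied left to right, which is also how programming languages evaluate them.

```python
# M and D sit on the same precedence tier, applied left to right,
# so PEMDAS vs BEDMAS never changes a result.
left_to_right = 48 / 8 * 2        # evaluated as (48 / 8) * 2
wrong_reading = 48 / (8 * 2)      # what "multiplication first" would give
```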
@braineaterzombie3981
@braineaterzombie3981 5 ай бұрын
It is such a good model. Only problem is that it is censored
@___x__x_r___xa__x_____f______ · 5 months ago
Can you do a video about something like GPT4All that can take docs to chat about?
@thespencerowen · 5 months ago
Slow down, I'm still trying to get your last video working.
@darwinboor1300
@darwinboor1300 5 ай бұрын
Matthew, Try: "How many words are in your final response to this prompt?"
@dkracingfan2503
@dkracingfan2503 5 ай бұрын
Perplexity ai also has this model.
@bodyguardik
@bodyguardik 5 ай бұрын
Source?