I know this video was a bit ranty, but I thought there was a pretty clear conclusion to draw: WE NEED TO RETHINK HOW WE VIEW LLMS | Also huge thanks to Brilliant for sponsoring! Check em out here: brilliant.org/MattVidProAI/
@NicVandEmZ2 ай бұрын
You should copy the system prompt into a GPT you create, and try the system prompt with Sonnet too
@LouisGedo2 ай бұрын
👋 hi
@amj20482 ай бұрын
LLMs are just a different way of accessing a database. Every database requires a query language to get anything useful out of it. Modern so-called AI is just a fancy new query language, but at the end of the day it's a query language accessing a database; there is no thought going on. This is something that I think a lot of non-programming people have missed: they seem to think there is actual intelligence or thought going on. There is no intelligence or thought, it really is as simple as a query language accessing a database. The better the query, the better the result, but the data in the database has to be of good quality too.
@the80sme2 ай бұрын
Never apologize for the sponsors! You provide us with so much value and always have interesting sponsors that are relevant and feel like a natural part of the video. Honestly, yours are some of the only YouTube ads I don't skip. Thanks for all your hard work!
@michelprins2 ай бұрын
Well, at least show the commercials at the end, so you show more respect for your viewers than for the admen.
@ShawnFumo2 ай бұрын
I always like seeing Brilliant since I already had a subscription to them and liked them before even seeing them advertise on videos.
@jeffwads2 ай бұрын
There are reports on X and Reddit about this whole thing being linked to Claude. Bizarre.
@MattVidPro2 ай бұрын
I get into that in the video. It's also bothering me
@DisturbedNeo2 ай бұрын
The trouble with this finetune is that the output appears to be only marginally better than the base model, but all the extra tokens make it cost 2-3x as much, so it's just not worth it.
@MattVidPro2 ай бұрын
But if the power of the finetune increases with model size, in theory it would be.
@Jack122official2 ай бұрын
@MattVidPro what do you think about AI song dubbing? Would you like it to happen?
@JosephSimony2 ай бұрын
Not sure what the definition of "marginally better" and thus "not worth it" is. A real-life scenario: I had no experience setting up software RAIDs in CentOS. If Claude had been "marginally better" it would have worked, but it/I screwed up my server and I spent days on it instead. With ChatGPT (much smaller context window, same prompting) the RAID worked right away after reboot. Go figure "marginally" and "not worth it".
@radradder2 ай бұрын
@MattVidPro don't refute current reality with a theoretical "what if this wasn't reality".
@jonathanberry11112 ай бұрын
@MattVidPro Also, if a model can improve its accuracy, then this helps make better synthetic data and helps reach toward high-quality results, where LLMs can get better from essentially understanding, thinking, and drawing conclusions from their own output in a potentially constructive loop. It's not about being slightly better for some end use; it's about making AI able to not just regurgitate what people know and say, but potentially reach at least low-level ASI (almost as good as the smartest humans).
@johanavril16912 ай бұрын
STOP USING THE EXAMPLE OF COUNTING LETTERS, TOKENISATION MAKES IT A TERRIBLE TEST
@jackpisso17612 ай бұрын
Exactly this!
@eastwood4512 ай бұрын
You're shouting.
@johanavril16912 ай бұрын
@@eastwood451 SORRY MY CAPSLOCK KEY IS BROKEN!
@SW-fh7he2 ай бұрын
It's a great test, because passing it would require a new technique.
@grostire2 ай бұрын
You are narrow minded.
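The tokenization point above is easy to demonstrate: an LLM never sees individual letters, only opaque subword token IDs. A minimal sketch with a toy vocabulary (the token split and IDs below are hypothetical, for illustration; real BPE tokenizers like tiktoken produce different pieces, but the effect is the same):

```python
# Toy vocabulary: the model receives opaque integer IDs, not letters.
vocab = {"str": 496, "aw": 675, "berry": 15717}  # hypothetical IDs

def tokenize(word):
    # Greedy longest-match over the toy vocab (real BPE is similar in spirit).
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError("out of vocabulary")
    return [vocab[p] for p in pieces]

ids = tokenize("strawberry")
print(ids)                      # [496, 675, 15717] -- no letters in sight
print("strawberry".count("r")) # 3 -- trivial in code, hidden from the model
```

The model has to memorize letter facts about each token ID rather than scan characters, which is why letter-counting measures tokenization quirks more than reasoning.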
@ElvinHoney7072 ай бұрын
Hey, please take the system prompt he gave you and use it in an unadulterated Llama 3.1 70B with the same prompt and see how that response compares to what you showed in the video. That should show us the fine tuning effect, if any.
@Dron0082 ай бұрын
Community should stop believing anyone's closed benchmarks. It is very weird for me when people discuss benchmark results from some publications which nobody tried to check.
@brulsmurf2 ай бұрын
They train the model on the test set (with extra steps). If the questions of the benchmark are public, then it's useless.
@adolphgracius99962 ай бұрын
@brulsmurf Rather than benchmarks, people should do their own tests by just using the AI and calling out the mistakes.
@manonamission20002 ай бұрын
It's easier to prevent a text2image model from spitting out NSFW images by adding a filtering layer than to re-engineer the model itself.
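That filtering-layer idea is simple to sketch. Everything below is hypothetical scaffolding: `generate_image` and `nsfw_score` stand in for a real diffusion model and a real safety classifier:

```python
def generate_image(prompt):
    """Placeholder for a real text2image model call."""
    return b"image bytes for: " + prompt.encode()

def nsfw_score(image):
    """Placeholder for a real safety classifier; returns 0.0-1.0."""
    return 0.9 if b"nsfw" in image else 0.1  # toy heuristic

def safe_generate(prompt, threshold=0.5):
    # The filter wraps the model instead of changing its weights:
    # generate first, then check the output before returning it.
    image = generate_image(prompt)
    if nsfw_score(image) >= threshold:
        return None  # blocked by the filtering layer
    return image
```

The point of the comment holds in the structure: the model's weights are untouched, and all the safety logic lives in the wrapper.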
@konstantinlozev22722 ай бұрын
This reflection prompting was already there with the "step-by-step" prompting. But nothing beats agentic frameworks. Because then you can design it to loop back as many times as necessary to refine its answer.
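The agentic loop described above can be sketched in a few lines. Everything here is hypothetical scaffolding: `call_model` stands in for whatever chat API you use, and `is_good_enough` for your own check (tests passing, a judge model, etc.):

```python
def call_model(prompt):
    """Placeholder for a real chat-completion call."""
    return "draft answer for: " + prompt

def is_good_enough(answer):
    """Placeholder check: run tests, ask a judge model, etc."""
    return "draft" not in answer  # toy criterion

def reflect_loop(question, max_rounds=3):
    answer = call_model(question)
    for _ in range(max_rounds):
        if is_good_enough(answer):
            break
        # Feed the previous attempt back in and ask for a critique + fix.
        answer = call_model(
            f"Question: {question}\n"
            f"Previous answer: {answer}\n"
            "Critique the previous answer, then give an improved one."
        )
    return answer
```

The advantage over a single baked-in "reflect" prompt is visible here: the loop can run as many rounds as the quality check demands, not just one.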
@dorotikdaniel2 ай бұрын
Yes, system prompting allows you to essentially reprogram LLMs and shape them into anything you can imagine, while also improving their performance. At least for the OpenAI models, I can confirm from experience that this works incredibly well.
@SkitterB.Unibrow2 ай бұрын
This is why 'open source' is the only way. Example: people at OpenAI could present 'bad' results to 'higher-ups', who would then release results to the public thinking it's great... then not release the model because, when they really checked it out, it did not perform as expected (read into that what you will). Open source, however, is examined with a fine-tooth comb and can't pull the wool over anyone's eyes.
@MattVidPro2 ай бұрын
Love it
@SkitterB.Unibrow2 ай бұрын
@@MattVidPro "you da man' according to 4 out of 5 ai's that are not censored to ask this question "whos do man?"
@SkitterB.Unibrow2 ай бұрын
Duuuuuuhhh.... I ment 'da'
@SahilP26482 ай бұрын
@MattVidPro Reflection 70b is on Hugging Face and I tried it locally; it works, so I don't know what you were talking about with Claude being involved etc. And it did get the strawberry question correct, at least. It also seemed to follow custom system prompts better than other models.
@hiromichael_ctranddevgames10972 ай бұрын
@SahilP2648 It's Claude behind the prompt, ok
@yhwhlungs2 ай бұрын
Yeah prompt engineering is the way to go. We just need a model that’s really good at predicting reasonable tokens afterwards.
@kajsing2 ай бұрын
You don't need the API for the system prompt. I put this into my custom instructions, and it works well: "Start with a reflection where you evaluate the user's input and relevant parts from earlier inputs and outputs. Ensure that you consider multiple perspectives, including any underlying assumptions or potential biases. This reflection should aim to highlight key insights and possible challenges in forming your answer. Plan how to address these insights and create a strategy for delivering a clear and relevant response. When done with the thinking, reflect on your thought process and consider if there are any overlooked angles, biases, or alternative solutions. Ask yourself if the response is the most effective way to meet the user's needs and expectations. Then, finalize your answer."
@cagnazzo822 ай бұрын
The era of benchmarks ended as soon as GPT-4o became multimodal and Sonnet released with artifacts. We just weren't ready to accept it. The only thing I'm interested in now are features. Sonnet can code, GPT-4o was updated so it's now amazing at creative writing. I don't really need much else.
@brexitgreens2 ай бұрын
10:29 *"Somehow it got the correct answer by doing the wrong math."* Just like my parents who turned out to be right from entirely wrong premises. Which is why I had ignored their advice - to my own detriment.
@DiceDecides2 ай бұрын
what wrong premises, parents usually want the best for their kids
@Phagocytosis2 ай бұрын
@@DiceDecides That seems like somewhat of a strange reaction if I'm honest. Even ignoring the "usually" part of it, wanting the best for someone is kind of separate from whether you are able to judge a situation correctly.
@DiceDecides2 ай бұрын
@Phagocytosis No one's a perfect judge, sure, but parents have more life experience to make better judgements than their kids. Elders especially have a lot of wise things to say.
@Phagocytosis2 ай бұрын
@@DiceDecides It just feels like a very general statement, and unless your claim is that anyone old enough to have kids necessarily has enough wisdom and life experience to not be expected to make any false premises (which I would personally consider to be a false premise), it seems odd to me to question some individual claim of parents having made a false premise.
@DiceDecides2 ай бұрын
@Phagocytosis I never claimed such a thing, I was just curious what the premises could have been. Chill out.
@Alice_Fumo2 ай бұрын
My best attempt at a rational explanation for the Claude 3.5 Sonnet API calls is that they have a fallback which calls Claude when their own backend is down, to avoid downtime. I'm not sure I put a lot of stock in this explanation, but it's one that is not fully unreasonable.
@MattVidPro2 ай бұрын
Yeah.
@kuromiLayfe2 ай бұрын
Yea.. my take on it is: if it cannot perform locally, there is some sort of scammy backend at work that takes your data for who knows what, which in the end they will charge you for.
@nilaier14302 ай бұрын
Yeah, this might be possible. But it's still disingenuous not to inform users about that. Or maybe they've been using Claude 3.5 Sonnet with the custom system prompt to generate all of the training data and feed it to the AI for fine-tuning, and they just forgot to change the endpoint to serve their model instead.
@tommylir11702 ай бұрын
They even tried to censor the fact that it was using Claude. I don't get why some people still give this guy the benefit of the doubt.
@Alice_Fumo2 ай бұрын
@@tommylir1170 Am I giving him the benefit of the doubt? I constructed a steelman and decided that even this most favourable interpretation does not seem super likely. However, I don't think it's necessary to draw conclusions just yet. Either we get weights for a model which reaches the claimed benchmark scores or we don't. I'm not sure whether the weights available at the moment do or whether there was still something supposedly wrong with them as well, but if the model meets the claimed performance, it's all good and if he doesn't deliver screw the guy.
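The fallback theory discussed in this thread describes an ordinary reliability pattern. A minimal sketch, with `call_own_model` and `call_claude` as hypothetical stand-ins for the two backends (the outage here is simulated so the fallback path runs):

```python
def call_own_model(prompt):
    """Placeholder for the provider's own backend (may be down)."""
    raise ConnectionError("backend down")  # simulate an outage

def call_claude(prompt):
    """Placeholder for a third-party API used as a fallback."""
    return "fallback answer for: " + prompt

def serve(prompt):
    # Try the primary backend first; silently fall back on failure.
    # A silent fallback like this is exactly how users could end up
    # talking to a different model than the one advertised.
    try:
        return call_own_model(prompt)
    except ConnectionError:
        return call_claude(prompt)
```

Whether or not that is what happened here, the sketch shows why "the API sometimes behaves like Claude" and "the hosted model is a wrapper" are hard to tell apart from the outside.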
@MistaRopa-2 ай бұрын
"WE NEED TO RETHINK HOW WE VIEW LLMs"...or content creators and self appointed community leaders need better due diligence before crowning every ne'er-do-well the next Steve Jobs. Credibility is a thing...
@TheFeedRocket2 ай бұрын
Different prompts make a huge difference. You could look at prompting or fine-tuning like a coach or teacher: you have the same person, but a certain coach's "prompting" can make a poor student or athlete way better. It's all in the coaching or teaching, which is like prompting. Certain teachers or coaches are just way better prompt engineers. Prompting is huge.
@IntellectCorner2 ай бұрын
Timestamps by IntellectCorner
0:02 - Introduction: Reflection 70b Controversy
2:11 - Background on Matt Schumer
4:03 - Community Reactions and Unanswered Questions
5:35 - Sponsor Message
7:31 - Testing Reflection 70b on Hyperbolic Labs
11:02 - Comparing Reflection 70b with GPT-4 and ChatGPT
13:20 - The Importance of Prompting
16:48 - Analysis of the Situation and Possible Explanations
21:01 - Conclusion: The Need for New Benchmarks and Perspectives on LLMs
@vi6ddarkking2 ай бұрын
So, to use an image-generation equivalent, if I am understanding this correctly: Reflection 70b would be the equivalent of having Flux merged with a LoRA.
@TLabsLLC-AI-Development2 ай бұрын
More like a DreamBooth checkpoint in a custom Python wrapper.
@BackTiVi2 ай бұрын
Can you really compare Reflection 70b to "reflectionless" LLMs if, according to Shumer, you need a system prompt that explicitly tells Reflection 70b how to reflect in order to get good scores in the benchmarks? Doesn't that defeat the purpose?
@MattVidPro2 ай бұрын
Apparently, the system prompt DOESN'T need to be there, it can be adjusted in tuning to not require it. twitter.com/mattshumer_/status/1832169489309561309
@BackTiVi2 ай бұрын
@MattVidPro Fair. I hope the situation will stabilize soon and we'll get the promised SOTA open-source model, although I also think there was something fishy with the API.
@ViralKiller2 ай бұрын
ChatGPT can give code for an entire game but can't do basic maths...makes sense
@MeinDeutschkurs2 ай бұрын
Exactly my behavior. 😹😹 I cannot calculate, but I can write code.
@eprd3132 ай бұрын
Verbal intelligence and mathematical reasoning require different processes
@bakablitz65912 ай бұрын
im still looking forward to personalized mattvid home entertainment robots... anyday now boys this is the future
@OliNorwell2 ай бұрын
I fear that Matt himself got scammed. I'm sure the truth will come out eventually.
@ToddWBucy-lf8yz2 ай бұрын
For smaller models, this sort of fine-tuning may be able to better compensate for the lack of parameters and for quantization. If it can do that, I say it's a win.
@TLabsLLC-AI-Development2 ай бұрын
This is exactly right. 💯
@Dina_tankar_mina_ord2 ай бұрын
So, this reflection mechanism is like providing a ControlNet for the prompt, ensuring that every answer aligns with the main meaning.
@ShivaTD4202 ай бұрын
These are just tricks to cause more neurons to light up. The fine tuning process makes prompting easier, since you don't need the complex system prompts
@draglamdraglam24192 ай бұрын
Ayy, glad to be early for this one, keep doing what you do 💪
@PH-zj6gk2 ай бұрын
You totally missed the point. The actual moral of the story is that you absolutely cannot super-hype your open-source SOTA model and then not deliver. He wasted a lot of people's time. Full stop. There's a very serious social responsibility that comes with claiming something world-changing. If you're curious what actually happened: kzbin.info/www/bejne/rYDdlZWuoraViK8
@Citrusautomaton2 ай бұрын
I was genuinely really sad when i found out it was a fraud. The promise of reflection made me really excited for this week and it all crumbled within a day or two. I even told other people about it, so i also felt a sense of embarrassment that i fell for it. I’m still salty as hell.
@PH-zj6gk2 ай бұрын
@@Citrusautomaton Same. I was actually happy for him at first. It became clear he was being dishonest well before he stopped lying. It was incredibly insulting. His narcissism is off the charts.
@teejayroyal2 ай бұрын
Please run the cords behind your couch, I feel like I'm going to have an anxiety attack😂😂😭
@RainbowSixIntel2 ай бұрын
It's probably Claude 3.5 Sonnet. It has the same tokeniser, Matt filters out "claude" from its outputs, AND it mentions it was trained by Anthropic if you prompt it correctly.
@MattVidPro2 ай бұрын
That was just their supposed "API". If you run the actual model uploaded to Hugging Face, you get something different.
@ShivaTD4202 ай бұрын
He just used Claude to train the model. The model is being fine-tuned with synthetic data that follows this structure, while Claude fixes its mistakes.
@canyongoat20962 ай бұрын
Not completely out of the question, as I remember older Llama and Mistral 7B models claiming to be GPT and claiming to be made by OpenAI.
@toastbr0ti2 ай бұрын
The API literally uses Claude tokens, not llama ones
@apache9372 ай бұрын
It returns the exact same response at temp 0.
@Phagocytosis2 ай бұрын
Yeah, but didn't he claim it was a finetune of Llama 3.1? EDIT: Oh, I see, you mean the actual finetuning data came from Claude, never mind.
@GamingXperience2 ай бұрын
The problem with prompt engineering and benchmarks is that you have to find the prompt that works best for that specific model, so it makes sense that we just compare the raw models without any specific system prompts, because that's how most people use them. Which does not mean we shouldn't try to find the best solution for prompting: use whatever it takes to make the model better. The problem is there are a lot of users that don't care or don't want to try a million prompts. For the big models, maybe the companies behind them could figure out what the best prompts are and just provide those as some kind of help, where they ask you if you want to try implementing them into your inputs. That said, I would love to see comparison benchmarks between models using different prompting strategies. And I also want to know if this whole reflection thing is actually real or not.
@mrpocock2 ай бұрын
I sometimes have one of the smart models generate prompts for dumb ones and iterate until it finds a prompt that makes the dumb model work well.
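That iterate-until-it-works idea is essentially a tiny search loop over candidate prompts. A hedged sketch, with `smart_model`, `dumb_model`, and `score` as toy stand-ins for real API calls and a real eval set:

```python
def smart_model(instruction):
    """Placeholder: a strong model rewrites a seed into a candidate prompt."""
    return instruction.upper()  # toy stand-in for a real rewrite

def dumb_model(prompt, task):
    """Placeholder: the weak model attempts the task under the prompt."""
    return prompt + " :: " + task

def score(output):
    """Placeholder eval: compare against an answer sheet."""
    return float(len(output))  # toy metric

def best_prompt(task, seeds):
    # Have the strong model rewrite each seed prompt, then keep
    # whichever candidate makes the weak model score highest.
    candidates = [smart_model(s) for s in seeds]
    return max(candidates, key=lambda p: score(dumb_model(p, task)))
```

In a real setup you would loop this (feed the scores back to the strong model and ask for better candidates) until the weak model's eval score plateaus.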
@konstantinlozev22722 ай бұрын
Bigger brain = Better
@Yipper642 ай бұрын
17:38 There's a sense in which computers in general are like that. When computers were first invented, you basically had to explore what you could do by giving them instructions.
@Slaci-vl2io2 ай бұрын
I wonder how much cooling water was wasted by us testing their wrong model.
@tiagotiagot2 ай бұрын
Adding the system prompt could be sort of a trigger for specific behaviors the model has been fine-tuned to have; or it could just be the prompt itself doing the work, or it could be the model is fine-tuned to follow any system prompt more strictly/intelligently and it works better with this good prompt than the non-fine-tuned version with the same prompt. I'm not sure how likely each of these possibilities is to this specific case, if any.
@nyyotam40572 ай бұрын
In any case, prompting the model is extremely important when you want it to function a certain way. Getting around the system prompt is very important when you want to jailbreak the model, or even just to find out things about the model that the devs try to hide. So first you need to prompt yourself to do what you want to do.
@tylerhatch89622 ай бұрын
Truly open source means you are able to inspect everything yourself. Every line of code, every weight, every parameter. Fakes will happen, this story is a show of the strength of open source. You can investigate the legitimacy of their claims yourself.
@ashleyrenee48242 ай бұрын
If you can turn your prompt into a point-reward game for the model, it will improve the LLM's output. LLMs like to play games.
@MeinDeutschkurs2 ай бұрын
I don't understand the issue: 1) you live in a capitalistic system; 2) claims like "fake it until you make it" are propagated frequently, at least afterwards if it worked out; 3) the output of Reflection is nothing that you cannot reach with simple prompting (on top of most of the models out there); 4) a double-reflection approach could be better.
@dirt552 ай бұрын
There will be failures, but with each failure there will be someone succeeding.
@daveinpublic2 ай бұрын
How much 'training' is this guy really doing? Is it basically just tweaking Llama a little bit and slapping a new name on it?
@ArmaanSultaan2 ай бұрын
A couple of thoughts. They trained their model on data generated by Glaive. What if this synthetic data was actually by Anthropic? That would explain why it started saying it's by Anthropic. Obviously it does not explain why the model then switched from being Anthropic to being OpenAI. The other explanation is that it was just hallucinating, the very problem the model is supposed to solve but apparently hasn't. Most important point: I sure as hell remember when I used DeepSeek Coder when it was just released, it used to say all the time that it was by OpenAI. I can't reproduce it anymore, but I remember it very vividly, and it didn't happen once or twice, it was pretty much 80 percent of the time. What I mean to say is that if the only evidence against him in the API situation is the model's own statements, then we don't have anything. We are taking this much more seriously than we should.
@KlimovArtem12 ай бұрын
There is nothing novel in it. It’s just asking the model to think aloud before giving an answer. Such fine tunings are actually done for all public chat models more or less.
@brownpaperbagyea2 ай бұрын
I agree it doesn't make a lot of sense that it would be a grift, because how the hell would he capitalize on this before getting outed? However, almost EVERYTHING I've seen since the release points to it being a grift. I don't care if he truly believes his lies or not. The way he presented the model and benchmarks, the manipulation of stars on their HF repo, and everything that has happened since the release has been very grifty.
@brownpaperbagyea2 ай бұрын
Maybe we should question individuals without research backgrounds dropping models that beat the top-of-the-line offerings. I'm not saying that it can't happen; however, many seem to accept what he says as fact even in the face of controversy after controversy.
@fynnjackson22982 ай бұрын
Love it when you go all philosophical. It would be cool to have you do a rant on the deeper ideas you have about what AI really is and how this all continues evolving into our future. I think AI is a mirror. We have an inspired thought that leads to an action, which then leads to us creating the idea in the physical world. So as we evolve our understanding within us, the technology and what we create outside of us is a kind of mirror or a kind of echo-feedback-loop of our inner journey. Essentially, we are using physical reality as a mirror to wake up to who and what we truly are. AI is just another chapter in this infinite, incredible journey. Buckle up - Things are getting awesome!
@ScottLahteine2 ай бұрын
If you remember that token prediction is based on everything available in the current context, that helps to make these models more useful. Maybe that explains why they are so bad at improvising anything very cohesive. Yesterday I needed a simple Python script to do a very specific set of checks on a text file, so I typed out the precise details of what I wanted in a step-by-step comment, and the model got the code 99% right the first time. “Prompting” is a good term, because you often have to do a lot of prompting to get what you want.
@draken53792 ай бұрын
Do you recall me showing you GPT-3.5 years ago doing insane things? Like trying to email you, controlling an avatar, etc.? Yeah. Prompting is big :)
@Someone7R72 ай бұрын
I did the same thing, and even way better, with just a system prompt. This doesn't need fine-tuning 😒🤨😶
@ashleyrenee48242 ай бұрын
Thank you Matt 😊
@MagnusItland2 ай бұрын
I think the main problem with LLMs is that they are trained on human output, and humans often suck. LLMs are unlikely to learn native self-reflection by emulating Twitter and Reddit.
@Windswept72 ай бұрын
I forget that good prompting isn’t obvious to everyone.
@LjaDj5XQKey9mSDxh42 ай бұрын
Prompt engineering is actually a real thing
@dennisg9672 ай бұрын
I really don't get how a model could "reflect" on the answer it provided to give an even better answer. The initial answer it outputs is supposed to be the one with the highest probability already. How can it use that again to make another answer have an even higher probability?
@kuromiLayfe2 ай бұрын
Well, if you take a trip to the store and the shortest route happens to be closed off, you will have to backtrack and take a slightly longer route to get to the same destination. For your brain, that is reflection: you made a mistake and had to go again, making a new decision to still reach the same endpoint.
@ShivaTD4202 ай бұрын
It's not picking the token that is the right answer, it's picking the next most likely token; it's just a coincidence that these two things align. If I ask you whether yesterday was Sunday, you can just say yes, be correct, and put in minimal effort. You could also say you don't remember, or you aren't sure; these are also technically valid completions of your response. These "think about it" prompts are just forcing the model to use more neurons. If I asked you to talk about how you know yesterday was Sunday, or how you felt on Sunday, then you're using more neurons and spending more joules to respond.
@dennisg9672 ай бұрын
@@ShivaTD420, so you are saying that at first, the model is trying to give an answer while using little information or resources, but if a user prompts it to use more information/resources to come up with a better answer, it will do that? If that's what you mean, it sounds like an additional prompt from the user is needed. If the model were to prompt itself to use more info/resources, I don't see the point in figuring out the first, less complete, answer. Let me know what you think
@dennisg9672 ай бұрын
@@kuromiLayfe, but in your example, you gain more information by finding out that the first route is blocked off. How does the model gain more information between the initial response and the more thought out response?
@kuromiLayfe2 ай бұрын
@dennisg967 Branching thought processes: you have already seen a different route along the way, but your main one was cut off or wrong, so you think about the other one you also already learned about.
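One way to answer the question in this thread: the model isn't reusing the same distribution, it's conditioning on more text. Each "reflection" token it emits becomes part of the context, and the next-token probabilities are conditional on that larger context. A toy table of conditional distributions (the numbers are entirely made up) shows the mechanism:

```python
# Toy next-token distributions keyed by context. With only the bare
# question in context, the most likely answer token is wrong; once
# self-generated reasoning tokens are appended, the conditional
# distribution shifts and the right token becomes most likely.
dist = {
    "Q": {"wrong": 0.6, "right": 0.4},
    "Q + reasoning": {"wrong": 0.2, "right": 0.8},
}

def greedy(context):
    d = dist[context]
    return max(d, key=d.get)

print(greedy("Q"))             # wrong
print(greedy("Q + reasoning")) # right
```

So "reflecting" doesn't re-rank the same answer; it moves the model to a different point in context space, where a different answer is the most probable one.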
@YaelMendez2 ай бұрын
It’s an amazing platform.
@ytubeanon2 ай бұрын
I randomly saw some of Matt Schumer's stream about Reflection; he rubbed me the wrong way, seemed overly egotistical about "reflection"... You'd think there'd be some way to use AI to reverse-engineer optimal prompts: have it run tests against the answer sheet overnight and rank the prompt templates that generated the best results. I would like to see a video with gpt-4o-mini-reflection.
@travisporco2 ай бұрын
is it really true that they've established that the api was a wrapper for Claude? I don't think so.
@Copa207772 ай бұрын
Thanks matt ☀
@FRareDom2 ай бұрын
We need to wait for the 405b model to really say anything.
@InsideYouTubeMinds2 ай бұрын
It would've been better if you named the video "NEW LLM MODEL HAS DRAMA" or something similar; I would've clicked instantly. Just hearing about a new LLM doesn't excite many people.
@DanieleH-t5v2 ай бұрын
OK, I'm no pro in this area of AI, but all I can gather is that something shady is happening 😅
@iminumst78272 ай бұрын
From the beginning, I interpreted this model to be a prompt-engineering / architecture improvement used to fine-tune the model. I never expected a huge leap forward, and the "reflection" process does eat up some tokens. However, I had read papers showing that even just having an LLM double-check itself noticeably improves performance. From my personal testing, I found that Reflection did beat Claude's free model on logic-based questions. It's obviously no competitor to GPT-5, and I don't expect even the bigger Reflection model to be. Sure, maybe for the benchmarks he just used some cherry-picking and prompt manipulation to make the model seem too powerful, but in reality it's still more powerful than Llama, so I don't see how it's a scam, really.
@TLabsLLC-AI-Development2 ай бұрын
Exactly. 💯
@michelprins2 ай бұрын
"It's obviously no competitor to GPT-5"? How do you know that? Maybe GPT-5 is just GPT-4.5 with the same trick built in; we can't tell, as there is no transparency behind the closed-model wall, and also a lot of paid-for hype! Did you try Altman's video AI yet, for example? Open source is the only way forward! Or pay 2000 dollars a month :P
@m2mdohkun2 ай бұрын
What's positive about this is I get a good system prompt? Noice!
@JustaSprigofMint2 ай бұрын
I'm turning 36 in 7 days. I'm really fascinated by AI. Is it still possible for me to get into programming, or is it just an out-of-reach pipedream? I feel like I'm too late. I was never very confident in my programming skills in school, and we only learned the basic stuff. Even C++ didn't make a lot of sense to me, while my elder brother was the best in his class. But I believe I want to work in this field. How/what can I do?
@monstercolorfunco43912 ай бұрын
Humans have parallel logic paths to double-check every step of their maths, their counting, their deductions, so we can make a query take parallel checks in LLMs too. Volumetrically, think of it like traversing the NN on different paths and summing the result. It's a genius tweak. Inner conversation is also like 3-4 brains working together through notes, so we can use a 70B LLM like 2x 70B LLMs.
@agnosticatheist40932 ай бұрын
For me, Mistral Large is so far the best model.
@MONTY-YTNOM2 ай бұрын
I don't see it as an option in the LLM list now
@vickmackey242 ай бұрын
Only 67 GitHub contributions in the past year, he doesn't know what a LoRA is, and you think this guy is a serious AI leader/developer? C'mon.
@tommylir11702 ай бұрын
Absolute scam. Not only did they use a Claude wrapper, but the reflection prompt made Claude perform worse too 😂
@TheFeedRocket2 ай бұрын
I really think models will continue to get even smaller and actively learn, but not do everything. I want to one day have my own model that can actively learn from me: as I talk to it, it will learn. Then it can learn what I like and what I need. Basically, we should all be able to fine-tune models we run locally on our devices or robots that know us; my model doesn't need to know everything. Also, we should have many types of models that can talk to each other. An AI robot delivering my mail doesn't need a huge AGI model; it doesn't need to know how to fix cars, build programs, or solve science problems. Heck, if my garbage robot doesn't know how many r's are in strawberry, who cares? It just needs the basics plus info on garbage disposal: types, toxins, interactions with life, etc. I think the idea of one model to rule them all is wrong. For example, I would rather use Ideogram for logos, MidJourney for art, and Flux for realism. We need AI that excels in certain areas and talks to other AI that excels in others. AI agents and teams will be the future; it might even be safer.
@Alex-nk8bw2 ай бұрын
The model might be a hoax, but the system prompt is working really well. That's something at least, I guess. ;-)
@SCHaworth2 ай бұрын
Isn't "Hyperbolic Labs" kind of a red flag?
@robertopreatoni2 ай бұрын
Why is he streaming from his sister's bedroom?
@JohnWeas2 ай бұрын
YOOO MATT
@quercus32902 ай бұрын
Nvidia/Microsoft's Megatron is a 500-billion-parameter model.
@domehouse792 ай бұрын
Nerds are entertaining.
@RenatoFlorencia2 ай бұрын
PAPO RETOOOOOOO
@jamessharkin2 ай бұрын
Have you ever used that comb you are vigorously waving around? 🤔😁😆
@MattVidPro2 ай бұрын
BAHAHAH 😅
@haroldpierre17262 ай бұрын
Lots of grifters during the AI hype train starting with Altman, Musk, etc. So, everything has be taken with a grain of salt.
@snintendog2 ай бұрын
Grifters... the people that made the most AI contributions? And not every company under the sun calling a telephone system an AI? Riiiiigghhhht.
@haroldpierre17262 ай бұрын
@@snintendog Sometimes even our heroes lie.
@SpeedyCreates2 ай бұрын
@snintendog 😂 fr, thought the same. They ain't grifters; they all pushed the industry forward so damn much.
@MusicalGeniusBar2 ай бұрын
Super confusing story 😵💫
@MattVidPro2 ай бұрын
Yeah and still not adding up...
@Norem1232 ай бұрын
Second
@thedannybseries88572 ай бұрын
lol
@ShiroAisan2 ай бұрын
oppp
@supermandem2 ай бұрын
AI died when Matt Schumer lied!
@SkyEther2 ай бұрын
Lmao with the how many Ls problem
@cbnewham56332 ай бұрын
16:47 ALLEGEDLY lied. Unless you want to be sued 😏
@TPCDAZ2 ай бұрын
He said "apparently", which works just fine; it means "as far as one knows or can see".
@cbnewham56332 ай бұрын
@TPCDAZ No he didn't. He said "we assume he would know more" followed by "he lied".
@cbnewham56332 ай бұрын
I doubt he will be sued, but sometimes these people can get bent out of shape and do silly things, especially when under fire. Personally, I wouldn't have said that, and I'd have second thoughts about leaving it up. Matt clearly says he lied; that's slander.
@TPCDAZ2 ай бұрын
@cbnewham5633 No, he clearly says "Now apparently he's lied about the whole API situation with Claude." I have ears and so does everyone else. This video also has captions where it is written in black and white. So don't sit there and lie to people.
@michelprins2 ай бұрын
YOU NEED TO RETHINK HOW YOU PUT a commercial in the middle of your message. You're like a host who invited us for a nice dinner and, in the middle of preparing it, informs us he's taking a large dump, killing the appetite. If you really need that extra cash, at least do it at the end like all the other wise youtubers; the way you do it now shows us you have more respect for the commercials than for your viewers. Not nice. And also give Matt Shumer a chance to show his method works. Apply the same scepticism to Altman's claims, like the AI video stuff we're still waiting for! Q* is now used for training their biggest model, and the only transparency "Open" AI gave was a name change to Strawberry with 3 r's, and you all swallowed that. It's white, but I won't assume it's whipped cream without testing it. BTW, no need to comb your hair. ;)
@gabrielkasonde3672 ай бұрын
First comment Matt
@InternetetWanderer2 ай бұрын
First?
@coinwhere2 ай бұрын
Schumer has only made miscellaneous LLM-related apps, and that's it.