Google being more open than OpenAI is the craziest turnaround of the past 3 years
@GeneralKenobi69420 9 months ago
Nah, the biggest turnaround is definitely Meta. Everyone hated Zucc just 5 years ago but since Llama happened he's basically become the messiah
@MrBrukmann 9 months ago
@@GeneralKenobi69420 He could do that or become irrelevant quickly; it was a strong business decision, but hardly praiseworthy. You're correct it's the biggest turnaround, as long as people understand it will turn right back around the instant it is more profitable to do so. Investor corporations can inadvertently do ethical and moral things, but legally they cannot actually do them for ethical and moral reasons.
@enes-the-cat-father 9 months ago
GloVe, BERT and T5 in the meantime: am I a joke to you?
@ddd12343 9 months ago
Can't wait to tell my employer I want to try transformer mamba jamba in my project
@kalimero86 9 months ago
It's refreshing to see a guy telling the AI news with no extra hype, no overreacting or fireworks all over the place.
@wolpumba4099 9 months ago
*Abstract*
This video summarizes recent advancements in the field of AI, encompassing new models, applications, research trends, and evaluation benchmarks. Highlights include the emergence of hybrid models like Jamba and efficient large language models like DBRX, demonstrating progress in performance and resource utilization. Research efforts delve into specialized applications like text-to-video generation, personalized vision-language models, and 3D content creation from text prompts. Additionally, the increasing use of synthetic data for training and the exploration of model merging techniques showcase the evolving landscape of AI development. Finally, the video covers updates on benchmark leaderboards and the release of valuable datasets for tasks like OCR, reflecting the ongoing efforts in evaluating and advancing AI models.

*New AI Models*
* *0:00* - *Jamba:* A hybrid model by AI21 Labs that combines the Mamba architecture with attention layers. It achieves long-context inference performance without excessive memory requirements.
* *1:49* - *DBRX:* A state-of-the-art open LLM by Databricks with 132 billion parameters (36 billion active at a time). It uses a mixture-of-experts architecture (see the toy MoE sketch after this comment) and performs well in natural language understanding, programming, and math.
* *4:01* - *Command R+:* A premium model by Cohere that offers improved performance and retrieval-augmented generation capabilities. It supports multiple languages and is trained for citations and tool use.
* *5:36* - *Mistral 7B:* A new base model by Mistral AI with a 32,000-token context window. It serves as the foundation for their new instruct model.

*Advances in Specific AI Applications*
* *5:57* - *VideoPoet (Google Research):* A text-to-video model capable of zero-shot video generation, image-to-video stylization, and other text and video intermingling tasks.
* *6:34* - *MagicLens (Google DeepMind):* An image retrieval system using open-ended instructions and trained with synthetic data generation.
* *7:48* - *Cosmopedia (Hugging Face):* An open-source reproduction of Microsoft's Phi models, which are trained on synthetically generated textbooks.
* *8:54* - *Long-Form Factuality (Google DeepMind):* Research focused on improving factuality in large language models through a new dataset, an evaluation method using LLM agents (SAFE), and a model tackling long-form factual consistency.
* *10:20* - *MyVLM (Snap & Tel Aviv University):* This paper explores the personalization of vision-language models by teaching them concepts like "me", "my dog", and "my friend" to understand and respond to prompts from an egocentric perspective.
* *11:28* - *LATTE3D (Nvidia):* A text-to-3D model capable of generating high-quality textured meshes from text prompts in just 400 milliseconds.
* *12:13* - *SceneScript (Meta):* A model aimed at understanding physical spaces to support augmented reality applications. It reconstructs and annotates environments and uses synthetic data for training.
* *12:48* - *CHAMP:* A human image animation method that allows realistic and dynamic movement generation from a single image.

*Emerging Trends and Research*
* *13:40* - *Evolving Foundation Models (Sakana AI):* Research into evolutionary model merging, which combines existing fine-tuned models using techniques inspired by neural architecture search.
* *14:37* - *RadSplat (Google Research):* A novel approach to rendering radiance fields at high speeds (over 900 frames per second) using Gaussian splatting.
* *15:17* - *Moirai (Salesforce AI Research):* A time-series foundation model for universal forecasting, aiming to provide a single model for various types of time-series data.
* *17:03* - *H2O-Danube2 (H2O.ai):* A 1.8 billion parameter foundation language model trained on a massive dataset and performing well against similar-sized models.
* *18:19* - *Octopus V2 (Nexa AI):* A 2 billion parameter model optimized for tool use and function calling within the Android API, enabling natural language interaction with Android devices.
* *18:49* - *Dolphin 2.8 (Cognitive Computations):* An uncensored fine-tune of Mistral 7B, removing bias and alignment limitations while requiring users to implement their own safety measures.
* *20:32* - *Reaching Llama 2 Performance with $0.1 Million (JetMoE):* Exploration of cost-efficient training methods for large language models, achieving performance comparable to Llama 2 with significantly lower resources.
* *21:57* - *Qwen 1.5 MoE:* A language model demonstrating that models with fewer active parameters (through techniques like mixture of experts) can achieve similar performance to larger, densely activated models.

*Evaluation and Benchmarks*
* *23:24* - *LMSYS Chatbot Arena Leaderboard:* Claude 3 Opus is currently leading, with the smaller Starling 7B model showing competitive performance.
* *24:37* - *Berkeley Function Calling Leaderboard:* Claude 3 tops this leaderboard, which evaluates the ability of language models to call functions effectively.
* *25:05* - *TREC Deep Learning Track on Retrieval-Augmented Generation:* The Text REtrieval Conference (TREC) now includes a track focused on evaluating retrieval-augmented generation systems.
* *25:33* - *Photographic Memory Evaluation Suite (Lamini):* A collection of benchmarks designed to assess a model's ability to accurately recall and match specific information.
* *26:25* - *Two Largest Open Datasets for OCR Released:* Clement Delangue highlights the release of two extensive datasets aimed at advancing optical character recognition (OCR) technology.

I used Gemini 1.5 Pro to summarize the transcript. Token count: 10,365 / 1,048,576.
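Since several of the models above (DBRX, JetMoE, Qwen 1.5 MoE) lean on the mixture-of-experts idea, here is the toy sketch referenced in the DBRX bullet: a minimal top-k expert-routing layer in PyTorch. This is a hedged illustration only; the layer sizes are made up and it is not the actual DBRX, JetMoE, or Qwen code, just the general mechanism by which only a fraction of a model's total parameters (the "active parameters") runs for any given token.

```python
# Toy mixture-of-experts layer: a router picks top-k experts per token,
# so only a fraction of the total parameters is used for each input.
# Dimensions are illustrative, not those of any real model.
import torch
import torch.nn as nn


class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            ]
        )
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = torch.softmax(weights, dim=-1)        # normalize gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out


if __name__ == "__main__":
    tokens = torch.randn(5, 64)
    print(TinyMoELayer()(tokens).shape)  # torch.Size([5, 64])
```

With 8 experts and top_k=2, only a quarter of the expert parameters run per token, which is the same trick that lets a model like DBRX keep roughly 36B of its 132B parameters active at a time.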
@guillaumevermeillesanchezm2427 9 months ago
Aaaaaaaaaaaah what an amazing Monday!
@sofia.eris.bauhaus 9 months ago
happy monday to all who celebrate!
@Dron008 9 months ago
Who stole my Sunday?
@rubberduckdebug 9 months ago
I swear there's more AI startups than actual models. Neat stuff though!
@daan3298 9 months ago
Wonderful Monday to you too!
@vko-reche 9 months ago
The references are missing?
@XOPOIIIO 9 months ago
Testing a factuality model against human evaluation is the worst idea possible. Humans, even the smartest ones, are usually bad at perceiving facts. To make a model factually correct you should optimize for it not contradicting itself. It should ask itself questions and answer them, finding and resolving any contradictions, correcting its views through constant "cognitive dissonance". That is the only way to create an objective worldview.
@keypey8256 9 months ago
I think it should be able to contradict itself, as long as it later adds that something else is actually true when it notices a mistake. That would be better than it hallucinating stuff
@zaripych 9 months ago
Hmm. How does it know it contradicts itself? Does it then need to go and verify its reasoning about whether it contradicts or not? Seems like an infinite loop. They only seem to reason ... that SAFE thing is a nice idea, but it just decreases the chances of having a bad hallucination ... leaving us with the good ones, verified ... using other hallucinations. So it's still a hallucination. 😅
@XOPOIIIO 9 months ago
@@zaripych If LLMs can answer questions and ask questions, then they definitely could find contradictions. Like: I believe A and I believe B; does B contradict A or not? I think LLMs can be pretty good at identifying that.
@zaripych 9 months ago
@@XOPOIIIO I agree, they are pretty good, until they are not. I speak from personal experience; I was riding this hype train for months, using LLMs for code analysis and such. Sometimes they just generate these contradictions. You state the obvious in the prompt and add more context that proves otherwise, and they are still "insisting" on something dumb. I'm not saying it's completely useless, it just cannot be used for anything important without human supervision. Add supervision and you are good, I guess.
@XOPOIIIO 9 months ago
@@zaripych Yes, I mean we should make LLMs admit mistakes and learn from them.
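For what it's worth, the contradiction-hunting loop discussed in this thread is easy to prototype. Below is a rough, hedged sketch: `ask_llm` is a hypothetical placeholder for whatever chat-completion API you use, and the loop is just the commenters' idea written out, not SAFE or any other published method.

```python
# Rough sketch of a self-consistency / contradiction-hunting loop.
# `ask_llm` is a hypothetical stand-in for any chat-completion call.
from itertools import combinations


def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM API of choice")


def find_contradictions(answer: str) -> list[tuple[str, str]]:
    # 1. Ask the model to split its own answer into atomic factual claims.
    claims = [
        line.strip()
        for line in ask_llm(
            f"List the factual claims made in the following text, one per line:\n{answer}"
        ).splitlines()
        if line.strip()
    ]
    # 2. Ask it, pair by pair, whether two of its claims contradict each other.
    contradictions = []
    for a, b in combinations(claims, 2):
        verdict = ask_llm(
            f"Do these two statements contradict each other? Answer yes or no.\nA: {a}\nB: {b}"
        )
        if verdict.strip().lower().startswith("yes"):
            contradictions.append((a, b))
    return contradictions
```

In practice you would cap the number of claim pairs and feed any detected contradictions back to the model for revision, which is roughly the "admit mistakes and learn from them" step mentioned above.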
@grafzhl 9 months ago
In this video: Yannic discovers dark patterns in cookie banners.
@bentationfunkiloglio 9 months ago
I don’t know why but these videos make me anxious. I need to start drinking before watching.
@Hexanitrobenzene 9 months ago
I bet it's a feeling of change happening faster than we can adapt...
@HoriaCristescu 9 months ago
You used to link all the original sources; can you add that back?
@JorgetePanete 9 months ago
There's also RWKV-5 and RWKV-6
@eduarddronnik5155 9 months ago
It's like Saturday though? Have a nice Monday too 😂
@guillaumevermeillesanchezm2427 9 months ago
Nah, when ML News is released, it's Monday!
@neelsg 9 months ago
(P1) ML News comes out on Mondays. (P2) ML News came out today. (C) Therefore, today is Monday. My logic is undeniable
@ugthefluffster 9 months ago
Isn't the Q* description very similar to JEPA?
@darnokjarud9975 9 months ago
Yeah, Yann LeCun was right again
@bryce.ferenczi 9 months ago
@@darnokjarud9975 It's not a very novel idea...
@madichelp0 9 months ago
Garment3DGen is predicting a future where we walk around naked with our AR headsets on and it just virtually applies clothes to everyone.
@wwkk4964 9 months ago
You are my fav AI YouTuber!
@andrzejreinke 9 months ago
Is there a list of the papers from the video somewhere?
@GeoffLadwig 7 months ago
great, thanks
@Snake369 9 months ago
cool stuff thanks for sharing
@serta5727 9 months ago
Lots of things going on 🤯
@sonOfLiberty100 9 months ago
Hey Yannic, I love that you are really humble and analyze things pretty rationally. Lately the AI hype has gotten out of hand.
@JorgetePanete 9 months ago
Did you say AI22 at the start?
@KolTregaskes 9 months ago
It's Saturday, Yannic. :-)
@otrqffaimajg 9 months ago
This guy wears sunglasses to look at his screen. Meanwhile I'm going blind looking at his light mode.
@sofia.eris.bauhaus 9 months ago
doesn't "open weight" mean that the model itself is freely licensed, but not the training data/process? i think that's what most "open source" models are, but i haven't really looked into that. if the weights are publicly available but with no permission to use, modify and redistribute it that i guess that should be called "weights available", analogous to "source available" licensing.
@juanjesusligero391 9 months ago
Best monday ever.
@DrNioky 9 months ago
Watching this on Monday like "woah he predicted it, AI is taking over"
@MariuszWoloszyn 9 months ago
Monday?
@asdasd-yr7wi 9 months ago
I am shocked, normally you link the stuff you are talking about in the description?! Pls give me some sources. I need them dearly.
@Raminber 9 months ago
Name of Soundtrack?
@vitalic_drms 9 months ago
JAMBA is now, JAMBA is forever; if you don't know about JAMBA it's probably over for you
@johnheilman818 9 months ago
I feel like they are looking at how to learn from far less data. If they succeed, I predict we'll have another burst of capabilities.
@nobo6687 9 months ago
Next paper: "No Context Left Behind"
@TylerMatthewHarris 9 months ago
What time zone are you in? It’s like 4 AM where I’m at, lol . Love it
@100c0c 9 months ago
Somewhere in Europe I would assume.
@Hexanitrobenzene 9 months ago
Yannic is from Switzerland, UTC +1.
@pedrogorilla483 9 months ago
Thank goodness, a no-hype AI news YouTube video.
@gileneusz 9 months ago
A week behind in AI means this video is very, very old, like a few years old if you compare it to other topics
@TheTruthOfAI 9 months ago
It's funky... this augmented era, where LLMs are trained on their own long hallucinations. I mean, if you want a nonsense parrot... but how can self-augmented trash have any grounding in reality? Like "the capital of France is Kuala Lumpur" and "how nice the holidays in KL, France were" XD
@OrlyLiba 9 months ago
Thanks for the AI news! Can you please add timestamps to your videos? That would be really useful and appreciated. Thanks!
@Hexanitrobenzene 9 months ago
Hm, he used to do that. The references are also missing. He seems to be in a hurry.
@BooleanDisorder 9 months ago
I'd love to see a byte-based model with SSM and Transformer layers, in 1.58-bit.
@Wobbothe3rd 9 months ago
There's nothing illegal about training an AI on copyrighted data. Copyright protects against distribution of copyrighted content; merely learning from copyrighted content doesn't automatically make all of the output of that model a "derivative work." If a model can be used to reproduce content, that's one thing, but that has to be shown to be possible through the actual rules of the product, and not elaborate hacks.
@eranfeit 8 months ago
Please add chapters to your videos
@Calbefraques 9 months ago
I just pretend I’m a time traveler 🎉
@kevinamiri909 9 months ago
More than one week of latency
@asnaeb2 9 months ago
Mamba numba 9
@kikijuju4809 9 months ago
I hate Mondays
@Falamhi 9 months ago
AI news is fast; it is Monday already
@existenceisillusion6528 9 months ago
The Q* thing dropped about a month ago. Not from OpenAI, but it does everything the 'leak' claims.
@sonOfLiberty100 9 months ago
The Q* leak is 100% fake. People discussed this on Reddit and most of them said it's BS