i hope you find the video useful and don't forget to show (and brag about) your blended models!
@TheEarlVix9 ай бұрын
Thank you. Found this from your post on X.
@chrisBruner9 ай бұрын
Your videos are always very good and cutting edge.
@qiqqaqwerty17139 ай бұрын
Thanks for the very informative video. Cheers from "Down Under"!
@XUHANSHENG-i3c9 ай бұрын
Your videos are so great! I will stay humble and learn as much as I can!
@Karthikarjunan166 ай бұрын
Hi beauty, your YouTube video link can't be embedded in other apps such as Notion. Before publishing a video, check YouTube Studio and select the option that allows embedding.
@qiqqaqwerty17139 ай бұрын
🎯 Key Takeaways for quick navigation. This summary is no substitute for watching the complete video for a more in-depth understanding.

Main Ideas:
- 🌍 Model blending is an innovative approach to surpass the performance of high-cost models with limited resources.
- 🤖 Non-experts can effectively blend models, demonstrating the technique's accessibility.
- 💡 The blend allows for specialized functionality, combining models tuned for diverse tasks into a powerhouse model.
- 🛠 The merging process involves selecting compatible models, defining parameters, and executing the blend with basic command line knowledge.
- 🔄 Various blending methods like task vector arithmetic and SLERP offer unique advantages for custom model creation.
- 📚 Proper selection and preparation of models are crucial, with a focus on architecture compatibility and avoiding common pitfalls.
- 🏆 Blended models can achieve top rankings on leaderboards, though their position may fluctuate.
- 🤔 The effectiveness of benchmarks in evaluating model intelligence is questioned, highlighting the issue of data contamination.

Takeaways:

00:00 *🤖 Introduction to Model Blending*
- Introduction to the concepts of model blending, showcasing the power of combining models to overcome resource limitations and improve performance.
- Highlights two models, Mixol and Ramonda, emphasizing the potential of model blending even with limited resources.

01:24 *📘 Basics of Model Blending*
- Detailed explanation of model blending, its significance, and the methodology behind efficient blending.
- Discusses the blending process, the importance of model selection, and the steps involved in creating a blended model.

02:05 *💡 The Promise of Blending*
- Explores the potential of blending models to create top-performing LLMs without the need for extensive resources.
- Focus on the accessibility of fine-tuning and blending for personalized model development.

03:33 *🛠️ How to Blend Models*
- Provides a practical guide on blending models using MergeKit, including setup and execution steps.
- Emphasizes the ease of blending models with basic knowledge and the right tools, offering an approachable method for enthusiasts and professionals alike.

05:33 *🧪 Detailed Blending Methods*
- Deep dive into various blending techniques such as task vector arithmetic, SLERP, TIES, and DARE, explaining their unique applications and benefits.
- Discusses the technical aspects of model blending, offering insights into choosing the right method for specific goals.

08:17 *🖥️ Preparing for Blending*
- Guidelines on selecting compatible models for blending, emphasizing the importance of architecture and layer compatibility.
- Instructions for downloading models from Hugging Face and preparing for the blending process.

10:33 *📝 Configuring YAML for Blending*
- Step-by-step instructions on setting up YAML files for blending, highlighting the importance of specifying base models, merge methods, and parameters.
- Offers practical tips for configuring blending parameters to optimize the blending process.

13:42 *🚀 Executing the Blend and Evaluation*
- Detailed walkthrough of the blending execution using MergeKit and subsequent evaluation through a text generation interface.
- Encourages testing and fine-tuning of the blended model before submission to benchmarks or public use.

15:45 *📊 Performance Testing and Data Contamination*
- Discusses the significance of performance testing on open LLM leaderboards and addresses the issue of data contamination in model training.
- Highlights the importance of careful model selection and blending strategy to avoid overfitting and ensure genuine improvements in model performance.

I hope this helps everybody!
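For anyone who wants to see what "task vector arithmetic" from the summary above might look like in practice, here is a minimal toy sketch. It is not MergeKit's implementation: the "models" are tiny dictionaries of made-up numbers, and the names `task_vector` and `apply_task_vectors` are hypothetical helpers invented for this illustration. The idea it tries to show is that each fine-tuned model is reduced to its difference from a shared base, and a weighted sum of those differences is added back to the base.

```python
# Toy illustration of task-vector arithmetic on tiny "models".
# Each model is a dict mapping a parameter name to a flat list of floats.
# All names and numbers are made up; this is not MergeKit's code.

def task_vector(finetuned, base):
    """Return the per-parameter difference finetuned - base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }

def apply_task_vectors(base, vectors, weights):
    """Add a weighted sum of task vectors to a copy of the base weights."""
    merged = {}
    for name, base_values in base.items():
        merged[name] = list(base_values)
        for vector, weight in zip(vectors, weights):
            for i, delta in enumerate(vector[name]):
                merged[name][i] += weight * delta
    return merged

# A shared base model and two hypothetical fine-tunes of it.
base = {"layer.weight": [1.0, 2.0, 3.0]}
coder = {"layer.weight": [1.5, 2.0, 2.0]}
chatty = {"layer.weight": [1.0, 3.0, 3.5]}

vectors = [task_vector(coder, base), task_vector(chatty, base)]
merged = apply_task_vectors(base, vectors, weights=[0.5, 0.5])
print(merged)
```

Both fine-tunes have to share the same base and the same parameter shapes for the subtraction to make sense, which is one reason the video stresses architecture compatibility.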
@Santiino9 ай бұрын
It's mind-blowing to me how good your videos are, yet you are still so unknown. Keep it up!
@Ludecan9 ай бұрын
Awesome video!! Been following your series on building AI agents and they're very good! Thanks for sharing!
@mysticaltech8 ай бұрын
Maya, you are good at this stuff. you are averaging my internal mind vectors to make Ai easy. Keep doing so!
@ivandukic9 ай бұрын
Wow, what an incredible explanation of merge methods. Thank you.
@s-miguel9 ай бұрын
This is destined to evolve into a meticulously curated, go-to channel of human reliability for years to come. Thank you very much for the exceptional quality you provide!
@Nifty-Stuff8 ай бұрын
Blending LLMs is a fascinating idea. The idea left me wondering: Why hasn't anybody developed a system/app that takes the APIs from the top LLMs, creates agents for each, and then has these agents all work together to brainstorm, debate, review, and solve problems? I often get 4 different answers from 4 LLMs, so why not have them all set up as agents "in one room" working together to come up with the "best" solution. I can't find anybody that's tried this... why not? Wouldn't having the "top minds" (LLMs) working together produce better results?
@bamh1re3187 ай бұрын
they could become worse, or give you 4 different answers, or could not stop talking among themselves
@knawdawg3 ай бұрын
A few reasons: APIs don't give you any information about model structure and are pre-customized. Most of the data is proprietary, and for it to even make sense you have to train it over and over. We are FAR from perfect; innovation is found by understanding flaws. Trial and error, societal brute force. I somehow have a local model on my phone that is very, very generative. What's next?
@SebastianKreutzberger9 ай бұрын
Fantastic video, so well prepared, fool-proof explained, and a really cutting-edge topic. Best AI YouTuber out there - thank you 🙏
@EduGuti90009 ай бұрын
Thank you!
@Dan-zd1qr4 ай бұрын
I've learned a lot more about LLMs, architecture, and merging principles in just a few minutes.
@seanhynes95167 ай бұрын
Awesome, thanks for the great video. Very well explained, great diagrams! :)
@scienceandmind30659 ай бұрын
Great video and exactly what I need at the moment. Having a lot of specialized models for science, translation, coding, finance etc but no good way of combining them.
@maya-akim9 ай бұрын
best of luck! and share with us your results if you want :)
@blim4209 ай бұрын
Excellent walk through, thanks !
@HealthyNutrition-y2 ай бұрын
🔴🔵“Intelligence is compression of information.” This is one of the most useful videos I believe I have ever watched on YouTube.
@robinmordasiewicz9 ай бұрын
wow, most sophisticated YouTuber ever. New favorite channel.
@deter33 ай бұрын
You are a really experienced AI engineer.
@VFraunhofer2 ай бұрын
That was a very interesting video, thanks for your work!
@jeremybristol43749 ай бұрын
Love the props with storytelling! Great instructional video!
@overcuriousity8 ай бұрын
interesting, easy to follow, well researched and critically scrutinized the results. like your content!
@leumas_tai9 ай бұрын
Great video. How does this differ from the Mixture of Experts (MOE)?
@maya-akim9 ай бұрын
that's an excellent question! first of all, I noticed that the community doesn't consider MoE to be merged models, even though you can use mergekit to create MoE yourself (instructions in the description box). My understanding is that blended models become "fixed" when it comes to their capabilities. MoE capabilities change dynamically thanks to a gating mechanism that decides how much of each expert's advice to follow for a given input. You specify prompts (or simple strings with mergekit) that activate a specific expert. For example, here's a configuration that I used for MoE: huggingface.co/mayacinka/West-Ramen-7Bx4 as you can see, positive and negative prompts will "guide" the model.
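To make the "gating mechanism" idea in the reply above a bit more concrete, here is a minimal sketch of a dense softmax gate over two experts. Everything in it is an assumption made for illustration: the expert outputs, the gate scores, and the helper names `softmax` and `moe_mix` are invented, and real MoE layers typically route per token and often keep only the top few experts rather than mixing all of them. The point is only that the mixing weights change with the input, whereas a blended model has one fixed set of weights.

```python
import math

def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_mix(gate_scores, expert_outputs):
    """Weight each expert's output by its gate weight and sum them."""
    weights = softmax(gate_scores)
    size = len(expert_outputs[0])
    mixed = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(size)
    ]
    return weights, mixed

# Two hypothetical experts, each producing a 2-dimensional output.
expert_outputs = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical gate scores for two different inputs: the first input
# favors expert 0, the second favors expert 1.
code_weights, code_mix = moe_mix([2.0, 0.0], expert_outputs)
chat_weights, chat_mix = moe_mix([0.0, 2.0], expert_outputs)

print(code_weights, code_mix)
print(chat_weights, chat_mix)
```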
@leumas_tai9 ай бұрын
@@maya-akim interesting. thanks for sharing your thoughts, I'll look into it.
@geekyprogrammer48319 ай бұрын
Very underrated channel!! This is enlightening. How a person can be so smart and beautiful too at the same time 😭😭
@InitGet3 ай бұрын
*thwap* First warning. Next time you go to horny jail... She is quite lovely though
@Rob_Steele9 ай бұрын
Great video Maya! Keep em coming! 😎
@hand-eye45179 ай бұрын
We thank you for all the amazing content, and as such, you being a great content creator, I don't wanna sound nitpicky, but since you are already attracting and leaning towards the DIY crowd you may as well be using the open source tools as well (VSCodium etc.). Just a small critique because I love the content.
@maya-akim9 ай бұрын
hey thanks for the support and feedback 🙏🏻 I'm not sure I totally follow. Do you suggest that I switch to Codium? Honestly, before your comment I assumed that VScode is open source, but after googling a bit I realized that the product itself actually isn't. But it looks like Codium is open source, so you think that that's a better fit for the channel?
@iasdfasdfasdfasdf3 ай бұрын
Wow, where did you go? Your video editing and the use of paper cutouts is so unique.
@GetzAI9 ай бұрын
Thanks Maya!
@ajay--yadav9 ай бұрын
lot of information about so many topics presented nicely.
@minae14239 ай бұрын
well articulated and educational video, thank you Maya!🙏🏼
@mayorc9 ай бұрын
Great video Maya. Keep it up ❕❕❕
@Cloudvenus6669 ай бұрын
What happens if you merge two models of the same family but they each have different context lengths? Does the model with the larger token window take precedence?
@maya-akim9 ай бұрын
it will depend on the "base model". But in cases that don't require defining a base model (like passthrough), or in this hacky case here: huggingface.co/mayacinka/chatty-djinn-14B, where I merged models with 32K and 8K context windows, the 32K models overpowered the 8K open chat model.
@Cloudvenus6669 ай бұрын
@@maya-akim thank you
@gerykis9 ай бұрын
Very good explanation. I'm looking for a similarly easy-to-understand video on how to fine-tune a model locally.
@Linguisticsfreak7 ай бұрын
Since we don't have access to the training data, it is simply impossible/unfeasible to choose models based on whether they have or don't have contaminated data.
@lokeshart33409 ай бұрын
Can we blend multimodal models like llava and mistral and gemini vision? Can u make a video on it pls..❤❤
@maya-akim9 ай бұрын
oh that's interesting, I got to say I didn't try but I'm curious myself! I'll see how it goes and either I'll make a video or I'll let you know somehow
@lokeshart33409 ай бұрын
@@maya-akim sure.
@VFraunhofer2 ай бұрын
That's exactly what I was thinking about. It would be nice not to have to use a lot of different models and make them communicate. @maya-akim: Have you tried since?
@lokeshart33402 ай бұрын
@@VFraunhofer yes i am still waiting
@CDIGS-EI-hv3cf4 ай бұрын
Hi Maya, thank you for this video, which is one of the best recent ones on this topic. But I'm sorry, I still haven't understood it... What do I have to do after the mergekit installation? How do I run the process? Can you make a video where you show the process? I also tried the model tree, but in my case it did not show anything...
@DemiGoodUA9 ай бұрын
Nice Video! Do we have the ability to fine tune the model on own codebase?
@amandamate91179 ай бұрын
can you test agent frameworks like CrewAi with Claude 3 opus?
@doomstertech83059 ай бұрын
great video, loved the explanation of all the technical stuff. Would love to know your process on how you read and understand these topics in-depth?
@xspydazx9 ай бұрын
Very good lesson and explanation! So far the best on this subject, as the main problem I had was running the models afterwards. I could not find a definitive method that works... Despite one of the models scoring high, it could not run in the HF Inference plugin on the model card.
@rein4369 ай бұрын
Very insightful 👍
@synchro-dentally19659 ай бұрын
Excellent video! The development outlook seems open to so many possibilities. I'm curious if anyone will find advantages in networks built via diffusions(similar to image generation) or if there will be more real time dynamics implemented as the model responds to a query.
@abdallamosa88368 ай бұрын
Is combining tools like SWE-Agent, Crew AI, and OS-Copilot into a cohesive agentic workflow possible?
@justinwhite27259 ай бұрын
LLMs catching up to something Stable Diffusion users have been doing for awhile. Open source is the way.
@axe8638 ай бұрын
Stacked and Cascading ensembling have been around for awhile
@JoelSiby-ju5pf8 ай бұрын
after that, could I use my customized model from Hugging Face or locally in my apps?
@JoelSiby-ju5pf8 ай бұрын
also, now that I have decided to use this model for creating gen-AI apps, how would I load it? llm = ??? # provide me the syntax for this
@abdallahtoba75685 ай бұрын
what's the difference between task arithmetic and Slerp
@ashgtd6 ай бұрын
it's been 3 months. More please
@amandamate91175 ай бұрын
why did you stop posting videos on this channel?
@Alf-Dee9 ай бұрын
Amazing video! I didn’t know it could be done. I am definitely going to make my own uncensored blended model for coding. I am tired of openai telling me that I should not modify/hack code without owner permission even if I am the owner, and I am trying to test how solid the code is…
@MiguelLopez-mu1ss9 ай бұрын
Thank you for the insights
@nimesh.akalanka7 ай бұрын
How can I fine-tune the LLAMA 3 8B model for free on my local hardware, specifically a ThinkStation P620 Tower Workstation with an AMD Ryzen Threadripper PRO 5945WX processor, 128 GB DDR4 RAM, and two NVIDIA RTX A4000 16GB GPUs in SLI? I am new to this and have prepared a dataset for training. Is this feasible?
@RahulBhalley9 ай бұрын
Thanks! What software are you running for loading and inferencing your merged LLM using localhost in browser?
@maya-akim9 ай бұрын
that's oobabooga's text generation UI. It allows you to run any model, whether it's saved locally, or on huggingface's hub
@RahulBhalley9 ай бұрын
thanks@@maya-akim
@yougaming82177 ай бұрын
Is it possible to merge 7B with 8B models?
@johnefan9 ай бұрын
Great Video👏🏻
@SanctuaryGardenLiving9 ай бұрын
The combination of spending time messing w ai along with your videos are inspiring me to build my own workstation. Not sure if that's smart considering I don't know how to code. So far I have ordered:
- super micro x12dai mobo
- 2 platinum 8352s
- 2 rtx 3090s
- 2 sata 12 tb
- 2 optane nvmes for os and quick retrieval stuff
- 128 gb lrdimm ddr4
- E-ATX case, cables & ps
Do you do any consulting work via zoom? I may need some direction soon.
@ulrichbeutenmuller81019 ай бұрын
thanks, great video!
@noblewarrior47769 ай бұрын
You are amazing… thank you
@inout33949 ай бұрын
LLM: Tokenization vs MAMBA, please make video about this
@BrandonFurtwangler9 ай бұрын
Why does Slerp only support two models? Can’t you just slerp between pairs, then slerp the slerps, etc until you have 1?
@maya-akim9 ай бұрын
yep, you absolutely can slerp the slerps of the previously slerped slerps. That's what a lot of people do.
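As a rough picture of what "slerping the slerps" means, here is a minimal sketch of spherical linear interpolation (SLERP) between two weight vectors, followed by chaining a third one in. It is a toy under stated assumptions, not mergekit's code: the function name `slerp`, the 2-dimensional vectors, and the interpolation factor of 0.5 are all made up for illustration, and the vectors are assumed to be non-zero.

```python
import math

def slerp(t, a, b, eps=1e-8):
    """Spherically interpolate from a (t=0) to b (t=1).

    Assumes a and b are non-zero vectors of the same length.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)  # angle between the two vectors
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel (or opposite): fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

# Three hypothetical "models", each reduced to a 2-dimensional unit vector.
model_1 = [1.0, 0.0]
model_2 = [0.0, 1.0]
model_3 = [-1.0, 0.0]

first_pair = slerp(0.5, model_1, model_2)      # slerp two models
all_three = slerp(0.5, first_pair, model_3)    # slerp the slerp with a third
print(first_pair, all_three)
```

One thing to keep in mind with this kind of chaining is that the result depends on the order of the pairs and on the interpolation factor used at each step, so two people merging the same three models this way can end up with different blends.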
@AacisMistake9 ай бұрын
My only question while watching was: why should I make a model? I figure there is going to be an infinite number of models being created by people, and soon AI models created by AI models. So my question is, what is the point of making a custom model aside from fine-tuning on data? I use autogen. Would creating a model like you're doing empower a local model to, let's say, chat on my data and be good at function calling? Maybe this would be an experimental way to possibly make my own model specifically for autogen? Like, I know someone out there is already working on that specifically, and you even showed those models specifically used for function calling in one of your other vids.
@maya-akim9 ай бұрын
oh that's a great question! here's how I would use it:
1. Find a model that scores highly on the MMLU benchmark (which means that it has diverse knowledge). Blend it with a model that you like because of its "vibe". For me that would be openchat because I like how conversational it is. The blended model would perform better than the two "parent" models.
2. I'm actually working on this one. I'm trying to fine tune one model to specifically be good at crafting youtube titles, and another one to write good youtube scripts. Then, I'll try to blend those two.
@tiberiumihairezus4179 ай бұрын
Great content.
@johntdavies9 ай бұрын
Maya, a great video, thank you. Quick question, where are you based? The reason I ask is I'm looking for an AI speaker in the UK, you came to mind so was just wondering. Again, excellent video, amazing depth.
@maya-akim9 ай бұрын
hey John, thanks a lot for the support! I live in Austin, TX, so I'm afraid I won't be of any help :/
@johntdavies9 ай бұрын
@@maya-akim Damn, that's a long way away! Never mind, keep up the great work and thanks for getting back 🙂
@GuidedBreathing9 ай бұрын
3:40 and now add robots 🤖 cheers🥂
@wiktormigaszewski86842 ай бұрын
cool!
@SinanAkkoyun9 ай бұрын
Bach wtc 1 prelude 21 😍
@oryxchannel7 ай бұрын
Visual Studio for Mac being discontinued August 31, 2024.
@chuchel31569 ай бұрын
Nice video
@Maisonier8 ай бұрын
So it's like mixing colors, back in kindergarten, you'd always blend everything together hoping to create this amazing hue, but it always just ended up this muddy, ugly brown
@LouvoresPauloRicardo9 ай бұрын
Maya.. you are the girl!!!
@bgNinjashows9 ай бұрын
Genius
@siddhubhai25085 ай бұрын
Despite the fact that India has the most talented developers, in terms of AI development I think India is falling behind 😢
@gareththomas32348 ай бұрын
why not just use autogen?
@free_thinker49588 ай бұрын
Autogen is full of crap
@khhamsafar2 ай бұрын
@maya-akim Why are you not uploading new videos these days? Are you alive? 🙄
@zippytechnologies8 ай бұрын
At first - I was excited to see a new video with useful info - but when it got to that crime scene mapping thing you do - well... sorta creepy, no? What is that method called? Conspiracy mapping? Good visuals but wow... I lost track of what was going on with it... maybe it was more of a "Why are you putting holes in your walls?" Some poor guy is gonna be like "...where's the spackle and putty knife? Some tenant/wife/daughter/kid poked a bunch of holes in my wall"... I never understood how so many holes got poked into my daughter's walls or even our living room walls (ahem... the wife) but maybe this is just something that is fun to do? Now, do a video on how to patch all those little holes and get a paint roller with medium nap to repaint and cover everything up - but don't just paint a small area... no.. gonna probably have to paint the whole wall so there's no more streaks and visible coverups.. or at least learn how to feather out the edges so they blend better with the existing paint on the walls.. ugh... can't plug those holes and paint with an ai agent (yet)... so at least some skills are still worth knowing and learning... go get a guy or gal with some handy work skills - mechanical skills or something useful that AI can't do well and never will (likely for a long time) and you at least know your guy/gal will be useful given that AI will be putting lots of other people out of work (and is already doing so). I need to hire some people to help me get this working for our company - but I can't afford to keep paying drywall contractors every time we get a new idea... lol
@yellowboat87738 ай бұрын
Wow, you have too much time on your hands
@iasdfasdfasdfasdf3 ай бұрын
Textbook ADHD comment right here.
@PaulSchwarzer-ou9sw9 ай бұрын
🎉
@Dhirajkumar-ls1ws9 ай бұрын
👍
@ServerGamingTop1009 ай бұрын
It's not about collecting links with information and adding below the video... The important information is: which models are compatible and how to write the configuration file, which you barely mention! I can find all these links myself.
@user-he8qc4mr4i8 ай бұрын
Monolithic LLMs < MultiAgents
@rinokpp16928 ай бұрын
CAN I use agent on my mobile device
@JINGWA647 ай бұрын
problem with making vids that require prior knowledge and experience, is those who would find the information most useful, cannot make use of that information due requiring that prior knowledge and experience, yet at the same time the information provided in the vid is at the level to service a novice who had no prior interest, so who is the audience being catered to?
@NickDoddTV9 ай бұрын
Good soup
@LukasSmith8279 ай бұрын
your timing is scary
@maya-akim9 ай бұрын
what do you mean?
@VegascoinVegas5 ай бұрын
I noticed almost no accent but you say ".sli" instead of ".cli" and that seems to confirm you also speak an Eastern European language. I want to learn Serbian so I think your name is Maja. Just guessing. I wonder if you wanted to use AI to deep fake your voice to produce the exact same English narration in a different language, do you immediately know which model you would train with this textual script?
@florentflote9 ай бұрын
@PazLeBon9 ай бұрын
so a claude haha
@MichaelDomer8 ай бұрын
Too many videos by AI nerds on YouTube... for AI nerds. Hardly anyone makes videos for the average John and Jane, resulting in a large group of people detached from AI.
@maya-akim8 ай бұрын
what types of videos would appeal to average John and Jane?
@MichaelDomer8 ай бұрын
Stop saying it's so simple... yes, for you it is.
@DudeBlox-y8m5 ай бұрын
haha.....true best to avoid that
@usmanthechamp1239 ай бұрын
@maya-akim blending would be the best word for this right, merging I think is the word people are using for it don't you think