Llamafile: bringing AI to the masses with fast CPU inference: Stephen Hood and Justine Tunney

  Рет қаралды 42,386

AI Engineer

AI Engineer

Күн бұрын

Пікірлер: 77
@Neltharion2k
@Neltharion2k 5 ай бұрын
This is so awesome! Just tried out llava 1.5 7b llamafile and it worked out of the box running on my CPU, without eating all of my RAM! The token generation speed was good enough for me! And my CPU is ~8 years old. Holy cow!
@bigglyguy8429
@bigglyguy8429 5 ай бұрын
Where gguf?
@geomorillo
@geomorillo 4 ай бұрын
where?
@dskbiswas
@dskbiswas 5 ай бұрын
What did I just watch ...mindblowing! Finally someone took the initiative of going against the tide while giving CPUs some attention that they have lost to the GPU madness!
@navodpeiris9054
@navodpeiris9054 5 ай бұрын
loving the llamafile already. this is how i deploy local LLMs now!
@LeftBoot
@LeftBoot 4 ай бұрын
Local for yourself or clients?
@longboarderanonymous5718
@longboarderanonymous5718 5 ай бұрын
These individuals are pioneers of the Personal AI. Efficient, Universal, and Economical.
@aeu126
@aeu126 5 ай бұрын
This was my favorite presentation!
@lolilollolilol7773
@lolilollolilol7773 Ай бұрын
Justine Tunney is a genius. Everything she does is undistinguishable from magic.
@granite_planet
@granite_planet 24 күн бұрын
Looking at her work, I sometimes think I should just quit programming and pick up something like gardening instead. :D
@Viewable11
@Viewable11 4 ай бұрын
Llamafile now supports OpenAI API and non-AVX CPUs. Finally! Having the OpenAI API is a must.
@deadlokIV
@deadlokIV 5 ай бұрын
Justine just shifted the timeline 💥🔀
@indylawi5021
@indylawi5021 5 ай бұрын
This is fantastic! I can't wait to try it out.
@gunnarasmussen207
@gunnarasmussen207 5 ай бұрын
Well, what I'm suppossed to say but: awesome...running local AI on normal consumer hardware without any worries about privacy seemed impossible just months ago. All the computational work in GPT, Gemini and others is done in the cloud on the companies servers. So you don't know, what they are doing with your data. Even if you have nothing to hide - I'm sure erveryone has certain things, he/she wants to stay private...this seems to be the right way of implementing AI in a private manner. And doing such a great afford without any commercial Interests is nothing but mindblowing. Keep up the good work, please!
@tejaslotlikar3573
@tejaslotlikar3573 5 ай бұрын
Now this is called achievement. Meanwhile the so-called "open"AI is looting people. You guys are awesome
@LaHoraMaker
@LaHoraMaker 5 ай бұрын
I really like the idea of a Threadripper configuration but... does anyone have a reference machine configuration for that? I'd like to compare the price to existing alternatives like the dual RTX4090 setup that is mentioned!
@leejacksondev
@leejacksondev 5 ай бұрын
This is utterly brilliant. What a fantastic presentation. Amazing project.
@Alice_Fumo
@Alice_Fumo 4 ай бұрын
well... I just took a look at the repo for the llama 3 70b llamafile repo and found this info about performance: "AMD Threadripper Pro 7995WX ($10k) does a good job too at 5.9 tok/sec eval with Q4_0 (49 tok/sec prompt). With F16 weights the prompt eval goes 65 tok/sec." 70b would be the lower bound for model I would enjoy using, but getting like 6 tokens per second output on a 10k$ CPU... At that point I could just as well build a GPU machine... So, even though I think this is in concept an amazing project, either it or hardware in general has a long way to go still before it is in my opinion usable for an average person such as myself.. (I'm assuming the performance data on the huggingface repo are at least somewhat accurate and not outdated)
@craigscott4205
@craigscott4205 5 ай бұрын
Justine an absolute champion!
@delq
@delq 5 ай бұрын
Awesome, exactly what I have been looking for, no more virtual heavy environments, no more heavy nvidea cuda drivers ! Lets fricking go !!!
@aiforsocialbenefit
@aiforsocialbenefit 5 ай бұрын
Awesome. Great project and presenters!
@raiumair7494
@raiumair7494 5 ай бұрын
Refreshing indeed - tokens per seconds is one measure and I like eval speed but what and how do you measure that?
@FirstNameLastName-fv4eu
@FirstNameLastName-fv4eu 5 ай бұрын
These cloud companies trying their best to keep the valuation high!!! This guy is the new CDO manager!!
@spookymv
@spookymv 4 ай бұрын
it was the first time I had the chance to listen to one of his speeches. bro i like this guy. D:
@dbreardon
@dbreardon 5 ай бұрын
He said,, "Who remembers using the original Netscape Navigator?" ........to that I say, who remembers using the original Mosaic browser? And then telnet before the graphical internet?
@WoodyWilliams
@WoodyWilliams 5 ай бұрын
[raises hand] Doh!
@tinkerman1790
@tinkerman1790 5 ай бұрын
“Who remembers the handshaking tone in dial-up process” 😂
@smthngsmthngsmthngdarkside
@smthngsmthngsmthngdarkside 4 ай бұрын
Who remembers the original smoke signals?
@Atonsha
@Atonsha 4 ай бұрын
How about BTX?
@vncstudio
@vncstudio 4 ай бұрын
We do! and Gopher!
@tollington9414
@tollington9414 5 ай бұрын
Absolutely fascinating and totally genius
@Jason_RA
@Jason_RA 4 ай бұрын
This is absolutely amazing!
@john_blues
@john_blues 4 ай бұрын
Is there a way to get Windows to run llamafiles bigger than 4Gb? Without being able to do that, it is very limiting in the models you can run.
@rayhere7925
@rayhere7925 5 ай бұрын
This is a game-changing breakthrough. Can't underplay this any other way.
@eggmaster88
@eggmaster88 5 ай бұрын
Awesome work!
@GandalfTheBrown117
@GandalfTheBrown117 4 ай бұрын
Justine is a GOAT
@NeXTOoOoOoO
@NeXTOoOoOoO 5 ай бұрын
Wow! Really great work!
@RomuloMagalhaesAutoTOPO
@RomuloMagalhaesAutoTOPO 4 ай бұрын
Amazing. Thank you.
@OranCollins
@OranCollins 4 ай бұрын
omg i love Justine Tunney! they are amazing!
@CaptainSpoonsAlot
@CaptainSpoonsAlot 4 ай бұрын
this is just fantastic.
@Godkidz7
@Godkidz7 5 ай бұрын
Freedom and Justices are more expensive than Money and Power. No one live and rule forever. Respects and Salute to you guys...
@KevinKreger
@KevinKreger 4 ай бұрын
Amazing❤
@XEQUTE
@XEQUTE 5 ай бұрын
Love it!!
@masbuba
@masbuba 5 ай бұрын
Oh shit, CPU prices is going to hike
@johnkost2514
@johnkost2514 5 ай бұрын
This is better than the Nvidia NIM solution (which is just containerization). Way better ..
@Charles-Darwin
@Charles-Darwin 5 ай бұрын
Awesomesauce
@ShieldsWebDesign
@ShieldsWebDesign 4 ай бұрын
Why is no one talking about this?
@GandalfTheBrown117
@GandalfTheBrown117 4 ай бұрын
Tired -> wired around @9:30 😂
@romanbauer
@romanbauer 5 ай бұрын
👏🏻👏🏻👏🏻
@philly_eddie
@philly_eddie 5 ай бұрын
very cool
@cholst1
@cholst1 4 ай бұрын
*checking on RAM prices*
@timchapman8539
@timchapman8539 5 ай бұрын
I need an AI that can access the files on my hard drive. Does anyone have a suggestion? I don't want to upload them to the AI. I want the AI to access them directly.
@bigglyguy8429
@bigglyguy8429 5 ай бұрын
ChatGPT4all has RAG
@constantinegeist1854
@constantinegeist1854 4 ай бұрын
All of this was already possible before... Already back in early 2023. What they did was just save you 15 minutes (otherwise you'd have to download an inference program and weights separately)
@7T7Soulz
@7T7Soulz 5 ай бұрын
this is future
@omercelebi2012
@omercelebi2012 4 ай бұрын
What about quality trade-off? Did they mention about that?
@JohnnysaidWhat
@JohnnysaidWhat 5 ай бұрын
this guy is a fkn rockstar on stage I was totally blown away 🎉
@erb34
@erb34 4 ай бұрын
Don't forget the browser.
@ravishmahajan9314
@ravishmahajan9314 4 ай бұрын
NVIDIA has hired CIA agents to make sure this technology is not reaching in hands of public. Be safe sir !😝
@hope42
@hope42 4 ай бұрын
Am I the only one that someone AI generated Matt Perry?
@fkxfkx
@fkxfkx 5 ай бұрын
well this feels like something out of left field.🤷‍♂️ Seems too good to be true. What are the catches?
@projectsspecial9224
@projectsspecial9224 5 ай бұрын
As an AI Design Engineer and developer of original works in Unified Language Models (predecessor to LLMs) for over 20 years, this compact framework, GPU or custom hardware independence, and resource efficient methodology is the correct approach. 😊
@fkxfkx
@fkxfkx 5 ай бұрын
⁠”a” correct approach but maybe not “the” correct approach. It’s not clear what downsides there are yet.
@bigglyguy8429
@bigglyguy8429 5 ай бұрын
@@fkxfkx I'm not sure how you're supposed to run it? GGUF I can run but what the heck is the 14GB "llamafile" thing?
@maxd3946
@maxd3946 5 ай бұрын
@@bigglyguy8429 actually, you don't need a 14GB llamafile. It's even unable to be run on windows (4GB max executable size limit). You can keep a llamafile without embedding any model in it and call it with the -m parameter to specify the model file to load.
@snow8725
@snow8725 5 ай бұрын
Fuck yeah!!!
@pandoraeeris7860
@pandoraeeris7860 5 ай бұрын
The Singularity is here.
@TalsBadKidney
@TalsBadKidney 5 ай бұрын
let's go to the gym
@JimAmos
@JimAmos 5 ай бұрын
Hats off for the engineering feat. But in terms of application, we are still just talking about text summarization. And the image generation in your own demo was just as disappointing as ever. There's no killer app for LLMs yet even though we keep throwing money and science at it. What are we even doing?
@bobtarmac1828
@bobtarmac1828 4 ай бұрын
Free candy, I mean, Free open source Ai for everyone. It’s a like a trick. Don’t fall for it. Cease Ai.
@WenRolland
@WenRolland 4 ай бұрын
Great work!
Official PyTorch Documentary: Powering the AI Revolution
35:53
The moment we stopped understanding AI [AlexNet]
17:38
Welch Labs
Рет қаралды 1,4 МЛН
When you have a very capricious child 😂😘👍
00:16
Like Asiya
Рет қаралды 18 МЛН
Beat Ronaldo, Win $1,000,000
22:45
MrBeast
Рет қаралды 158 МЛН
host ALL your AI locally
24:20
NetworkChuck
Рет қаралды 1,4 МЛН
The Future of Knowledge Assistants: Jerry Liu
16:55
AI Engineer
Рет қаралды 124 М.
Andrew Ng On AI Agentic Workflows And Their Potential For Driving AI Progress
30:54
Why AI Is Tech's Latest Hoax
38:26
Modern MBA
Рет қаралды 790 М.
Run ALL Your AI Locally in Minutes (LLMs, RAG, and more)
20:19
Cole Medin
Рет қаралды 297 М.
AI, Machine Learning, Deep Learning and Generative AI Explained
10:01
IBM Technology
Рет қаралды 743 М.
How I Made AI Assistants Do My Work For Me: CrewAI
19:21
Maya Akim
Рет қаралды 920 М.
When you have a very capricious child 😂😘👍
00:16
Like Asiya
Рет қаралды 18 МЛН