It is utterly amazing how much of a model can fit in 4GB, together with a multi-platform runtime! Excellent work, Llama people, and thanks, Gary!
@GregRutkowski 11 months ago
Your presentation is the first I have ever watched where the chat worked as stated. Many others present chat apps but fail in the details of model types. For context, I run Windows 10 and Ubuntu MATE, which by the way is the system I'm using to post this comment.
@ContraVsGigi 11 months ago
A real hit would be to have something like this and then give it access to the internet, so it could search and get up-to-date information.
@mandelafoggie9359 8 months ago
Great idea
@jameswhitaker4357 6 months ago
Representation Fine-Tuning (ReFT) would work here; I don't know about the implementation, though. There's probably a GitHub repo out there with this implementation already. Thanks for the lightbulb 💡!
@leledumbo 11 months ago
I just heard about Cosmopolitan in a programming forum. It's quite interesting as a true cross-platform executable format, similar to a fat binary but different.
@edwardglynn3197 11 months ago
Amazingly easy! HP Dragonfly Chromebook in Linux with no issues. It just worked! The first thing that EVER just worked in Linux on a Chromebook.
@ArianeQube 11 months ago
Any advantage to this over just installing LM Studio and running essentially any model from the Hugging Face repository?
@cjacobm 9 months ago
Excellent video! Based on this local installation, is it possible to make API requests, the same way as with ChatGPT?
@GaryExplains 9 months ago
Yes, absolutely. The best option is to use LM Studio; I have a video about that. It has a built-in server.
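A minimal sketch of what such an API request looks like: LM Studio's local server (and llamafile's own built-in server) speaks an OpenAI-compatible HTTP API. This assumes the server is already running locally; the port (by default 1234 for LM Studio, 8080 for llamafile) and the `build_payload`/`ask` helper names are my own assumptions, so adjust to your setup.

```python
import json
from urllib import request

# Assumed address: LM Studio's local server defaults to port 1234
# (llamafile's built-in server uses 8080). Adjust to match your setup.
API_URL = "http://localhost:1234/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {"messages": [{"role": "user", "content": prompt}]}

def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers put the reply in choices[0].message.content
    return body["choices"][0]["message"]["content"]

# Example (requires the local server to be running):
# print(ask("Summarise what a llamafile is in one sentence."))
```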
@cjacobm 9 months ago
@@GaryExplains Thanks! I'll look for the video.
@DadaAustinDad 10 months ago
Hello, is it possible to use Llamafile to "chat" with files (pdf, doc, etc) in a local directory? Thanks.
@BillHawkins0318 9 months ago
I do hope someone answers your question soon
@pbdivyesh 11 months ago
Thanks so much, I will run this on my Windows machine and Apple MacBook! Just waiting for Snapdragon X Elite-based Windows computers too :)
@stevejones4010 8 months ago
How do you make it learn? Is there a learning mode?
@natjes6017 11 months ago
This was very cool! I'm gonna test this on my M1 Pro 😁 Thanks a bunch!
@GaryExplains 11 months ago
Have fun!
@PaulSpades 11 months ago
Well, I've tried it. It's rather stupid (there's not that much you can cram into a 4GB model), but it runs fine and seems to give reasonable responses to anything not too technical or programming-related. What's the source of the training data? I'm not at all familiar with these open-source LLM projects.
@hilltophorrormoviereviews9628 11 months ago
This looks really interesting and I'm gonna try it out. I have two basic questions: 1. Is there such a thing as a context window on locally installed LLMs, and if so, what is this one's? 2. Can you upload a file (txt, etc.) to it that it can read?
@antonydecosta6262 11 months ago
Thanks Gary, just tried it and it works pretty well. Is there a way to train this LLM further?
@hamesparde9888 11 months ago
The training stage is generally much more computationally intensive than just using the model that results from the training.
@skrolreapr 7 months ago
@@hamesparde9888 lol, nice answer mate. The short answer is yes, though.
@r0galik 11 months ago
Interesting, I wonder about running it (natively) on Android - which is Linux under the hood...
@Riggzilla 9 months ago
Is there a guide to run the llamafile with your GPU? I'm using GPU passthrough on a VM.
@FilmFactry 11 months ago
What is the use case? Why use this rather than Bing/Bard/Claude? What can we do where we have more control?
@GaryExplains 11 months ago
As I often say to people, with the highest respect, if you can't see the use case then it isn't for you. This is true of any tech and equally true for me.
@asamirid 11 months ago
Neat, impressive project. Thank you for the instructions and the demo.
@mohammedakbarg6843 11 months ago
Thanks Gary for sharing this with us!!! 😀 Really cool
@BrendonHolt 11 months ago
Well, not fully working for me. The web formatting is embedded in the bot's reply... User: hello Llama: sections=1&replytouser=1&template=hi_there&style=2048&theme=default
@send2gl 11 months ago
Guessing then that with a suitable link it could be incorporated into a web page; further guessing there are better ways of doing that than a 4GB file, though.
@JoelJosephReji 11 months ago
Pretty cool!
@mohammedakbarg6843 11 months ago
Waiting for your next video about Open Interpreter and this one-file local ChatGPT integration 🙂. How come you didn't talk about that possibility (which I think will be happening next) in this video? 🤔
@TechMeldOfficial 11 months ago
I am currently downloading it, yet to test it!
@test40323 11 months ago
Huh, very interesting. What are the limitations offline? On another topic, has anyone applied LLMs to detecting computer viruses? I wonder how well it would work on reading X-ray images to detect cancers?
@GaryExplains 11 months ago
What do you mean by limitations offline, exactly? It functions the same online and offline because it doesn't use any online resources at all.
@test40323 11 months ago
@@GaryExplains Not using online resources is even more interesting, as you can ask it anything.
@GaryExplains 11 months ago
Not quite, that depends on how the model was trained.
@Makkenhoff 11 months ago
Going to give it a try; interested in how it works.
@12polizei24wegvonhier 11 months ago
Easy to install but non-responsive on a Raspberry Pi 5 with the latest Raspberry Pi OS. I probably need a Google GPU to make it more practical, or to be more patient.
@Dhruv_s12 11 months ago
Can anyone please give me simple instructions on how to use this LLM with a GPU?
@GaryExplains 11 months ago
The details are on the repo page. Basically you need Visual Studio installed and the CUDA SDK. If that is beyond your skill set then try LM Studio instead.
@DamianMontero 11 months ago
Tried it on a 2020 Dell XPS 13 running Ubuntu 23.10 (not WSL) with an i7-1185G7 and no Nvidia or AMD GPU. It was easy to run (just chmod and we're done), it replies as fast as ChatGPT (almost instantly), and it seems to work just fine. It was SUPER easy and seems a LOT faster than ollama's default LLM, and much easier to use. I don't mind ollama, and with its API I can do a lot with it. This of course has an API too (that's what the HTML front end hits), so I have to try it on my current project. Amazing find! Thank you @garyexplains
@BillHawkins0318 9 months ago
I have gigantic documents I would love to train this thing on
@iamseyi4real 11 months ago
Can you run it on an Android tablet?
@hanspeter24 11 months ago
Maybe when rooted.
@iamseyi4real 11 months ago
@@hanspeter24 I thought as much; I already have a rooted Android device. Just wondering how to go about it.
@mikebakkeyt 11 months ago
Impressed it works, but oh my, how confidently wrong it is... I gave it a picture of a Vulcan from a 60s airshow and it told me it was a restored WWII aircraft with propellers and that there were 14 people around it. In actual fact there were close to 100 in view, but it gave an absolute and definitive number. I also gave it a picture of Chloë Grace Moretz and it assured me her name was Emma Watson. The issue isn't that incorrect info is provided (I didn't expect it to even try, tbh) but that it is presented as fact. As this train isn't going to be stopped, I just hope it drastically improves how it presents what it does.
@GaryExplains 11 months ago
What you describe is a common complaint about every LLM, not just LLaVA.
@GaryExplains 11 months ago
PS. The nature of LLMs means they have no concept of correct or incorrect. In fact they have no concept of anything, just which letters (tokens) should be output based on the tokens in the input.
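Gary's point, that a language model only predicts likely next tokens, can be illustrated with a deliberately tiny toy "model" (my own sketch, not how LLaVA actually works): it counts which word follows which in a small corpus, then emits whatever tends to follow, with no notion of whether the output is true.

```python
import random

# Toy "training data": the model will only ever know these words.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# "Training": count which token follows which (a bigram table).
bigrams: dict[str, list[str]] = {}
for current, following in zip(corpus, corpus[1:]):
    bigrams.setdefault(current, []).append(following)

def next_token(token: str) -> str:
    """Pick a continuation purely from observed frequency; truth never enters into it."""
    return random.choice(bigrams.get(token, ["<end>"]))

def generate(start: str, length: int = 5) -> str:
    """Chain next-token predictions: the same loop a real LLM runs at vastly larger scale."""
    out = [start]
    for _ in range(length):
        out.append(next_token(out[-1]))
    return " ".join(out)

# generate("the") might produce "the cat sat on the mat", or just as
# confidently "the dog sat on the mat": the model cannot distinguish
# a correct claim from a merely fluent one.
```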
@kiiikoooPT 10 months ago
Does anyone know if it runs on Android with Termux? llama.cpp has a make option for Android, so it would be nice to be able to run this on Android too. I still haven't got into compiling for Android with Android Studio, but the llama.cpp repo says you can, so I wonder if they also included that binary for Android in this file. If anyone knows the answer, please say something; otherwise I will download and test it ;) Thanks in advance.
@GaryExplains 10 months ago
Try Sherpa. I haven't personally, but it is in Google Play. Repo here: github.com/Bip-Rep/sherpa
@GaryExplains 10 months ago
BTW, it took me about 30 seconds to find it by googling "llama.cpp apk".
@kiiikoooPT 9 months ago
@@GaryExplains Thanks for the reply, but it does not work: only bin files, no GGUF support. And I was asking about this one, not another tool where you need to configure and download models anyway. But thanks ;)
@samfisher3336 11 months ago
So we use cloud-based streaming services to listen to music, but we run LLMs locally!
@GaryExplains 11 months ago
Obviously there is a big difference between those two. Everything you type, all the images you upload, and all the data you feed to a cloud-based LLM gets sent from your computer to a company like OpenAI/Google/Microsoft. For music you just pick a song.
@samfisher3336 11 months ago
@@GaryExplains We gave up our privacy the moment we bought a smart device, even one with the most primitive NPU. We're exposed and we have got to live with that.
@GaryExplains 11 months ago
I disagree. I have plenty of data that isn't in the cloud and isn't exposed to the cloud. It is important to control what is in the cloud and what isn't. This is another tool for doing that. There are other reasons for using a local LLM, like research etc, but this was the one I picked due to your comparison with streaming music.
@arsalanganjeh198 8 months ago
Can I use my GPU to run it?
@MatthijsVanDijck 11 months ago
Is this fully offline? Or does it search for information on the Internet?
@GaryExplains 11 months ago
Fully offline.
@J-manoo7 11 months ago
Can I just download this on my Raspberry Pi?
@GaryExplains 11 months ago
Yes
@NeutronStar9 11 months ago
Can anyone suggest how to build an app for Android using AI?
@felipe367 11 months ago
Who are the people behind this LLM? OpenAI?
@xeon2k8 11 months ago
It's written in the GitHub link.
@GaryExplains 11 months ago
No, not OpenAI. This is an open-source LLM, see llava-vl.github.io/
@ContraVsGigi 11 months ago
OpenAI is not "open" in the normal "open source" meaning. It is a company like any other.
@PaulSpades 11 months ago
@@ContraVsGigi Correction: they used to be a non-profit open-source software organization, and then they flipped to a for-profit company. Surprisingly, this has been happening more and more these past years (open-source projects turning commercial), e.g. MySQL, CouchDB, Audacity.