You had me 20 seconds into this video - INSTANT FOLLOW! 🔥 MUCH LOVE FROM NEW ORLEANS AND THANK YOU❤
@Seedlinux · 6 months ago
Thanks a lot for this! I was looking for a video that explained this exact topic and you did it in such a simple and efficient way. Kudos!❤
@davidtindell950 · 6 months ago
Thank you. Very useful and very timely!
@elias9298 · a year ago
Thank you very much! After reading the documentation and spending 30 minutes asking GPT-4 (like an idiot) how to do it, I was still confused. It looks like I did something wrong when writing the path. Your video is clear and easy to understand.
@learndatawithmark · a year ago
Glad it helped :)
@boysrcute · 11 months ago
And what was your fix? It's really bothering me.
@NyxesRealms · 8 months ago
@boysrcute Scripts and tools that read the Modelfile may interpret backslashes as escape characters, so Windows paths need either double backslashes (\\) or forward slashes (/). Fix the path in your Modelfile to use forward slashes.
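For illustration, a Windows Modelfile with the path rewritten that way (the filename and location here are made up) would contain:

  FROM C:/Users/you/models/mistral-7b-instruct.Q4_K_M.gguf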
@kelvinli2970 · 8 months ago
How do you do it on Windows?
@AlperYilmaz1 · a year ago
I'm loving Ollama. It was a breeze to run a 7B model locally on my humble laptop. If you can show us how to fine-tune a model locally and then use it with Ollama, that would be awesome.
@learndatawithmark · a year ago
That's on my list of things to figure out! Do you have any particular thing you'd like to fine-tune for?
@AlperYilmaz1 · a year ago
@learndatawithmark I have some data where the input is bullet points of facts and the output is a coherent paragraph built from those bullet points. There are tons of tutorials, but it's chaotic: each one uses a different base model, quantization, prompt template, and output format. So it would be great to create a model which can be run with Ollama. If you are interested, I can share the data.
@namannavneet587 · 10 months ago
@AlperYilmaz1 Hi, I'm having the same problem. Have you found a way to do it?
@introvert-techie · 8 months ago
Thanks for the video!
@ConsultingjoeOnline · 11 months ago
Great video, thank you!
@fra4897 · 11 months ago
What about the prompt template in the Modelfile?
@learndatawithmark · 11 months ago
I didn't bother doing anything with that, but there are a lot more options for refining that than there were at the time I created the video. You can see all the options here - github.com/ollama/ollama/blob/main/docs/modelfile.md
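For anyone curious, a Modelfile with a prompt template looks roughly like this (a sketch based on those docs; the template below is a generic one, not tuned for any particular model):

  FROM ./model.gguf
  TEMPLATE """{{ if .System }}{{ .System }}
  {{ end }}{{ .Prompt }}"""
  PARAMETER temperature 0.7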
@emil8367 · 11 months ago
Thank you for sharing this 🙂 Is it possible to use Ollama's SHA256:... files like GGUF/bin files, e.g. in LM Studio or Autogen, or are these files useless outside Ollama because of the hashing applied to them (if that's what happens)?
@learndatawithmark · 11 months ago
That is a good question and I don't know the answer right now. Need to take a look at the Ollama code to see exactly what those files contain!
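If you want to check for yourself, one thing to try (an assumption on my part, not something verified in the video): Ollama keeps model weights as blobs under ~/.ollama/models/blobs, and GGUF files start with the ASCII magic "GGUF", so you can peek at a blob's first bytes:

  # Prints "GGUF" if the blob is a GGUF layer; <digest> is a placeholder
  head -c 4 ~/.ollama/models/blobs/sha256-<digest>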
@bigglyguy8429 · 6 months ago
Pro tip - use pretty much anything else EXCEPT Ollama, because Ollama demands you only use their version of GGUF files. Every other such software uses normal GGUF you can download yourself. Don't get trapped inside Ollama.
@EcoTekTermika · 5 months ago
You know you can use any model from HF in Ollama, right?
@bigglyguy8429 · 5 months ago
@EcoTekTermika Only if you mess around creating Modelfiles for it and don't mind it being given a SHA hash as a name... which is my point: it's just an HF GGUF, but with extra steps that stop you from using the models for anything else.
@learndatawithmark · 5 months ago
Yeah, it's frustrating - it would be way better if you could use the HF models directly and they had a separate file for any of the metadata they create. The main reason I end up using Ollama a lot of the time is that I can't find a reliable place to get quantised versions of the models. I used to download them from TheBloke, but he stopped doing them around January!
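For anyone following along, the "extra steps" look roughly like this (a minimal sketch; the GGUF filename and model name are hypothetical):

  # Point a Modelfile at a GGUF downloaded from Hugging Face
  echo 'FROM ./mistral-7b-instruct.Q4_K_M.gguf' > Modelfile
  # Import it into Ollama under a name of your choosing, then run it
  ollama create my-mistral -f Modelfile
  ollama run my-mistral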
@GAllium14 · a year ago
Great vid bro, you're very underrated
@learndatawithmark · a year ago
Thanks, appreciate it!
@stanTrX · 9 months ago
Thanks. What about LFS models? (Many models don't have GGUF model files.) Am I skipping something here?
@learndatawithmark · 9 months ago
I'm not sure what an LFS model is. Also, you don't have to use Ollama to run models - there's always Hugging Face's transformers library, which works with all their models too.
@alsoeris · 7 months ago
Is there no way to run a .gguf file that I already have downloaded? If not, I guess I'll have to stick with LM Studio & TG webui.
@learndatawithmark · 7 months ago
Nope, AFAIK you can't run GGUF files directly - you always have to convert them to Ollama's format. Other tools that run GGUF files directly are llama.cpp and llamafile, in case you haven't heard of them!
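With llama.cpp it's roughly this (a sketch; the binary is called main in older builds and llama-cli in newer ones, and the model path is just an example):

  # Run a local GGUF file interactively with llama.cpp
  ./llama-cli -m ./models/mistral-7b-instruct.Q4_K_M.gguf -p "Hello!" -n 128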
@luisEnrique-lj4fq · 5 months ago
Thanks a lot, Ollama!!
@andremota247 · 5 months ago
Wouldn't it be nice to make a model that fuses them all in one master AI?
@DarkTrapStudio · a year ago
I don't understand. I've downloaded Ollama, run the first model, then hit: ollama run dolphin-mixtral:latest. But it was too slow. I didn't understand all the parts you went through - you used Hugging Face, but I just want to install and run the model. I don't know anything about that Poetry/Hugging Face part.
@learndatawithmark · a year ago
If you only want to use one of the models from Ollama's built-in library, you don't need to do any of the stuff in this video - you can do what you said. But keep in mind that dolphin-mixtral is one of the biggest models, so it will be slower than the others. Perhaps try dolphin-mistral to see if that gives better performance.
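i.e. something like:

  # Pull and run the smaller Mistral-based variant instead
  ollama run dolphin-mistral:latest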
@DarkTrapStudio · a year ago
@learndatawithmark I will try to figure out how to do all this, thanks.
@erdagkucukdemirci · 9 months ago
What is the asitop alternative for Linux?
@learndatawithmark · 9 months ago
asitop describes itself as the Apple Silicon take on nvtop - "A Python-based nvtop-inspired command line tool for Apple Silicon (aka M1) Macs" - so perhaps nvtop itself would do the job on Linux?
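On a Debian/Ubuntu-style distro (an assumption about your setup), that would be:

  # Install and launch nvtop for GPU monitoring on Linux
  sudo apt install nvtop
  nvtop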
@DihelsonMendonca · 6 months ago
There must be a newer, easier method by now to run GGUF models in Ollama or in Open WebUI. Please update this method. 🎉❤
@learndatawithmark · 6 months ago
As far as I know, this is still the way to run GGUF models with Ollama. I wish you could use GGUF files directly - it would be so much easier! I haven't used Open WebUI; I'll take a look at that. If you want command line tools that can run GGUF files directly, take a look at llama.cpp or llamafile:
github.com/ggerganov/llama.cpp
github.com/Mozilla-Ocho/llamafile
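e.g. llamafile can serve a local GGUF with a built-in web UI (a sketch - flags and defaults may differ between versions, and the filename is an example):

  ./llamafile -m ./mistral-7b-instruct.Q4_K_M.gguf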
@DihelsonMendonca · 6 months ago
@learndatawithmark Thanks for answering. Indeed, my interest is in running them on Ollama, due to the new Open WebUI, which is the most marvelous thing invented. Open WebUI is a frontend for Ollama that presents an interface like ChatGPT, with conversation history, completely hands-free chats with LLMs (you talk and listen), and you can input your texts, PDFs, and RAG, let LLMs access the internet in real time, upload images, multimodality - it's fantastic. You definitely need to test it. The catch is that it's based on Ollama, whereas I use LM Studio with dozens of Hugging Face models, and I love them. I would like to use those models in Open WebUI, but they're in GGUF format - that's why I found your video, to use GGUF models in Ollama. 🙏👍💥
@GigsTaggart · a year ago
Why all the complication instead of just using curl to get the GGUF from the website you were already on?
@learndatawithmark · a year ago
Good question! I have used cURL a few times, but the instructions suggested using the CLI tool. I haven't actually looked at the code to see whether (or how) it does anything differently from cURL.
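For reference, the direct download looks something like this (the user, repo, and filename are placeholders, not from the video):

  # -L follows Hugging Face's redirect to the CDN; -O keeps the original filename
  curl -L -O https://huggingface.co/<user>/<repo>/resolve/main/<model>.gguf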
@MichaelDomer · 9 months ago
A clear example that shows that most nerds lack the skills and the willingness to provide simple solutions for non-nerds, the majority of us humans. It takes less than a minute to install LM Studio and an AI model and have your code checked, a role played, questions answered, etc., in private. Why do guys like you so often ignore regular users and only focus on fellow nerds?
@learndatawithmark · 9 months ago
Hey - thanks for your feedback. You're right, LM Studio is an easier approach, but I didn't know it supported GGUF until I read your post. This video was also more about showing how you can use Ollama to run models even if the Ollama folks haven't already added them to the library. To be fair, they now seem to add new models so quickly that you rarely hit that situation.
@starflyai · 5 months ago
No offense, but you sound like jacksepticeye.
@learndatawithmark · 5 months ago
I have no idea who that is! Should I be offended?!
@jalapenos12 · 7 months ago
Wtf, Poetry? Noooooo, why?
@learndatawithmark · 7 months ago
What should I use instead?!
@jalapenos12 · 7 months ago
@learndatawithmark Anaconda is my standard, but maybe I should learn Poetry. I think I'm just frustrated that so many tutorials assume familiarity with so many tech stack options.
@sylvercloud7970 · 3 months ago
When I try this, I do get the newly added model listed, but when I run it, it fails with this error:
Error: Post "127.0.0.1:11434/api/generate": EOF
I have this problem with the safetensors-converted-to-GGUF model that I imported into Ollama. Other, much larger models like 34B LLaVA run fine. For the conversion to GGUF I used the ruSauron/to-gguf-bat method on GitHub. Any ideas where this went wrong? Thanks
@learndatawithmark · 3 months ago
Is there anything in the Ollama log file? ~/.ollama/logs/server.log
It might also be worth seeing whether you can use the safetensors model directly - I showed how to do that here: kzbin.info/www/bejne/eoSvqJWle699gZY
Equally, it might just be that the model isn't supported by llama.cpp, which is the underlying library that Ollama uses to run inference on the LLMs.
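For comparison, you could also try llama.cpp's own converter (a sketch; the script was convert-hf-to-gguf.py in older checkouts and convert_hf_to_gguf.py in newer ones, and the model directory is a placeholder):

  # Convert a Hugging Face safetensors checkpoint to GGUF
  python convert_hf_to_gguf.py /path/to/hf-model --outfile model.gguf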
@sylvercloud7970 · 3 months ago
@learndatawithmark Thanks for the response. I did see your other video on using safetensors in Ollama but ran into a roadblock right at the start 😊. I posted a message on that video too with a question about where to find the "template" info, but for some reason the message doesn't appear. The template isn't mentioned on the model page, and I'm at my wits' end trying to find it. I skipped the template info, and needless to say it didn't work.
@sylvercloud7970 · 3 months ago
I just retried posting a message on the other video; hopefully it registers this time. :))
@sylvercloud7970 · 3 months ago
@learndatawithmark I don't see a logs folder in .ollama. All that's in there is: history, id_ed25519, id_ed25519.pub. I'm looking in the root folder. Is it elsewhere? Thanks
@sylvercloud7970 · 2 months ago
I tried another conversion method using llama.cpp, and on the last step this is what I got:
INFO:hf-to-gguf:Loading model: GOT-OCR2_0
ERROR:hf-to-gguf:Model GOTQwenForCausalLM is not supported
You were right - perhaps not every safetensors model can be converted to GGUF.