Local LLM Fine-tuning on Mac (M1 16GB)

  22,432 views

Shaw Talebi

1 day ago

Comments: 82
@ShawhinTalebi 5 months ago
Really excited to finally get this working! I know many people had asked for it. What should I cover next?
@rdegraci 5 months ago
I was able to test your Jupyter Lab notebook, and it generates the `adapters.npz` file and everything works! But how do I create a new model that has the `adapters.npz` embedded inside of it? I am running an Ollama server; how would we load it with this newly fine-tuned model? We're using proprietary data, so everything has to remain local to my machine and can't be uploaded to the Internet.
@AlissonSantos-qw6db 5 months ago
Please talk about the MLOps life cycle and how to implement it.
@ShawhinTalebi 5 months ago
@@rdegraci Great question! The original mlx-example repo shows how to do this: github.com/ml-explore/mlx-examples/tree/main/lora#fuse-and-upload
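For anyone following along, the fuse step in that README comes down to a single command. As a sketch, here is that command assembled in Python so each piece is visible; the model name, adapter file, and output path are illustrative assumptions, not values from the video:

```python
import shlex

# Fuse LoRA adapters into the base model, per the mlx-examples LoRA
# README ("Fuse and upload"). Paths and model name are assumptions.
cmd = [
    "python", "fuse.py",
    "--model", "mistralai/Mistral-7B-Instruct-v0.2",
    "--adapter-file", "adapters.npz",
    "--save-path", "fused_model",
]
print(shlex.join(cmd))
# prints: python fuse.py --model mistralai/Mistral-7B-Instruct-v0.2 --adapter-file adapters.npz --save-path fused_model
```

The fused weights land in the `--save-path` directory, so serving the model fully locally then just means pointing your runtime at that folder instead of a Hugging Face model ID.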
@ShawhinTalebi 5 months ago
@@AlissonSantos-qw6db Great suggestion. While I may not be the best source for MLOps, I can definitely include more details around the implementation of specific use cases.
@aimattant 1 day ago
Perfect. Looking forward to following your example to train the Phi-3 model, now that I've come across MLX.
@azadehbayani9454 3 months ago
Wow, that was incredibly precise and helpful! Thank you, and keep up the fantastic work with your videos!
@JunYamog 5 months ago
Thanks! I have been using Unsloth remotely for fine-tuning. Once the cloud bills start coming in, I'm hoping to convince my boss that a MacBook Pro can be an option. My MLX resources are still just open tabs; glad to see someone doing it as well.
@ifycadeau 5 months ago
Didn't know you could do this on Mac! Amazing, thank you!
@ShawhinTalebi 5 months ago
🔥🔥
@eda-un8zr 3 months ago
I binge-watched your videos; high-quality, great content. Thank you so much, please keep it up!
@ShawhinTalebi 3 months ago
Thanks for watching :)
@kaldirYT 5 months ago
An easy-to-watch video with a great explanation 👍🏽
@ShawhinTalebi 5 months ago
Glad you liked it!
@pawel30w 5 months ago
Thanks, great content! I really like the calm way you explain it all 👌
@chetanpun3937 5 months ago
I was waiting for this video. Thank you so much.
@LucaZappa 2 months ago
Great tutorial, thanks. One question: I didn't understand where the fine-tuned model is on my Mac, and is it possible to run the model in Ollama?
@ShawhinTalebi 2 months ago
A folder with the base model should be created after training. Additionally, an adapters.npz file should appear, which contains the adapters learned via LoRA. For running MLX models with Ollama, this video seems helpful: kzbin.info/www/bejne/aYa0aHqPbs2Brc0
@AbidSaudagar 5 months ago
Amazing video. Thanks for sharing such valuable content.
@ShawhinTalebi 5 months ago
Thanks Abid! I've been waiting 7 months for another video from you 😜
@AbidSaudagar 5 months ago
@@ShawhinTalebi coming soon 😄
@PhilippeDiollot 2 months ago
Nice Telecaster!
@lorenzoplaatjies8971 5 months ago
Love the video, thank you for these concise tutorials! On initial inference, before moving on to fine-tuning, I can't get the generation step to produce any tokens.
@ShawhinTalebi 5 months ago
Glad you like them :) Not sure what could be going wrong. Were you able to successfully install mlx_lm?
@lorenzoplaatjies8971 5 months ago
@@ShawhinTalebi I appreciate you responding; I was able to figure it out! Thank you again for the video.
@inishkohli273 5 months ago
Yes YES YES
@ShawhinTalebi 5 months ago
Happy to help :)
@ISK_VAGR 5 months ago
Really cool and helpful, thank you very much. Have you performed fine-tuning on Llama 3.1 models successfully with this method?
@ShawhinTalebi 5 months ago
I have not, but it should be as easy as replacing "mlx-community/Mistral-7B-Instruct-v0.2-4bit" with "mlx-community/Meta-Llama-3.1-8B-Instruct-4bit" in the example code.
@club4796 7 days ago
Do GPU cores in Apple chips matter for fine-tuning, or local AI in general?
@ShawhinTalebi 6 days ago
Yes, more GPU cores enable faster training and inference :)
@LIMLIMLIM111 2 months ago
Thank you, you are awesome
@futurerealmstech 3 months ago
There are some rumors going around that 16GB should now be the standard memory configuration offered on the new Mac Mini. Any chance that when the M4 Mac Mini launches you can do a video on that as well?
@ShawhinTalebi 3 months ago
Great suggestion! I hadn't heard that rumor, but it makes sense. I might be switching to a single (beefy) MacBook Pro; I could do a breakdown of how I use it for ML projects if there's interest :)
@PutuDevDiaries 14 days ago
What's the next step after fine-tuning an LLM to use it with Ollama?
@ShawhinTalebi 13 days ago
I haven't done this, but this video looks helpful: kzbin.info/www/bejne/aYa0aHqPbs2Brc0si=YwO_q7IWbNZcL3L1
@AGI-Bingo 4 months ago
What can I expect to achieve on an M3 Pro with 64GB?
@ShawhinTalebi 4 months ago
You could likely run full fine-tuning on some smaller models.
@ArkaSuryawan 20 days ago
I got an error for `pip install mlx_lm`. The note says the error originates from a subprocess and is likely not a problem with pip: `ERROR: Failed building wheel for sentencepiece` / `ERROR: Failed to build installable wheels for some pyproject.toml based projects (sentencepiece)`. What do I have to do?
@ShawhinTalebi 20 days ago
I haven't seen that one before. Maybe try installing sentencepiece on its own.
@ArkaSuryawan 20 days ago
@@ShawhinTalebi I'm using Python 3.13; that's the problem. I saw on Stack Overflow to downgrade Python to 3.11. Running `pip install sentencepiece` on its own gives the same note: the error originates from a subprocess, `ERROR: Failed building wheel for sentencepiece`.
@ShekharSuman271991 2 months ago
Took me a moment to find this: `parser.add_argument("--data", type=str, default="data/", help="Directory with {train, valid, test}.jsonl files")`. Worth mentioning that the data files are picked up from `data/` by default.
@ShawhinTalebi 2 months ago
Thanks for calling this out!
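To make that default concrete, here is a minimal sketch of what the script expects on disk, assuming the one-JSON-object-per-line format with a "text" field used by the mlx-examples LoRA data; the example rows below are made up:

```python
import json
from pathlib import Path

# The LoRA script looks for {train, valid, test}.jsonl under data/
# by default; each line is a JSON object with a single "text" field.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

splits = {
    "train": ["Q: What is MLX? A: Apple's array framework for Apple Silicon."],
    "valid": ["Q: What is LoRA? A: A parameter-efficient fine-tuning method."],
    "test":  ["Q: What is QLoRA? A: LoRA applied to a quantized base model."],
}
for split, rows in splits.items():
    with open(data_dir / f"{split}.jsonl", "w") as f:
        for text in rows:
            f.write(json.dumps({"text": text}) + "\n")
```

With that layout in place, the training command needs no `--data` flag at all; pass `--data some/other/dir` only if your files live elsewhere.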
@absar66 5 months ago
Thanks for the great video. Based on your varied experience, can you make a separate video on data-preparation techniques/methods for fine-tuning-related tasks on open-source models? Hope to get a response from Shaw-human rather than Shaw-GPT (just kidding) 😅
@ShawhinTalebi 5 months ago
Great suggestion! There can be a lot of art in data prep (especially in this context). Added it to the list.
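One concrete data-prep pattern, shown here as an illustration rather than the video's exact template: pack each instruction/response pair into a single training string using the base model's chat format. The Mistral-style `[INST]` tags below are one convention; other base models expect different special tokens:

```python
# Pack an instruction/response pair into one training string.
# The [INST] template is Mistral's convention and is an assumption
# here; check your base model's expected chat format.
def to_training_text(instruction: str, response: str) -> str:
    return f"<s>[INST] {instruction} [/INST] {response}</s>"

example = to_training_text(
    "What does LoRA stand for?",
    "Low-Rank Adaptation.",
)
print(example)
# prints: <s>[INST] What does LoRA stand for? [/INST] Low-Rank Adaptation.</s>
```

Matching the template the base model was trained with matters a lot in practice; a mismatched format is a common cause of fine-tunes that "work" but respond poorly.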
@PutuDevDiaries 18 days ago
When I run the import `from mlx_lm import load, generate`, I get: `ModuleNotFoundError: No module named 'mlx_lm'`.
@ShawhinTalebi 13 days ago
Did you install the requirements from here?: github.com/ShawhinT/KZbin-Blog/blob/main/LLMs/qlora-mlx/requirements.txt
@imismailhan 1 month ago
How about on Linux?
@dharavivek 5 months ago
Can I do it on a Mac M1 with 8GB of RAM?
@ShawhinTalebi 4 months ago
Might be worth a shot. You can try reducing batch size to 1 or 2 if you run into memory issues.
@DuxBarbosa 4 months ago
Did it work?
@ildaryakupov903 1 month ago
Why is the output of this fine-tuned model worse than the one you trained on Colab?
@ShawhinTalebi 1 month ago
That's a big question. We'd need to do a more robust eval to first determine whether one is actually better than the other. If significant differences do exist, it's probably due to the details of how the fine-tuning was done.
@camperinCod43 3 months ago
Any advice or guidance on how I could deploy this model so that I can use it as a Telegram bot? I've been able to plug it into Telegram's API and get the bot up and running (locally on my Mac), but I don't want to keep my Mac on just to run the bot! Cheers, thanks for the video!
@ShawhinTalebi 3 months ago
Good question! Two options come to mind: 1) buy a Mac to serve your app, or 2) rent an M-series Mac from a cloud provider, e.g. www.scaleway.com/en/hello-m1/
@daan3298 5 months ago
Can I capture video and audio all day, with a camera on my shoulder, and fine-tune a model with the data every night?
@ShawhinTalebi 5 months ago
Sounds like an interesting use case! This is definitely possible. Potential challenges I see are: 1) handling that much video data, and 2) figuring out how to pass that data into the model (e.g. you could use a multi-modal model or find an effective way to translate it into text).
@daan3298 5 months ago
@@ShawhinTalebi Some steps in between to filter the input for usability could be handy. Maybe SAM?
@ShawhinTalebi 5 months ago
@@daan3298 Without knowing any details, I can imagine that being helpful. Segment with SAM, then run object detection with another model.
@tuncaydemirtepe7978 5 months ago
What if you have an Apple M2 Max with 96GB of memory? Does that mean there is technically a 96GB-memory GPU?
@ShawhinTalebi 5 months ago
Good question. With the M-series chips there's no separate CPU vs. GPU memory; it's unified. The important thing here is that using MLX allows you to make full use of your 96GB when training models!
@tuncaydemirtepe7978 5 months ago
I'll give it a try.
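The memory questions in this thread (96GB, 16GB, 8GB) come down to simple arithmetic: model weights dominate, and quantization shrinks them linearly. A back-of-the-envelope sketch for a 7B-parameter model, counting weights only (activations, the KV cache, and LoRA optimizer state add overhead on top):

```python
# Back-of-the-envelope weight memory for a 7B-parameter model
# at different precisions (weights only, no runtime overhead).
params = 7e9

for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit")]:
    gb = params * bits / 8 / 1e9  # bits -> bytes -> GB
    print(f"{label}: ~{gb:.1f} GB")
# prints:
# fp16: ~14.0 GB
# int8: ~7.0 GB
# 4-bit: ~3.5 GB
```

This is why a 4-bit 7B model trains comfortably on a 16GB machine in unified memory, why 8GB is tight but sometimes survivable with a small batch size, and why 96GB opens up much larger models.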
@saanvibehele8185 4 months ago
Will this run on 8GB of memory?
@ShawhinTalebi 4 months ago
It might be worth a try. You can also reduce the batch size if you run into memory issues.
@saanvibehele8185 4 months ago
@@ShawhinTalebi I am running it now; 20 epochs have run successfully so far.
@ShawhinTalebi 4 months ago
@@saanvibehele8185 Awesome!
@acaudio7545 4 months ago
I've been playing around with this, trying to see how it would respond if I made horrible comments about your content; I managed to get one slightly angry response 😁. But on a serious note, I love the work, and I'm a big fan of the channel now!
@ShawhinTalebi 4 months ago
LOL I wonder what that entailed
@dogmediasolutions 1 month ago
What about `pip install mlx-lm`? Wouldn't that use your Apple Silicon's GPU (and use MLX versions of LLMs in LM Studio)?
@ShawhinTalebi 1 month ago
That should work too! I also believe Ollama uses the M-series chips effectively.
@jpcam4781 5 months ago
Has anyone tried this on a 3.8GHz 8-core Intel Core i7 chip?
@ShawhinTalebi 5 months ago
MLX is specifically made for M-series chips. This example won't work with an i7.
@livebernd 5 months ago
How about fine-tuning with an Intel processor on a Mac?
@ShawhinTalebi 5 months ago
MLX won't help, but if you have a graphics card there may be tools out there that can. I just haven't done that before.
@App-Generator-PRO 5 months ago
8GB RAM RIP :(
@ShawhinTalebi 5 months ago
LOL this still might be worth a try! If you run into memory issues you can reduce the batch size to 1 or 2. Curious to hear how it goes :)
@App-Generator-PRO 5 months ago
@@ShawhinTalebi Risking wrecking my only device is totally worth it.
@ShawhinTalebi 5 months ago
@@App-Generator-PRO lol
@yogeshwarcm 19 days ago
Every time bro says "only 16GB of memory"... me with 8 :(
@ShawhinTalebi 19 days ago
LOL I’m sorry! You can still try with a smaller batch size ;)