Local LLM Fine-tuning on Mac (M1 16GB)

  22,432 views

Shaw Talebi

1 day ago

Comments: 82
@ShawhinTalebi 5 months ago
Really excited to finally get this working! I know many people had asked for it. What should I cover next?
@rdegraci 5 months ago
I was able to test your Jupyter Lab notebook, and it generates the `adapters.npz` file and everything works! But how do I create a new model that has the `adapters.npz` embedded inside of it? I am running an Ollama server; how would we load it with this newly fine-tuned model? We're using proprietary data, so everything has to remain local to my machine and can't be uploaded to the Internet.
@AlissonSantos-qw6db 5 months ago
Please talk about the MLOps life cycle and how to implement it.
@ShawhinTalebi 5 months ago
@@rdegraci Great question! The original mlx-example repo shows how to do this: github.com/ml-explore/mlx-examples/tree/main/lora#fuse-and-upload
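For anyone following along, the fuse step in that README comes down to a single command. As a sketch, here is that command assembled in Python so each piece is visible; the model name, adapter file, and output path are illustrative assumptions, not values from the video:

```python
import shlex

# Fuse LoRA adapters into the base model, per the mlx-examples LoRA
# README ("Fuse and upload"). Paths and model name are assumptions.
cmd = [
    "python", "fuse.py",
    "--model", "mistralai/Mistral-7B-Instruct-v0.2",
    "--adapter-file", "adapters.npz",
    "--save-path", "fused_model",
]
print(shlex.join(cmd))
# prints: python fuse.py --model mistralai/Mistral-7B-Instruct-v0.2 --adapter-file adapters.npz --save-path fused_model
```

The fused weights land in the `--save-path` directory, so serving the model fully locally then just means pointing your runtime at that folder instead of a Hugging Face model ID.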
@ShawhinTalebi 5 months ago
@@AlissonSantos-qw6db Great suggestion. While I may not be the best source for MLOps, I can definitely include more details around the implementation of specific use cases.
@aimattant 1 day ago
Perfect. Looking forward to following your example to train the Phi-3 model, now that I've come across MLX.
@azadehbayani9454 3 months ago
Wow, that was incredibly precise and helpful! Thank you, and keep up the fantastic work with your videos!
@JunYamog 5 months ago
Thanks! I have been using Unsloth remotely for fine-tuning. Once the cloud bills start coming in, I'm hoping to convince my boss that a MacBook Pro can be an option. My MLX resources are still just open tabs; glad to see someone doing it as well.
@ifycadeau 5 months ago
Didn't know you could do this on Mac! Amazing, thank you!
@ShawhinTalebi 5 months ago
🔥🔥
@eda-un8zr 3 months ago
I binge-watched your videos; high-quality, great content. Thank you so much, please keep it up!
@ShawhinTalebi 3 months ago
Thanks for watching :)
@kaldirYT 5 months ago
An easy-to-watch video with a great explanation 👍🏽
@ShawhinTalebi 5 months ago
Glad you liked it!
@pawel30w 5 months ago
Thanks, great content! I really like the calm way you explain it all 👌
@chetanpun3937 5 months ago
I was waiting for this video. Thank you so much.
@LucaZappa 2 months ago
Great tutorial, thanks. One question: I didn't understand where the fine-tuned model is on my Mac, and is it possible to run the model in Ollama?
@ShawhinTalebi 2 months ago
A folder with the base model should be created after training. Additionally, an adapters.npz file should appear, which contains the adapters learned via LoRA. For running MLX models with Ollama, this video seems helpful: kzbin.info/www/bejne/aYa0aHqPbs2Brc0
@AbidSaudagar 5 months ago
Amazing video. Thanks for sharing such valuable content.
@ShawhinTalebi 5 months ago
Thanks Abid! I've been waiting 7 months for another video from you 😜
@AbidSaudagar 5 months ago
@@ShawhinTalebi coming soon 😄
@PhilippeDiollot 2 months ago
Nice Telecaster!
@lorenzoplaatjies8971 5 months ago
Love the video, thank you for these concise tutorials! On initial inference, before moving on to fine-tuning, I can't get the generation step to produce any tokens.
@ShawhinTalebi 5 months ago
Glad you like them :) Not sure what could be going wrong. Were you able to successfully install mlx_lm?
@lorenzoplaatjies8971 5 months ago
@@ShawhinTalebi I appreciate you responding; I was able to figure it out! Thank you again for the video.
@inishkohli273 5 months ago
Yes YES YES
@ShawhinTalebi 5 months ago
Happy to help :)
@ISK_VAGR 5 months ago
Really cool and helpful, thank you very much. Have you performed fine-tuning on Llama 3.1 models successfully with this method?
@ShawhinTalebi 5 months ago
I have not, but it should be as easy as replacing "mlx-community/Mistral-7B-Instruct-v0.2-4bit" with "mlx-community/Meta-Llama-3.1-8B-Instruct-4bit" in the example code.
@club4796 7 days ago
Do GPU cores in Apple chips matter for fine-tuning, or local AI in general?
@ShawhinTalebi 6 days ago
Yes, more GPU cores enable faster training and inference :)
@LIMLIMLIM111 2 months ago
Thank you, you are awesome
@futurerealmstech 3 months ago
There are some rumors going around that 16GB should now be the standard memory configuration offered on the new Mac Mini. Any chance that when the M4 Mac Mini launches you can do a video on that as well?
@ShawhinTalebi 3 months ago
Great suggestion! I hadn't heard that rumor, but it makes sense. I might be switching to a single (beefy) MacBook Pro; I could do a breakdown of how I use it for ML projects if there's interest :)
@PutuDevDiaries 14 days ago
What's the next step after fine-tuning an LLM to use it with Ollama?
@ShawhinTalebi 13 days ago
I haven't done this, but this video looks helpful: kzbin.info/www/bejne/aYa0aHqPbs2Brc0si=YwO_q7IWbNZcL3L1
@AGI-Bingo 4 months ago
What can I expect to achieve on an M3 Pro with 64GB?
@ShawhinTalebi 4 months ago
You could likely run full fine-tuning on some smaller models.
@ArkaSuryawan 20 days ago
I got an error for `pip install mlx_lm`. The note says the error originates from a subprocess and is likely not a problem with pip: `ERROR: Failed building wheel for sentencepiece` / `ERROR: Failed to build installable wheels for some pyproject.toml based projects (sentencepiece)`. What do I have to do?
@ShawhinTalebi 20 days ago
I haven't seen that one before. Maybe try installing sentencepiece on its own.
@ArkaSuryawan 20 days ago
@@ShawhinTalebi I'm using Python 3.13; that's the problem. I saw on Stack Overflow to downgrade Python to 3.11. Running `pip install sentencepiece` on its own gives the same note: the error originates from a subprocess, `ERROR: Failed building wheel for sentencepiece`.
@ShekharSuman271991 2 months ago
Took me a moment to find this: `parser.add_argument("--data", type=str, default="data/", help="Directory with {train, valid, test}.jsonl files")`. Worth mentioning that the data files are picked up from `data/` by default.
@ShawhinTalebi 2 months ago
Thanks for calling this out!
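To make that default concrete, here is a minimal sketch of what the script expects on disk, assuming the one-JSON-object-per-line format with a "text" field used by the mlx-examples LoRA data; the example rows below are made up:

```python
import json
from pathlib import Path

# The LoRA script looks for {train, valid, test}.jsonl under data/
# by default; each line is a JSON object with a single "text" field.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

splits = {
    "train": ["Q: What is MLX? A: Apple's array framework for Apple Silicon."],
    "valid": ["Q: What is LoRA? A: A parameter-efficient fine-tuning method."],
    "test":  ["Q: What is QLoRA? A: LoRA applied to a quantized base model."],
}
for split, rows in splits.items():
    with open(data_dir / f"{split}.jsonl", "w") as f:
        for text in rows:
            f.write(json.dumps({"text": text}) + "\n")
```

With that layout in place, the training command needs no `--data` flag at all; pass `--data some/other/dir` only if your files live elsewhere.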
@absar66 5 months ago
Thanks for the great video. Based on your varied experience, can you make a separate video on data-preparation techniques/methods for fine-tuning-related tasks on open-source models? Hope to get a response from Shaw-human rather than Shaw-GPT (just kidding) 😅
@ShawhinTalebi 5 months ago
Great suggestion! There can be a lot of art in data prep (especially in this context). Added it to the list.
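One concrete data-prep pattern, shown here as an illustration rather than the video's exact template: pack each instruction/response pair into a single training string using the base model's chat format. The Mistral-style `[INST]` tags below are one convention; other base models expect different special tokens:

```python
# Pack an instruction/response pair into one training string.
# The [INST] template is Mistral's convention and is an assumption
# here; check your base model's expected chat format.
def to_training_text(instruction: str, response: str) -> str:
    return f"<s>[INST] {instruction} [/INST] {response}</s>"

example = to_training_text(
    "What does LoRA stand for?",
    "Low-Rank Adaptation.",
)
print(example)
# prints: <s>[INST] What does LoRA stand for? [/INST] Low-Rank Adaptation.</s>
```

Matching the template the base model was trained with matters a lot in practice; a mismatched format is a common cause of fine-tunes that "work" but respond poorly.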
@PutuDevDiaries 18 days ago
When I run the import `from mlx_lm import load, generate`, I get: `ModuleNotFoundError: No module named 'mlx_lm'`.
@ShawhinTalebi 13 days ago
Did you install the requirements from here?: github.com/ShawhinT/KZbin-Blog/blob/main/LLMs/qlora-mlx/requirements.txt
@imismailhan 1 month ago
How about on Linux?
@dharavivek 5 months ago
Can I do it on a Mac M1 with 8GB of RAM?
@ShawhinTalebi 4 months ago
Might be worth a shot. You can try reducing batch size to 1 or 2 if you run into memory issues.
@DuxBarbosa 4 months ago
Did it work?
@ildaryakupov903 1 month ago
Why is the output of this fine-tuned model worse than the one you trained on Colab?
@ShawhinTalebi 1 month ago
That's a big question. We'd need to do a more robust eval to first determine whether one is actually better than the other. If significant differences do exist, it's probably due to the details of how the fine-tuning was done.
@camperinCod43 3 months ago
Any advice or guidance on how I could deploy this model so that I can use it as a Telegram bot? I've been able to plug it into Telegram's API and get the bot up and running (locally on my Mac), but I don't want to keep my Mac on just to run the bot! Cheers, thanks for the video!
@ShawhinTalebi 3 months ago
Good question! Two options come to mind: 1) buy a Mac to serve your app, or 2) rent an M-series Mac from a cloud provider, e.g. www.scaleway.com/en/hello-m1/
@daan3298 5 months ago
Can I capture video and audio all day, with a camera on my shoulder, and fine-tune a model with the data every night?
@ShawhinTalebi 5 months ago
Sounds like an interesting use case! This is definitely possible. Potential challenges I see are: 1) handling that much video data, and 2) figuring out how to pass that data into the model (e.g. you could use a multi-modal model or find an effective way to translate it into text).
@daan3298 5 months ago
@@ShawhinTalebi Some steps in between to filter the input for usability could be handy. Maybe SAM?
@ShawhinTalebi 5 months ago
@@daan3298 Without knowing any details, I can imagine that being helpful. Segment with SAM, then run object detection with another model.
@tuncaydemirtepe7978 5 months ago
What if you have an Apple M2 Max with 96GB of memory? Does that mean there is technically a 96GB-memory GPU?
@ShawhinTalebi 5 months ago
Good question. With the M-series chips there's no separate CPU vs. GPU memory; it's unified. The important thing here is that using MLX allows you to make full use of your 96GB when training models!
@tuncaydemirtepe7978 5 months ago
I'll give it a try.
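The memory questions in this thread (96GB, 16GB, 8GB) come down to simple arithmetic: model weights dominate, and quantization shrinks them linearly. A back-of-the-envelope sketch for a 7B-parameter model, counting weights only (activations, the KV cache, and LoRA optimizer state add overhead on top):

```python
# Back-of-the-envelope weight memory for a 7B-parameter model
# at different precisions (weights only, no runtime overhead).
params = 7e9

for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit")]:
    gb = params * bits / 8 / 1e9  # bits -> bytes -> GB
    print(f"{label}: ~{gb:.1f} GB")
# prints:
# fp16: ~14.0 GB
# int8: ~7.0 GB
# 4-bit: ~3.5 GB
```

This is why a 4-bit 7B model trains comfortably on a 16GB machine in unified memory, why 8GB is tight but sometimes survivable with a small batch size, and why 96GB opens up much larger models.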
@saanvibehele8185 4 months ago
Will this run on 8GB of memory?
@ShawhinTalebi 4 months ago
It might be worth a try. You can also reduce the batch size if you run into memory issues.
@saanvibehele8185 4 months ago
@@ShawhinTalebi I am running it now; 20 epochs have run successfully so far.
@ShawhinTalebi 4 months ago
@@saanvibehele8185 Awesome!
@acaudio7545 4 months ago
I've been playing around with this, trying to see how it would respond if I made horrible comments about your content; I managed to get one slightly angry response 😁. But on a serious note, I love the work, and I'm a big fan of the channel now!
@ShawhinTalebi 4 months ago
LOL I wonder what that entailed
@dogmediasolutions 1 month ago
What about `pip install mlx-lm`? Wouldn't that use your Apple Silicon's GPU (and use MLX versions of LLMs in LM Studio)?
@ShawhinTalebi 1 month ago
That should work too! I also believe Ollama uses the M-series chips effectively.
@jpcam4781 5 months ago
Has anyone tried this on a 3.8GHz 8-core Intel Core i7 chip?
@ShawhinTalebi 5 months ago
MLX is specifically made for M-series chips. This example won't work with an i7.
@livebernd 5 months ago
How about fine-tuning with an Intel processor on a Mac?
@ShawhinTalebi 5 months ago
MLX won't help, but if you have a graphics card there may be tools out there that can. I just haven't done that before.
@App-Generator-PRO 5 months ago
8GB RAM RIP :(
@ShawhinTalebi 5 months ago
LOL this still might be worth a try! If you run into memory issues you can reduce the batch size to 1 or 2. Curious to hear how it goes :)
@App-Generator-PRO 5 months ago
@@ShawhinTalebi Risking wrecking my only device is totally worth it.
@ShawhinTalebi 5 months ago
@@App-Generator-PRO lol
@yogeshwarcm 19 days ago
Every time bro says "only 16GB of memory"... me with 8 :(
@ShawhinTalebi 19 days ago
LOL I’m sorry! You can still try with a smaller batch size ;)