Thanks for making me aware of the MLX version. Beware: my installed version only updated to 0.2.31, and I had to download 0.3.4 from the LM Studio site!
@modoulaminceesay9211 · 2 months ago
What is the difference between this and Ollama?
@maxziebell4013 · 2 months ago
Thanks, I just installed it... nice, M3 here.
@1littlecoder · 2 months ago
Enjoy the speed!
@vigneshpadmanabhan · a month ago
Doesn’t support M4 yet?
@CitAllHearItAll · 13 days ago
I’ve been using it all week on M4.
@Pregidth · 2 months ago
Hey man, this is really great! Thanks. Hopefully Ollama integrates it; they've seemed a bit lame these past weeks.
@1littlecoder · 2 months ago
Hope so!
@phanindraparashar8930 · 2 months ago
Can you make a video on fine-tuning embeddings and LLMs (and include how to create a dataset to train on custom data)? It would be very interesting.
@1littlecoder · 2 months ago
Thanks for the idea, will try to put together something!
@simongentry · 8 days ago
Yes, but how free are you to run a couple of LLMs at the same time? Especially if you're code bouncing.
@modoulaminceesay9211 · 2 months ago
Thanks for the tutorial
@usmanyousaf-i2i · 2 months ago
Can we use this on an Intel Mac?
@1littlecoder · 2 months ago
You can use it, but the MLX bit won't work.
@supercurioTube · 2 months ago
How did you conclude that running the same model is faster via MLX than with the llama.cpp backend? Comparing with Llama 3.1 8B 8-bit, I get the same generation speed between LM Studio/MLX and Ollama/llama.cpp (33.6 tok/s on an M1 Max, 64GB).
@monkeyfish227 · 2 months ago
Don't they both use MLX? Wouldn't that be the same speed then?
@CitAllHearItAll · 13 days ago
Are you loading the same model in different tools? You have to download the MLX and GGUF versions separately, then load one at a time and test. MLX is consistently faster for me.
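For anyone who wants to reproduce this comparison, here is a minimal timing sketch using the mlx-lm Python package directly (the model repo name below is just an example; substitute the MLX build you actually downloaded):

```python
# Rough MLX throughput check; assumes `pip install mlx-lm` on Apple Silicon.
import time
from mlx_lm import load, generate

# Example MLX model repo; swap in the model you want to benchmark.
model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-8bit")

prompt = "Explain unified memory on Apple Silicon in one paragraph."
start = time.time()
text = generate(model, tokenizer, prompt=prompt, max_tokens=200)
elapsed = time.time() - start

tokens = len(tokenizer.encode(text))
print(f"~{tokens / elapsed:.1f} tok/s generated")
```

Passing verbose=True to generate() also prints generation statistics, which makes a quick side-by-side against the same GGUF model in Ollama easy.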
@gregsLyrics · 2 months ago
WOW! Brilliant vid. On an M3 Max currently. What is the largest model size that can run? I can't wait to try this out. I want to train a model for my legal work; fingers crossed this can help.
@monkeyfish227 · 2 months ago
Depends on how much RAM you have. Look at how big the models are. You can only use around 70-75% of your RAM as VRAM, which is what's needed to load the entire model.
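As a back-of-the-envelope check of that rule of thumb (the 75% usable fraction and the ~10% overhead are assumptions, not measured values):

```python
# Quick sizing sketch: will an N-billion-parameter model fit in unified memory?
ram_gb = 64                       # total Mac RAM
usable_gb = ram_gb * 0.75         # ~70-75% usable as VRAM, per the comment above
params_b = 70                     # model size in billions of parameters
bytes_per_param = 0.5             # 4-bit quantization
model_gb = params_b * bytes_per_param * 1.10   # +~10% for KV cache/activations

print(f"needs ~{model_gb:.0f} GB, ~{usable_gb:.0f} GB usable -> fits: {model_gb <= usable_gb}")
```

On those numbers, a 70B model at 4-bit (~38-39 GB) fits on a 64GB Mac but not on a 32GB one.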
@adamgibbons4262 · 2 months ago
Is there a model for Swift-only programming?
@HealthyNutrition-y · 2 months ago
🔥 "Intelligence is compression of information." This is one of the most useful videos I believe I have ever watched on YouTube.
@ProSamiKhan · 2 months ago
One model is of Dhanush, and the other is of Tamanna. Can they both be prompted together in a single image? If yes, how? Please explain, or if there's a tutorial link, kindly share.
@esuus · 2 months ago
Awesome, thanks! Was looking for this. You could have gotten to the point a bit quicker, but whatever :D MLX is the way to go!
@1littlecoder · 2 months ago
You mean gotten to the point sooner?
@PiratesZombies · 24 days ago
Will an M2 with 8GB/512GB work?
@benarcher372 · 20 days ago
Anyone know a decent model for generating Go code? Like for solving Advent of Code puzzles.
@1littlecoder · 20 days ago
Try the Qwen Coder series of models.
@benarcher372 · 20 days ago
@1littlecoder Thanks for the information! I'll try that on my M4.
@benarcher372 · 20 days ago
@1littlecoder Now tested, very briefly, with lmstudio-community/Qwen2.5-Coder-32B-Instruct-MLX-8bit. Good results so far. Nice to be able to do this offline (on a local machine).
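For reference, a minimal sketch of driving that same model outside the LM Studio UI with the mlx-lm Python package (the chat-template handling here is the stock Hugging Face tokenizer method):

```python
# Generate Go code locally with an MLX build of Qwen2.5-Coder.
from mlx_lm import load, generate

model, tokenizer = load("lmstudio-community/Qwen2.5-Coder-32B-Instruct-MLX-8bit")

messages = [{"role": "user",
             "content": "Write a Go function that sums the integers in a file, one per line."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```

The 8-bit 32B weights are roughly 35 GB, so in practice this needs a Mac with 48GB or (more comfortably) 64GB of unified memory.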
@KRIPAMISHRA-rz7hg · a month ago
What's your PC spec?
@andrewwhite1576 · 9 days ago
It's a Mac, so the one titled "Mac specs" 😂
@build.aiagents · 2 months ago
Phenomenal 🤖
@alx8439 · 2 months ago
Please give Jan AI a try. LM Studio is based on llama.cpp but is proprietary and closed-source, and God only knows what it's doing (mining shitcoins, sending telemetry, collecting your personal data); you'll never know. Jan AI is open source, is based on the same llama.cpp, and gets the same benefits llama.cpp gets.
@zriley7995 · 17 days ago
But we need MLX support 😢
@alx8439 · 17 days ago
@zriley7995 The original llama.cpp has it. LM Studio added zero under-the-hood functionality; it just slapped its own UI on top.
@Christophe-d9k · 2 months ago
With the presented qwen2-0-5b-instruct model (352.97 MB), it's about twice as fast on your M3 Max (221 tok/s) as on my RTX 3090 (126 tok/s), but with the llama-3.2-3B-4bit model (2.02 GB) speeds are similar on both devices. This is probably due to the amount of available VRAM (24GB on the 3090).
@SirSalter · 26 days ago
Let’s go ahead and say “go ahead” every other sentence
@1littlecoder · 25 days ago
@SirSalter Did I use it too much? 😭 Sorry
@judgegroovyman · 17 days ago
@1littlecoder Nah, you're perfect. That guy is just grumpy and that's fine :) You rock!
@1littlecoder · 17 days ago
@judgegroovyman Thank you, sir ✅
@theycallmexavier · a day ago
Jan is better
@1littlecoder · a day ago
I don't think they have MLX support, do they? I have a Jan video as well.