Thanks for making me aware of the MLX version. Beware: my installed version only updated to 0.2.31, and I had to download 0.3.4 from the LM Studio site!
@modoulaminceesay9211 · 2 months ago
What is the difference between this and Ollama?
@maxziebell4013 · 2 months ago
Thanks, I just installed it... nice, M3 here.
@1littlecoder · 2 months ago
Enjoy the speed!
@vigneshpadmanabhan · a month ago
Doesn’t support M4 yet?
@CitAllHearItAll · 13 days ago
I’ve been using it all week on M4.
@Pregidth · 2 months ago
Hey man, this is really great! Thanks. Hopefully Ollama integrates it; they've seemed a bit lame these past weeks.
@1littlecoder · 2 months ago
Hope so!
@phanindraparashar8930 · 2 months ago
Can you make a video on fine-tuning embeddings and LLMs (and include how to create a dataset to train on custom data)? It would be very interesting.
@1littlecoder · 2 months ago
Thanks for the idea, will try to put together something!
@simongentry · 8 days ago
Yes, but how free are you to run a couple of LLMs at the same time? Especially if you're code bouncing.
@modoulaminceesay9211 · 2 months ago
Thanks for the tutorial
@usmanyousaf-i2i · 2 months ago
Can we use this on an Intel Mac?
@1littlecoder · 2 months ago
You can use it, but the MLX bit won't work.
@supercurioTube · 2 months ago
How did you conclude that running the same model is faster via MLX than with the llama.cpp backend? Comparing with Llama 3.1 8B 8-bit, I get the same generation speed between LM Studio/MLX and Ollama/llama.cpp (33.6 tok/s on an M1 Max, 64GB).
@monkeyfish227 · 2 months ago
Don't they both use MLX? Wouldn't that be the same speed then?
@CitAllHearItAll · 13 days ago
Are you loading the same model in different tools? You have to download the MLX and GGUF versions separately, then load one at a time and test. MLX is consistently faster for me.
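For anyone who wants to reproduce this comparison, here is a minimal timing sketch using the mlx-lm Python package directly (the model repo name below is just an example; substitute the MLX build you actually downloaded):

```python
# Rough MLX throughput check; assumes `pip install mlx-lm` on Apple Silicon.
import time
from mlx_lm import load, generate

# Example MLX model repo; swap in the model you want to benchmark.
model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-8bit")

prompt = "Explain unified memory on Apple Silicon in one paragraph."
start = time.time()
text = generate(model, tokenizer, prompt=prompt, max_tokens=200)
elapsed = time.time() - start

tokens = len(tokenizer.encode(text))
print(f"~{tokens / elapsed:.1f} tok/s generated")
```

Passing verbose=True to generate() also prints generation statistics, which makes a quick side-by-side against the same GGUF model in Ollama easy.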
@gregsLyrics · 2 months ago
WOW! Brilliant vid. On an M3 Max currently. What is the largest model size that can run? I can't wait to try this out. I want to train a model for my legal work; fingers crossed this can help.
@monkeyfish227 · 2 months ago
Depends on how much RAM you have. Look at how big the models are. You can only use around 70-75% of your RAM as VRAM, which is what's needed to load the entire model.
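As a back-of-the-envelope check of that rule of thumb (the 75% usable fraction and the ~10% overhead are assumptions, not measured values):

```python
# Quick sizing sketch: will an N-billion-parameter model fit in unified memory?
ram_gb = 64                       # total Mac RAM
usable_gb = ram_gb * 0.75         # ~70-75% usable as VRAM, per the comment above
params_b = 70                     # model size in billions of parameters
bytes_per_param = 0.5             # 4-bit quantization
model_gb = params_b * bytes_per_param * 1.10   # +~10% for KV cache/activations

print(f"needs ~{model_gb:.0f} GB, ~{usable_gb:.0f} GB usable -> fits: {model_gb <= usable_gb}")
```

On those numbers, a 70B model at 4-bit (~38-39 GB) fits on a 64GB Mac but not on a 32GB one.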
@adamgibbons4262 · 2 months ago
Is there a model for Swift-only programming?
@HealthyNutrition-y · 2 months ago
🔥 "Intelligence is compression of information." This is one of the most useful videos I believe I have ever watched on YouTube.
@ProSamiKhan · 2 months ago
One model is of Dhanush, and the other is of Tamanna. Can they both be prompted together in a single image? If yes, how? Please explain, or if there's a tutorial link, kindly share.
@esuus · 2 months ago
Awesome, thanks! Was looking for this. You could have gotten to the point a bit quicker, but whatever :D MLX is the way to go!
@1littlecoder · 2 months ago
You mean gotten to the point sooner?
@PiratesZombies · 24 days ago
Will an M2 with 8GB/512GB work?
@benarcher372 · 20 days ago
Anyone know a decent model for generating Go code? Like for solving Advent of Code puzzles.
@1littlecoder · 20 days ago
Try the Qwen Coder series of models.
@benarcher372 · 20 days ago
@1littlecoder Thanks for the information! I'll try that on my M4.
@benarcher372 · 20 days ago
@1littlecoder Now tested, very briefly, with lmstudio-community/Qwen2.5-Coder-32B-Instruct-MLX-8bit. Good results so far. Nice to be able to do this offline (on a local machine).
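For reference, a minimal sketch of driving that same model outside the LM Studio UI with the mlx-lm Python package (the chat-template handling here is the stock Hugging Face tokenizer method):

```python
# Generate Go code locally with an MLX build of Qwen2.5-Coder.
from mlx_lm import load, generate

model, tokenizer = load("lmstudio-community/Qwen2.5-Coder-32B-Instruct-MLX-8bit")

messages = [{"role": "user",
             "content": "Write a Go function that sums the integers in a file, one per line."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```

The 8-bit 32B weights are roughly 35 GB, so in practice this needs a Mac with 48GB or (more comfortably) 64GB of unified memory.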
@KRIPAMISHRA-rz7hg · a month ago
What's your PC spec?
@andrewwhite1576 · 9 days ago
It's a Mac, so the one titled "Mac specs" 😂
@build.aiagents · 2 months ago
Phenomenal 🤖
@alx8439 · 2 months ago
Please give Jan AI a try. LM Studio is based on llama.cpp but is proprietary and closed-source, and God only knows what it's doing (mining shitcoins, sending telemetry, collecting your personal data); you'll never know. Jan AI is open source, is based on the same llama.cpp, and gets the same benefits llama.cpp gets.
@zriley7995 · 17 days ago
But we need MLX support 😢
@alx8439 · 17 days ago
@zriley7995 The original llama.cpp has it. LM Studio added zero under-the-hood functionality; it just slapped its own UI on top.
@Christophe-d9k · 2 months ago
With the presented qwen2-0-5b-instruct model (352.97 MB), it's about twice as fast on your M3 Max (221 tok/s) as on my RTX 3090 (126 tok/s), but with the llama-3.2-3B-4bit model (2.02 GB) speeds are similar on both devices. This is probably due to the amount of available VRAM (24GB on the 3090).
@SirSalter · 26 days ago
Let’s go ahead and say “go ahead” every other sentence
@1littlecoder · 25 days ago
@SirSalter Did I use it too much? 😭 Sorry
@judgegroovyman · 17 days ago
@1littlecoder Nah, you're perfect. That guy is just grumpy and that's fine :) You rock!
@1littlecoder · 17 days ago
@judgegroovyman Thank you, sir ✅
@theycallmexavier · a day ago
Jan is better
@1littlecoder · a day ago
I don't think they have MLX support, do they? I have a Jan video as well.