Apple M3 Machine Learning Speed Test (M1 Pro vs M3, M3 Pro, M3 Max)

206,770 views

Daniel Bourke

1 day ago

Comments: 222
@nat.serrano · 8 months ago
Finally somebody explains this shit properly, not like all the other YouTubers who only use it to create videos.
@AZisk · 1 year ago
Nice! Missed you buddy!
@_MrCode · 1 year ago
I'm sure this video touched your heart.
@anthonypenaflor · 11 months ago
I actually thought this was your video when it popped up in my feed!
@AZisk · 11 months ago
@anthonypenaflor I don't have such a beautiful desk.
@aptbabs · 1 year ago
Yo, it's been a while since I saw my teacher. Nice to see you again, and good video by the way. More blessings bro.
@hyposlasher · 1 year ago
So cool that you used 10 shades of green in your graphs. It's very convenient to distinguish them.
@mrdbourke · 1 year ago
You're right, I made a mistake here. I only really noticed it when reviewing the video; I guess since I made the graphs I could tell the difference. Next time the graphs will be easier to distinguish!
@hyposlasher · 1 year ago
Besides that, the video is awesome and very informative.
@jaimemedina3351 · 2 months ago
you're a clown
@ExistentialismNeymarJunior · 29 days ago
@hyposlasher switch up is crazy
@hyposlasher · 29 days ago
@ExistentialismNeymarJunior wdym?
@joejohn6795 · 1 year ago
Please redo using MLX as that's what the developers using this laptop will probably be using.
@andikunar7183 · 1 year ago
Especially since this week Apple released MLX with quantization support and other stuff.
@mrdbourke · 1 year ago
Fantastic idea! I started with TensorFlow/PyTorch since they're the most established, but MLX looks to be updating fast.
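For anyone who wants to try MLX, a minimal timing sketch, assuming "pip install mlx" on an Apple silicon Mac (this is not code from the video). Note that MLX evaluates lazily, so mx.eval is needed to force the computation to run:

```python
import time
import mlx.core as mx

# Create two large random matrices in unified memory.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))
mx.eval(a, b)  # materialize inputs first; MLX builds graphs lazily

start = time.perf_counter()
c = a @ b
mx.eval(c)     # force the matmul to actually execute
print(f"4096x4096 matmul: {time.perf_counter() - start:.4f}s")
```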
@miguelangel-nj8cq · 11 months ago
Not really. MLX doesn't come close for those who seriously use TensorFlow and PyTorch. Besides, if your production environment is in the cloud, those two libraries are better integrated than MLX, and for quick deployments you already have containers preconfigured and optimized for those libraries and CUDA, since cloud servers are dominated by NVIDIA and not Apple's "Neural Engine".
@modoulaminceesay9211 · 1 year ago
Good to see you again. You made machine learning and AI fun.
@laiquedjeutchouang · 1 year ago
Thanks, Daniel, for the video and for sharing the materials' links. You're a legend. I got an M3 Pro 14" (11-core CPU, 14-core GPU, 18GB) last month and have been wondering whether it was an optimal move.
@andikunar7183 · 1 year ago
Surprised that you did not include RAM bandwidth in the beginning. Whenever you do non-batched inference, memory bandwidth becomes your main constraint instead of GPU performance, as shown in your M1 Pro to M3 Pro comparison. llama.cpp's M-series benchmarking shows really nicely why the M3 Pro with its 150GB/s memory (instead of 200GB/s) is the problem, not its (faster) GPU. If one just does inference and has large models requiring lots of RAM, the M2 Ultra really shines with its 800GB/s. Totally agree that with training and batching it's different, and NVIDIA's new GPU performance blows away Apple silicon.
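This bandwidth ceiling is easy to estimate: non-batched decoding streams every weight once per generated token, so tokens/sec is capped at roughly memory bandwidth divided by model size. A back-of-envelope sketch (the model sizes are illustrative assumptions, not measurements from the video):

```python
# Bandwidth-bound upper limit for non-batched LLM decoding:
# every weight is read once per token, so tok/s <= bandwidth / model size.
def max_tokens_per_sec(params_billions: float, bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    model_gb = params_billions * bytes_per_param  # weight size in GB
    return bandwidth_gb_s / model_gb

print(max_tokens_per_sec(7, 2.0, 200))  # 7B @ fp16, M1 Pro (200 GB/s): ~14 tok/s
print(max_tokens_per_sec(7, 2.0, 150))  # 7B @ fp16, M3 Pro (150 GB/s): ~11 tok/s
print(max_tokens_per_sec(7, 2.0, 800))  # 7B @ fp16, M2 Ultra (800 GB/s): ~57 tok/s
```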
@imnutrak130 · 1 year ago
That's YouTube-quality education: good enough, but most of the time missing crucial details and, because of that, twisting the truth, especially about performance. Although this person has studied and gets paid a BIG salary to know such details..... weird, but I put it down to a simple human mistake. Still a good video!
@mrdbourke · 1 year ago
Woah, I didn't know about the lower memory bandwidth between the M1 and M3. Thank you for the information. I just wanted to try raw out-of-the-box testing. Fantastic insight, and thank you again.
@gaiustacitus4242 · 7 months ago
nVidia's GPU performance falls on its face once the LLM's size exceeds the video card's onboard RAM.
@andikunar7183 · 7 months ago
@gaiustacitus4242 Yes, but you can split layers across multiple cards. For me, I decided on an M2 Max 96GB Mac Studio and not a 1kW+ heater PC, even though in pure GPU horsepower the 4090 is much faster, and I never regretted it. Correction: I do regret my M2 Max decision as of last week, because Apple/macOS Sequoia will finally do nested virtualization, but only on M3 and above. With this I have hopes of virtualized GPUs at some point. NVIDIA/CUDA was always virtualizable and works in Docker containers/VMs.
@gaiustacitus4242 · 7 months ago
@@andikunar7183 Even with two nVidia 4090 GPUs a 70B parameter LLM will still yield lower performance than a high-end M-series Mac.
@Joe_Brig · 1 year ago
If portability isn't a requirement, then the Mac Studio Ultra should be considered with its 60 GPU cores and 800GB/s memory bandwidth.
@m_ke · 1 year ago
You missed memory bandwidth: the M1 Pro has higher bandwidth than the non-Max M3 MacBooks.
@mrdbourke · 1 year ago
Thank you! I didn't know this. Very strange to me that a 2-year-old chip has higher bandwidth than a brand-new chip.
@WidePhotographs · 10 months ago
I'm in the process of learning ML/AI-related tasks. Based on your experience, would you prefer a 13" MBP M2 24GB RAM ($1,299 new) or a 14" MBP M3 Pro 18GB RAM ($1,651 used)?
@mrdbourke · 10 months ago
The 24GB of RAM would allow you to load larger models, but it also depends on how many GPU cores the two laptops have. Either way, both are great machines to start learning on.
@Maariyyaa-i8f · 6 months ago
You're a great teacher! Extremely clear.
@SanjiPinon · 1 year ago
Hey Daniel, consider trying their MLX versions, as some of the models see performance gains as high as 4x compared to their Torch counterparts.
@siavoshzarrasvand · 1 year ago
Does MLX work with Llama 2?
@SanjiPinon · 1 year ago
@siavoshzarrasvand Yup, and much, much faster than llama.cpp.
@francescodefulgentiis907 · 1 year ago
This video is exactly what I was searching for. Thank you so much for providing such clear and useful information.
@junaidali1853 · 1 year ago
Appreciate the hard work, but please consider using a better color scheme for the bars. They all look the same.
@Yingzhe-mi6 · 1 year ago
Hi Daniel, will you be teaching something more than image classification? You are the best programming teacher I have ever followed. Looking forward to your new deep learning course on ZTM.
@SJ-ds8lp · 3 months ago
Thanks mate so much for doing this.
@_MrCode · 1 year ago
Glad to see you back.
@networktalha · 1 year ago
OH FINALLY, I've been waiting for that. You are a king, bro.
@andresuchitra · 5 months ago
Thank you Daniel, thorough research for ML engineers. This is worth a conference session 💪
@alibargh · 10 months ago
Excellent comparison, thanks 😊
@bonecircuit · 1 year ago
Thanks a lot for the valuable information. You saved me a tonne of time in coming to a conclusion. Cheers mate.
@EthelbertCoyote · 1 year ago
One thing is clear even as a PC person: Macs have a steep advantage with the M3's dynamic RAM-to-VRAM allocation and low power. Sure, they don't have NVIDIA's hardware or software, but for some AI users the entry price for the VRAM is a winner.
@denis.gruiax · 3 months ago
Many thanks for the video 🙏
@krishnakaushik4294 · 1 year ago
Happy Christmas Sir❤❤
@mrdbourke · 1 year ago
Happy Christmas legend!
@DK-ox7ze · 11 months ago
I believe you can also target PyTorch to run on Apple silicon's NPU rather than the GPU, and I'm sure it will perform better, though I'm not sure how much memory the NPU has access to. It would be great if you could explore this and do a video on it.
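For reference: stock PyTorch reaches Apple silicon's GPU through the MPS backend, and the Neural Engine has no PyTorch device type (the usual route to the ANE is converting the model to Core ML). A minimal device-selection sketch, assuming a recent PyTorch build:

```python
import torch

# PyTorch targets Apple silicon's GPU via the Metal Performance Shaders backend.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
x = torch.randn(2048, 2048, device=device)
y = x @ x  # runs on the GPU when device is "mps"
print(device, y.shape)
```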
@DamianLesiuk · 2 months ago
So macbooks are shitty for AI?
@jaimemedina3351 · 2 months ago
This is excellent!
@Rithvik1-v3i · 6 months ago
Thank you for the video.
@aisven822 · 2 months ago
Nice! Could you add some insights regarding thermals and throttling on the Macs?
@ericklasco · 10 months ago
Thank you for the knowledge, it really gave me insight.
@alexpascal5403 · 6 months ago
nobody likes me. but that’s okay! i’ll suqqOn it anyway. !!! i’ll suqqOn it anyway !!!! feed me a lemon bro. give me your lemon dude.
@kpbendeguz · 11 months ago
Would be interesting to see how the 128GB version of the M3 Max performs compared to the RTX cards on very large datasets, since 75% (~96GB) could be used as VRAM on that Apple silicon.
@joesalyers · 4 months ago
When you have the same chip you hit the silicon lottery: one machine will have a better GPU while the other has a better CPU, depending on dead transistors and little lottery-based differences. So I'm not surprised that an M3 Pro and an M3 Max with the same Neural Engine perform differently. The silicon lottery is a real thing that will always be a factor in computing. Great video by the way, and very informative.
@ri3469 · 1 year ago
This is interesting! Between the M3 Pro 16GB (150GB/s) and M3 Max 32GB (400GB/s), and considering the M1 Pro 32GB (200GB/s), would you suggest that RAM is a much more important factor for these ML tasks than memory bandwidth? Or something else? Would be keen to see a test between an M3 Pro 32GB and your M1 Pro 32GB, to see if the 50GB/s difference in memory bandwidth has any real-world effect (the M3 Pro also has one less GPU core but a faster boost clock).
@altairlab4876 · 1 year ago
Hi Daniel! What a great PyTorch tutorial you have made, thanks for that! Also thanks for this speed-comparison video. Could you record a video comparing the speed of the different Colab tiers? I mean free, $10, and $50. The M3 Max and your Titan (which you have already done) could be added too. Maybe one of your friends has a $50 account and can run those tests for you [for all of us :)]
@petchlnwzaaa · 3 months ago
Thanks for making such insightful deep-dive videos about these M Chips. I wonder if Apple will ever open their NPU to more APIs. Right now, MLX is starting to gain traction in the open-source ML community, but it still can't tap into the Neural Engine for inference, so we’re still stuck with a slower GPU for Macs. The M4 chip already has a Neural Engine that can compute 38 TOPS, and it's just sitting there doing nothing while the GPU does all the work when running ML inference. It would give Macs a huge boost in ML performance if they opened that up.
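For context, the usual way to let the Neural Engine run inference today is converting the model with Core ML Tools and letting Core ML schedule it. A hedged sketch, assuming coremltools and torch are installed; the tiny model is purely illustrative, and whether the ANE actually gets used depends on op coverage:

```python
import torch
import coremltools as ct

# A toy model, traced so Core ML Tools can convert it.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU()).eval()
example = torch.randn(1, 128)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    convert_to="mlprogram",
    inputs=[ct.TensorType(shape=example.shape)],
    compute_units=ct.ComputeUnit.ALL,  # let Core ML schedule across CPU/GPU/ANE
)
mlmodel.save("tiny.mlpackage")
```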
@juanfgd · 3 months ago
The advances in MLX warrant an update to this great video! It's getting REALLY good at performing some tasks, and I'd love to see how all these machines perform with MLX framework, since on my iPhone 15 Pro I'm now able to run Llama3.1-8B-q4 at around 8.5 toks/sec, Llama3.2-3B-q4 at 20 toks/sec, and Llama3.2-1B-q4 at a whopping 49 toks/sec, something impossible just a few months ago!
@furkanbicer154 · 1 year ago
Can you also make a comparison with the Neural Engine of the M-series processors?
@dkierans · 1 year ago
Outstanding
@jplkid14 · 1 year ago
Why didn't you compare the M1 Max?
@tty2020 · 1 year ago
Your M1 Pro's RAM is about twice that of your M3 Pro, so maybe that is why it performs better than the latter.
@mrdbourke · 1 year ago
Yeah, you're right. I also just found out that the M1 Pro has higher memory bandwidth than the M3 Pro (200GB/s vs 150GB/s), thanks to another comment. That likely adds to the performance improvement on the M1. Strange to me that a 2-year-old chip can comfortably outperform a newer chip.
@JunYamog · 1 year ago
I have only a 16GB M1 Pro; on the first 2 benchmarks I get similar or slightly faster speeds. I will try to run the other benchmarks; I got sidetracked modifying the 1st benchmark to run on a quad GTX 1070 setup.
@IviAxm1s7 · 9 months ago
Is it a good idea to go for a new M3 MacBook Air with 16GB to start learning ML?
@mrdbourke · 9 months ago
Yes that would be a perfect laptop to start learning ML. You can get quite far with that machine. Just beware that you might want to upgrade the memory (RAM) so you can use larger models.
@Rithvik1-v3i · 6 months ago
@mrdbourke Sir, I am choosing between an M3 Air 16GB and a MacBook Pro M3 Pro 18GB. Which should I go for if I am starting to learn and grow in ML for the long term? The price difference between them is 30,000/-. Please advise, thank you.
@sam-bm1fg · 5 months ago
@Rithvik1-v3i You don't need such heavily powered machines to start learning ML. Just use Google Colab to learn. Maybe then, once you implement projects, you will understand which is better.
@waleedt_trz · 1 month ago
@Rithvik1-v3i I'm planning to get the M3 Air with 16GB of RAM.
@krishnakaushik4294 · 1 year ago
Sir, I follow all of your blogs, videos, etc. I want to be an ML engineer, so I enrolled in your 'Complete ML and Data Science' course on ZTM. What a marvellous way of teaching ❤❤
@oddzc · 1 year ago
Your tests just prove how bullcrap synthetic benchmarks are. Love your work.
@digitalclips · 11 months ago
I'd love to see you test the M3 Ultra with 64GB RAM when it comes out. I am using the M2 Studio Ultra at present and wonder if it will be worth upgrading. Running batches, it gets warm, but I've never heard its fan yet.
@kn7x802 · 2 months ago
Thanks for the video and the nice in-depth comparisons. I thought GPUs were for game playing and the M series had dedicated multicore Neural Engines.
@shinobi339 · 2 months ago
Thanks
@samsontan1141 · 11 months ago
Great video. Could you please update us on whether the new MLX changes the results or your conclusion at all? Would love to know if the M-series chips are as good as others are saying.
@sathiyanit · 1 year ago
Very good one. Thank you so much.
@paulmiller591 · 1 year ago
Very helpful thanks Daniel. I was going to race out and buy an M3 to do my ML work, but I will hold off for now. I suspect Apple will do something to help boost performance considerably on the software side, but who knows.
@UrbanGT · 10 months ago
Thanks!
@mrdbourke · 10 months ago
You’re welcome!
@luizconrado · 5 months ago
What is the name of the monitoring tool he is using in the terminal?
@TheMetalMag · 1 year ago
Great video
@shubhamwankhade628 · 11 months ago
Hi Daniel, love your videos. Can you please suggest which laptop is good for deep learning: Mac, Windows, or Linux?
@YuuriPenas · 1 year ago
Great job! Would be great to include some popular Windows laptops as well in the comparison :)
@haralc · 2 months ago
Can you try the same tests on M3 Ultra with 196GB RAM?
@JoseMariArceta · 6 months ago
So cool. Are you able to run these tests on an M3 Max chip with a maxed-out RAM configuration? Could it be more "usable" than, say, a 4090 with "only" 24GB of dedicated VRAM?
@azrinsani · 3 months ago
I have the exact same MacBook Pro, 32GB with the 16-core GPU!!!! Wondering, will running this fry your MacBook?
@duydang5101 · 6 months ago
Thank you. M1 Pro is so good.
@synen · 1 year ago
These machines are great as laptops; for desktops, an Intel 14th-gen i9 plus an NVIDIA GPU smokes them away.
@gaiustacitus4242 · 7 months ago
For small LLMs, you are correct. For 13B-parameter or larger LLMs, a maximum-spec'd Mac Studio M2 Ultra or MacBook Pro M3 Max will outperform the best Windows-based solution you can build. Of course, the new Copilot+ PCs running Snapdragon X Elite CPUs will also outperform the desktop build you've recommended when running 3B to 3.8B parameter LLMs.
@icollecteverything · 11 months ago
You posted the single-core CPU scores for the M3 Macs; that's why they are all pretty much the same.
@tybaltmercutio · 11 months ago
Could you elaborate on that? Are you referring to the ML timings?
@dimitris7368 · 3 months ago
1) How exactly do you SSH into your remote NVIDIA setup? Via VS Code? 2) For a remote NVIDIA setup, is Windows OK or should it be Linux-based?
@jolieriskin4446 · 1 month ago
I think the misunderstanding is that Base/Pro/Max is the real performance spread. The M1/M2/M3/M4 generations do bring mild improvements each time, but an M1 Max will still probably outperform a base M3/M4.
@HariNair108 · 1 year ago
I am planning to buy an M3 Pro. Which one should I go for, the 30-core GPU or the 40-core GPU? My use will be around running some prototype LLM models.
@levelup2014 · 1 year ago
I wish you would make videos covering AI news. You're probably more qualified to talk about new developments in this space than 80% of these "AI channels".
@aronreis · 3 months ago
Very nice comparison. The label colors didn't help much in understanding which is which, though.
@dr.a.o. · 1 year ago
~$3000 for that deep-learning PC seems super cheap. It will cost double the price where I live...
@franckdansaert · 1 year ago
Is your test using the M-series GPU? Are TensorFlow and PyTorch optimized for the Apple silicon GPU?
@VeltairD · 2 months ago
PLEASE update for M4
@JunYamog · 1 year ago
Thanks for this, really useful, and it confirms my initial thought of just getting an M1 Pro 16GB over an M3 8GB (the M1 Pro is slightly cheaper). My M1 Pro is similar to yours, 10 CPU + 16 GPU cores but just 16GB, and has been slightly faster on both PyTorch benchmarks. I was then curious to see how it compares to a quad GTX 1070 setup. I modified your code (I will make a PR) to use all four GPUs for CIFAR100. In general it is faster than the M1 Pro; what is interesting is the single-card vs quad-card comparison. CIFAR100 on small batches was really bad, but by batch size 512 it was faster than a single card (34 secs at 1024). It keeps improving until 3072 (16 secs), then gets worse at 4096 (back to 19 secs, similar to 2048). Also, by batch size 4096 the GPU VRAM is almost full, close to 8GB.
@imnutrak130 · 1 year ago
7B parameters ÷ 250,000,000 = 28, so ~28GB. Dividing the parameter count by 250 million (i.e. "delete 7 zeroes, then divide by 25") is close enough as simple maths for the memory in GB needed for the model parameters at full precision.
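The same rule of thumb in code; the 4-bytes-per-parameter fp32 case is what makes the divide-by-250-million shortcut work:

```python
# Memory needed for model weights: parameter count x bytes per parameter.
def model_memory_gb(params_billions: float, bytes_per_param: float = 4.0) -> float:
    return params_billions * 1e9 * bytes_per_param / 1e9

print(model_memory_gb(7))       # fp32: 28.0 GB (the divide-by-250M shortcut)
print(model_memory_gb(7, 2.0))  # fp16: 14.0 GB
print(model_memory_gb(7, 0.5))  # 4-bit quantized: 3.5 GB
```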
@samarpanacademy3813 · 1 month ago
Nice video sir. I am buying a MacBook M4 Max hoping that I can do data science, AI, and machine learning easily. Would you please recommend how much RAM I might need on this M4 Max so that I don't need to invest again for at least 5 years? Also, can the 14-inch handle it, in your opinion, or would I need the 16-inch? Would it be best to invest in a 35GB M4 Max and use Amazon's cloud for heavier deep learning, or something else? It would be my pleasure if you could respond soon.
@doesthingswithcomputers · 1 year ago
I would like to see if you have utilized Intel's GPUs.
@garthwoodworth3558 · 1 year ago
I bought the M1 Max with 64GB RAM and the 32-core GPU. Like you, I am still extremely satisfied with my purchase two years later. Question: I like your setup using the Apple machine in conjunction with a box with that RTX 4090 installed. Would that setup run in parallel with my GPU cores? And similarly, if I added equivalent RAM to that box, would it work together with my installed 64GB?
@nadtz · 1 year ago
The M3 Pro being slower or not much faster in some tests is probably because of the slower RAM. I'd be interested to see how 30- and 40-series cards stack up, but considering the cost of the laptops already, this was quite the effort, so no complaints.
@kborak · 1 year ago
my 6750xt will beat these things lol. You macboys are so lost in the woods.
@nadtz · 1 year ago
@@kborak I'm not a mac user, I wouldn't buy Apple hardware for love or money. But the chips are still pretty good so it's interesting to see how they stack up to a better GPU for this kind of workload.
@godofbiscuitssf · 11 months ago
At one point you say the bottleneck is memory copies from CPU to GPU and back, but the M series doesn't have to do memory copies because it's all shared memory. In fact, one of the first optimizations for code on Apple silicon is removing all the memory-copying code, because it's an easy gain. Have you accounted for this in your code, in the library code you're using, or both?
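A minimal sketch for measuring that overhead yourself, assuming a recent PyTorch with the MPS backend; .to("mps") still goes through PyTorch's copy path even on unified memory, so the cost is observable:

```python
import time
import torch

assert torch.backends.mps.is_available()
x = torch.randn(8192, 8192)  # ~256 MB of fp32 data allocated on the host

start = time.perf_counter()
y = x.to("mps")              # hand the tensor to the MPS device
torch.mps.synchronize()      # wait for the queued transfer to complete
print(f"host -> mps copy: {time.perf_counter() - start:.4f}s")
```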
@g.s.3389 · 1 year ago
What parameters did you use for powermetrics? I liked the monitoring you had in the terminal.
@mrdbourke · 1 year ago
I used the asitop (github.com/tlkh/asitop) library for monitoring in terminal
@joshbasnet3014 · 11 months ago
Do you have a master's/PhD degree in ML? Does your job require a data science degree?
@PhillipLearnTeach · 1 year ago
Are they faster than the Intel Core Ultra 155 or 185?
@ahmedrahi9775 · 11 months ago
The comparison between the M1 Pro and M3 Pro is not ideal. The M3 Pro you are testing is the binned version with only 14 cores, yet you're comparing it to the full M1 Pro. To get accurate performance measurements, it's best to compare both full chips rather than the binned version; that way we can truly see whether the memory bandwidth makes any difference for machine learning.
@andyparker8631 · 1 year ago
Would be very interesting to normalise the results by hardware cost; after all, it always comes down to spend!
@intptointp · 11 months ago
Hm, in my opinion, a strange metric because "effectiveness per dollar" doesn't really tell you much. My bike costs $300 and my car cost $10000. My bike averages around 20 mph and my car 75 mph. That comes out to 30x the price for 4x the speed. Did this tell you anything? In my opinion, no. What is a far more useful metric is the options the purchase makes available to you. If I have a car, traveling 10 miles for food is a very easy decision to make. If I only have a bike, traveling 10 miles is a major decision. With the right hardware, you unlock options like "iterative experimentation" whereas before, you had to carefully choose your workloads. And as he mentions, certain configurations simply lock you out of certain desired avenues. (8 GB of RAM is too little for many projects.) So yeah... spend is not a very useful metric, in my opinion. Choosing the bike over the car is a pretty pricey choice for reasons beyond money.
@aparkeruk · 11 months ago
@intptointp Interesting analogy, but the car has many other features (keeps you warm, carries 4 people....). When buying compute power for AI, then yes, you could also consider that a laptop might be better than a desktop for convenience, but it's not really like the car example. Comparing mainframe to laptop to desktop might be nearer this analogy. Guess it will not matter soon, as the cheapest will be cloud, purely by volume!
@CykPykMyk · 10 months ago
How come the M3 Max is slower than the regular M3 and the M3 Pro in the PyTorch test?
@javierwagner4410 · 11 months ago
I think it would be interesting if you standardized your measures by memory and number of cores.
@m_codes · 4 months ago
Bought an NVIDIA RTX 4070 Ti Super. This video was very helpful.
@nicolaireedtz8015 · 1 year ago
Can you try Llama 2 70B with 128GB on the M3 Max?
@haqkiemdaim · 11 months ago
Hi sir! Can you suggest the proper/recommended way to install TensorFlow on a MacBook?
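For reference, the commonly recommended route is Apple's Metal plugin for TensorFlow (pip install tensorflow tensorflow-metal). A minimal check, assuming that environment:

```python
import tensorflow as tf

# With the tensorflow-metal plugin installed, the Apple silicon GPU
# should appear as a physical device.
print(tf.config.list_physical_devices("GPU"))
```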
@woolfel · 1 year ago
From my experience, TensorFlow's optimization is a little better than PyTorch's for convolutional models.
@valdisgerasymiak1403 · 1 year ago
IMHO MacBooks are only inference machines, not training machines. They're great for running 7B, 13B, or 30B LLMs locally (depending on your amount of RAM) and for quick training runs on something like MNIST. I personally write training code and run experiments with small batch sizes on my M1 Pro, then copy the code to my 3090 PC and run long training with bigger batches and fp16. While the PC is busy, I run the next experiments in parallel on the laptop. If you load your main laptop with a big training run, you'll have an uncomfortable experience if you want to browse, game, etc. at the same time.
@zigzogzugzig · 11 months ago
.... and what about the M2 Pro Mac?
@betobeltran3741 · 5 months ago
I am a medical doctor with a recently acquired Ph.D. in pharmacology. I am currently engaged in clinical research, focusing on identifying factors that lead to therapeutic failure in patients with various conditions. My work involves analyzing patient data files that include sociodemographic information, pathological records, clinical data, and treatment details. These datasets typically contain between 100 and 2,000 variables per patient, with a maximum of 1,000 patients in an ideal scenario. I will be using R and RStudio to process and analyze this data in various ways. Based on your experience, could you suggest a computer configuration capable of handling this type of data processing efficiently? Thanks in advance!
@fVNzO · 2 months ago
R is such a light workload that any M-series device would handle that for you no problem. Without actually specifying what you mean by "analyze" and what kind of procedures you will run on the data, I figure it's going to be something very light and easy to run (otherwise you would surely have specified it). So anything, absolutely anything, will do here. Get an M4 chip for the single-core performance: either a base-model M4 MacBook Pro or, if you can handle their screens, wait for an Air.
@karoliinasalminen · 1 year ago
Would you be able to benchmark a maxed-out Mac Studio with M2 Ultra, 192GB RAM, and 76 GPU cores against the NVIDIAs?
@RichardGetzPhotography · 11 months ago
The M series doesn't allow external GPUs, so how do you hook up a 4090? This would make a good video.
@mineturtle12321 · 1 year ago
Question: how did you get PyTorch/TensorFlow running on the M3 Max chip? Is there no current support?
@prathamjain8682 · 4 months ago
Can you make one for the Intel Core Ultra 7 155H?
@v3rlon · 1 year ago
There is a Core ML optimization for PyTorch on Apple silicon. Was this used?
@greenpointstock6345 · 1 year ago
Do you have more details on this? I've looked for something like this before, and all I can find is something that seems to let you convert PyTorch models to Core ML, or info on PyTorch using the GPU but not the ANE. But I'm probably missing something!
@PMX · 10 months ago
Both the M3 Pro and the M3 Max you tested have lower bandwidth than the previous M1/M2 Pro and M1/M2 Max, and since bandwidth is hugely important, that was reflected in your results. The M1/M2 Pro have 200GB/s, whereas the M3 Pro only has 150GB/s. The M1/M2 Max have 400GB/s, but the M3 Max model you chose only has 300GB/s (there are also M3 Max models with 400GB/s).
@PMX · 10 months ago
As an example, I get 30% faster inference speed on my M2 Max (400GB/s memory bandwidth) than you got with the base M3 Max (300GB/s bandwidth).
@mrdbourke · 10 months ago
Wow! I didn’t even know this… excellent info. So what makes the bandwidth increase from the base models? Is it RAM upgrades or storage? Or something else?
@mrdbourke · 10 months ago
@PMX Makes sense!
@PMX · 10 months ago
@mrdbourke The M3 Max with a 30-core GPU has 300GB/s bandwidth and the one with a 40-core GPU has 400GB/s bandwidth.
@mrdbourke · 10 months ago
@PMX Woah, so the 40-core GPU is worth the upgrade. Is this information on Apple's website? I must've missed it.
@Rowrin · 10 months ago
Also worth noting that the GPU on a MacBook only has access to 75% of the unified memory.
@ryshask · 10 months ago
On my M1 Max 64GB I'm getting 8208 on Core ML Neural Engine... my Core ML GPU falls more in line at 6442. All this while powering 3 screens and watching YouTube and a Twitch stream. Not that I expect those things to add much load... but it is nice to have a machine that can do basically everything at once with near-zero penalty.
@nocopyrightgameplaystockvi231 · 1 year ago
The RTX 4090 has better Tensor cores, so it's hard to compete with, even for an M3 Max.
@andikunar7183 · 1 year ago
Not for pure non-batched inference, where memory bandwidth as well as memory size is the main constraint. There the M2 Ultra's 800GB/s vs. the 4090's ~1000GB/s is not so bad. The higher GPU power of the 4090 really shines with batched processing.
@RandyAugustus · 10 months ago
Finally a useful video. Too many "reviews" focus solely on content creators. Now I know I can do light ML on my Mac and do the heavy lifting with my 30-series RTX card.