How Arm CPUs Accelerate AI Workloads Without a GPU or NPU

Views: 2,761

A day ago

This video shows how Arm CPUs, using Neon, SVE, SME, and Kleidi technology, are revolutionizing AI and machine learning, eliminating the need for dedicated NPUs or GPUs. I'll explore matrix multiplication optimization, highlight Arm's collaboration with Meta on its Llama LLM, and demonstrate the speed of a large language model running on an Android smartphone using only CPU acceleration. Developers, researchers, and anyone interested in high-performance AI on standard hardware should watch.
---
Unleashing the Power of AI on Mobile: LLM Inference for Llama 3.2 Quantized Models with ExecuTorch and KleidiAI - community.arm....
Arm Developer Hub: arm.com/dev-hub
#garyexplains

Comments: 16

@shaxbee 11 hours ago

Vector extensions on ARM are awesome. Tenstorrent is taking a similar approach with RISC-V but with a sole focus on vector instructions and data transfers.

@vin.k.k 11 hours ago

The future is looking brighter.

@uk1589 11 hours ago

5:02 nice showing of actual llamas there 😂 Please add more funny material like this to the videos so that they are interesting as well as informative to watch.

@Saif0412 10 hours ago

🤣

@mushkin92 11 hours ago

Thx for your video but I can't see the links in the description?

@GaryExplains 10 hours ago

Ooops, sorry about that. I have added them now. The main Arm developer hub is here: arm.com/dev-hub

@laci272 5 hours ago

@GaryExplains Thank you, but how do we install/run the Android version you showed in the video?

@sciencee34 9 hours ago

How do I get this on my phone like that? Thanks

@paulturner5769 5 hours ago

I would like to see AI used to go over all uploaded videos on KZbin and normalise the volume levels so that I am not changing them per video. Hint Hint! :-)

@PeterRince 8 hours ago

So which CPU for mobile devices is more future-proof in 2025? Snapdragon 8 Gen 3 or Apple A18?

@tonysheerness2427 10 hours ago

Nice to see an application running locally instead of always on the cloud. I am all for giving people the power not big tech companies.

@anonymouscommentator minutes ago

im pretty sure that running on a cpu is pretty much the definition of not being accelerated 💀 vector extensions like avx have been on x86 for a long time yet there are massive differences between smth like a 7950x and a 4090. your video is selling a false promise. running ai will always be slow on a phone and running it on the cpu will be even slower.