Running LLMs on the NPU of the Rockchip RK3588

5,382 views

LivingLinux

1 day ago

Comments: 29
@ribeiro4642 4 months ago
Thanks for the video!
@Crftbt 5 months ago
Is the NPU failing to complete with the 7B model because it runs out of memory? Is there a log file somewhere?
@LivingLinux 5 months ago
I don't think it's running out of memory. According to htop, memory usage is around 72% and stable. Perhaps there is a reason the NPU driver version still hasn't reached 1.0.
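A small sketch of how one might watch this from a second terminal while a model runs, assuming the usual procps free and util-linux dmesg tools are available; the kernel ring buffer is where the RKNPU driver's messages end up, so there is no separate NPU log file assumed here.

    # One-shot snapshot of RAM and swap usage while the model is loaded.
    free -h

    # Stream kernel messages and keep only NPU-related or out-of-memory lines;
    # "--follow" needs a reasonably recent util-linux dmesg (use sudo if restricted).
    dmesg --follow | grep -iE 'rknpu|out of memory|oom'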
@devlogschannel 4 months ago
Hi, thanks for sharing this great video, but is there any way to fully use all 3 cores of the NPU?
@StasNsky 5 months ago
What was the exact speed in tokens/s for Llama 2 on the NPU?
@JuanSanchez-rb4qu 3 months ago
Cool, what board are you using?
@LivingLinux 3 months ago
I have both the Radxa Rock 5A and 5B. And also some Mekotronics devices, but I mainly use the Mekotronics devices with Android.
@Alice8000 1 month ago
COOL
@Freshbott2 5 months ago
Hi, sorry it's not really related to your video, but did you compile U-Boot for this device? I'm at my wit's end trying to follow the Rockchip wiki for U-Boot.
@LivingLinux 5 months ago
No, I have never compiled U-Boot. Do you have a Radxa or Orange Pi board (or another)? It's probably better to ask in their forums: forum.radxa.com/ www.orangepi.org/orangepibbsen/
@Freshbott2 5 months ago
@LivingLinux I've got the FriendlyElec CM3588 and a lot of regret, as I don't want to be dependent on someone's Google Drive for OS support, now or in the future. But thank you though, I'll see if someone's got more detail for an Orange Pi.
@jeremybub2 2 months ago
Have you been able to run models with 4-bit quantization on the NPU?
@LivingLinux 2 months ago
I only use the models, I don't go into the technical details. But there are people with way more knowledge. Here is a link that says that it is possible to do int4 (not sure if that's what you are referring to), but last year the driver only supported int8 and float16. Not sure if the newer driver supports more. clehaxze.tw/gemlog/2023/07-13-rockchip-npus-and-deploying-scikit-learn-models-on-them.gmi
@아바바-p2s 4 months ago
Hi. I think this content covers Ubuntu on the RK3588 and uses the RK3588's NPU. If I use an RK3568, can I use this source?
@LivingLinux 4 months ago
It needs NPU driver 0.9.6. You can check it with this command: dmesg | grep -i rknpu
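As a rough sketch, the version number can also be pulled out of that dmesg output and compared against 0.9.6. The exact wording of the driver's log line can vary between kernels, so treat the pattern below as an assumption rather than a guaranteed format.

    # Show the RKNPU driver lines from the kernel log (use sudo if dmesg is restricted).
    dmesg | grep -i rknpu

    # Extract the first x.y.z-style number from those lines and check that it is
    # at least 0.9.6 using a version-aware sort.
    ver=$(dmesg | grep -i rknpu | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)
    if [ -n "$ver" ] && [ "$(printf '%s\n' "$ver" 0.9.6 | sort -V | head -n1)" = "0.9.6" ]; then
        echo "RKNPU driver $ver found (0.9.6 or newer)"
    else
        echo "RKNPU driver ${ver:-not found}; version 0.9.6 or newer is needed"
    fi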
@아바바-p2s 4 months ago
Thanks for the response. But the RK3568's RKNPU driver version is 0.9.0. I tried updating the kernel, but it isn't easy. Could you tell me what your development board is?
@LivingLinux 4 months ago
@아바바-p2s I have the Radxa Rock 5B and 5A. I also have some Mekotronics devices, but I mainly use Android on them.
@peterwan816 2 months ago
Does LM Studio work here? How does it perform?
@LivingLinux 2 months ago
Judging from the LM Studio website, they don't support the Rockchip NPU. The Rockchip drivers are not exactly production-ready.
@WonDong 1 month ago
16 GB RAM, correct? Has anyone tried this on an Orange Pi instead?
@LivingLinux 1 month ago
It seems the developer Pelochus has an Orange Pi. The amount of memory is only relevant for the size of models you can run. github.com/Pelochus/ezrknpu
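For anyone who wants to try it, here is a minimal sketch of fetching that project; the actual install and model steps are documented in the repository's README, so this only clones the repo and opens the instructions.

    # Clone the ezrknpu project linked above and read its instructions.
    git clone https://github.com/Pelochus/ezrknpu
    cd ezrknpu
    less README.md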
@WonDong 1 month ago
@LivingLinux Thanks. Sorry, I forgot to mention in my question that the Llama 7B model you ran seems to consume around 50% of the 16 GB of memory.
@АнтонКоролёв-о1н 2 months ago
Can you show how to serve it on a local network?
@timmturner 2 months ago
It is running locally
@АнтонКоролёв-о1н 2 months ago
@timmturner And that is the problem.
@LivingLinux 2 months ago
I'm not much of a system administrator. You can probably expose it through a web server, but that's not something I have ever done before, and it's not high on my to-do list. Perhaps someone can do it easily through Pinokio AI, as I see that option very often with packages installed through Pinokio.
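As a purely hypothetical sketch (the hostname and port below are assumptions, not something shown in the video): if the model ever sits behind a local web interface on the board, SSH port forwarding is one simple way to reach it from another machine on the same network.

    # Run on the other machine: forward its local port 8080 to port 8080 on the
    # board (hostname and port are placeholders), then browse to http://localhost:8080.
    ssh -N -L 8080:localhost:8080 user@rock-5b.local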
@timmturner 2 months ago
@АнтонКоролёв-о1н Yeah, I misunderstood your question, sorry about that.
@ps3301 5 months ago
It is so slow. It might as well be useless.
@LivingLinux 5 months ago
It's not fast, but it is energy-efficient.