Running YOLO (YOLOv8, YOLOv5, YOLOv6, YOLOX, PP-YOLO) on Rockchip NPUs (RK3566, RK3568, RK3588, RK3576)

9,107 views

Anton Maltsev

1 day ago

My LinkedIn - / maltsevanton
My Twitter - / serious_wk
My Telegram channel - t.me/CVML_team
Update after RKNN-Toolkit 1.6.0 - • Detection, classificat...
00:00:00 - Intro
00:01:46 - What system should I use?
00:03:38 - Logic of the pipeline
00:04:50 - First step. Set up the system.
00:11:19 - Second step. Set up RKNN YOLOv8 export on a host machine.
00:13:49 - Third step. Set up Rknn-Toolkit on a host machine. Export net in RKNN format.
00:20:01 - Fourth step. Run the network on the RockChip device.
Code Snippet - gist.github.com/ZlodeiBaal/84...

Comments: 53
@AntonMaltsev 6 months ago
IMPORTANT!! RockChip completely updated the structure of RKNN-Toolkit and ModelZoo starting from 1.6.0. Here is the video with the update - kzbin.info/www/bejne/jJvQn3tvZpWZl8U
@user-tr3bo2kq3h 8 months ago
I followed your steps and it works well on the Orange Pi 5 Plus board. Thanks a lot.
@wolpumba4099 5 months ago
*Intro*
- 0:01: The speaker discusses the increasing popularity of Rockchip in the computer vision world.
- 0:15: Acknowledges the popularity of Rockchip and mentions addressing two common problems in this video.
- 0:23: Highlights changes and updates in the Rockchip ecosystem over the years.
- 0:47: Promises an up-to-date guide as of October 2023 for running different YOLO models on Rockchip NPUs.

*System Setup and Issues*
- 1:47: Introduces the system used, an RK3568 board, and mentions testing on a different board.
- 2:57: Discusses potential problems like driver installation, using sudo commands, and NPU detection issues on specific boards.
- 3:24: Acknowledges the guide may not work perfectly for all boards and recommends adjusting it for specific cases.

*Logic of YOLO Network Setup*
- 3:38: Introduces the logic of preparing a YOLO network for Rockchip NPUs.
- 3:46: Emphasizes the need to understand the process: preparing, exporting, and running the network on the board.

*Step 1: Set Up the System*
- 4:49: Describes the first step, which involves an SSH connection and using a script to set up the system on an empty board.
- 5:12: Advises modifying parameters in the script based on the specific board and system.
- 7:17: Stresses the importance of updating Python versions, with a focus on Python 3.7 in this case.

*Step 2: Set Up RKNN YOLOv8 Export*
- 11:25: Introduces the second step, setting up the YOLOv8 wrapper from Rockchip for exporting the model.
- 13:33: Describes the process of setting up the YOLOv8 repository, creating a new environment, and running the export script.

*Step 3: Set Up Rknn-Toolkit*
- 13:54: Introduces the third step, setting up Rknn-Toolkit on an x86 machine for exporting the model in RKNN format.
- 15:16: Highlights the complexity of the environment setup, particularly compatibility with specific Python versions and GCC.

*Step 4: Run Network on RockChip*
- 20:08: Details the fourth step, running the network on the Rockchip device after successfully exporting the model.
- 20:31: Discusses modifications needed in the code to ensure compatibility with Rockchip and the successful execution of the network.

*Modifications to YOLOv8 Code*
- 25:09: Discusses reshaping in the code and the use of a specific function.
- 25:22: Mentions making modifications in the future and moves to the next point.
- 25:40: Talks about additional adjustments in the code.
- 25:49: Addresses a specific issue in the code, possibly an error.
- 25:56: Indicates a need for fixing something in the future.

*Translation and Code Modification*
- 26:03: Introduces the task of translating the code to English.
- 26:10: Begins the translation process and addresses the first modification.
- 26:16: Mentions being at a specific point in the code.
- 26:23: Discusses running a particular section of the code.
- 26:29: Highlights the need for translation to English.
- 26:38: Mentions a lot of modifications that need to be made.
- 26:46: Addresses a potential error related to the model path.
- 26:56: Refers to specifying the target in the official documentation.
- 27:02: Talks about specifying the target for the system and discusses the use of default values.
- 27:11: Emphasizes the importance of specifying certain parameters.

*Testing and Additional Code Modifications*
- 27:17: Discusses the need to specify a particular ID.
- 27:25: Mentions using default data and displaying results.
- 27:33: Verifies that necessary modifications have been saved.
- 27:44: Initiates a test of the code.
- 27:50: Discovers a potential issue and considers removing torch.
- 28:00: Realizes that torch has been removed and addresses the issue.
- 28:10: Corrects the model path error.
- 28:26: Specifies the correct path for the exported model.
- 28:41: Continues with additional modifications.
- 28:52: Makes another adjustment in the code.
- 28:58: Initiates further changes.
- 29:23: Refers to an application and confirms everything is working.
- 29:32: Describes the image display through the interface.
- 29:39: Acknowledges the functionality of the code.
- 29:46: Comments on the legitimacy of the results.
- 29:52: Addresses the potential saving location of results.
- 29:59: Considers the need to specify a specific folder for saving.
- 30:07: Expresses uncertainty about the saving process.
- 30:13: Acknowledges the current functionality without modifications.
- 30:19: Mentions the necessity of adjustments for better performance.
- 30:26: Confirms the successful conversion of the code.
- 30:33: Discusses final adjustments needed at runtime without parameters.
- 30:40: Indicates the conclusion of the code-related discussion.

Disclaimer: I used ChatGPT 3.5 to summarize the video transcript. This method may make mistakes in recognizing words.
@EdjeElectronics 5 months ago
Great work man! Keep it up!
@AntonMaltsev 5 months ago
Appreciate it! Thank you!
@StasGT 2 months ago
Nice work!
@StasGT 2 months ago
P.S. I want to buy an Orange Pi 5 Pro; it has LPDDR5, so it should be a bit faster. As I understand it, the Rockchip NPU supports attention layers. That's cool, since Google Coral can't do that. Although... it can do brute-force matrix multiplication, so writing a transformer class shouldn't be hard. Hm... why hasn't anyone done this yet...? I'll get to it in the next few days and run DETR on the Coral.
@AntonMaltsev 1 month ago
The Coral is very old. And there are a lot of architectural differences...
@user-ew1hl5my8g 7 months ago
Hi Anton. Do you have the same video for the Jetson Nano / Ubuntu 20.04 with a YOLO detector?
@stelioskoroneos3872 8 months ago
Thanks for the video. Do you have any idea how the RockChip NPU would perform on YOLO compared to an RPi + Google Coral and/or an Nvidia Nano?
@AntonMaltsev 8 months ago
For YOLOv5, a few of my tests are here - medium.com/@zlodeibaal/choosing-computer-vision-board-in-2022-b27eb4ca7a7c
@LJC-zl4ye 6 months ago
The rknn_model_zoo library was recently updated with yolov8-seg content, but when I tested it, I found that its post_process function takes a long time, and the effect of modifying the dfl function according to the video's method is not obvious. How should the post_process function be modified to improve the calculation speed? Here is the content of the post_process function, thanks.

```python
# (np, torch, torchvision, F, and the helper functions come from the
# surrounding rknn_model_zoo script.)
def post_process(input_data):
    # input_data[0], input_data[4], and input_data[8] are detection box information
    # input_data[1], input_data[5], and input_data[9] are category score information
    # input_data[2], input_data[6], and input_data[10] are confidence score information
    # input_data[3], input_data[7], and input_data[11] are segmentation information
    # input_data[12] is the proto information
    proto = input_data[-1]
    boxes, scores, classes_conf, seg_part = [], [], [], []
    defualt_branch = 3
    pair_per_branch = len(input_data) // defualt_branch
    for i in range(defualt_branch):
        boxes.append(box_process(input_data[pair_per_branch * i]))
        classes_conf.append(input_data[pair_per_branch * i + 1])
        scores.append(np.ones_like(input_data[pair_per_branch * i + 1][:, :1, :, :],
                                   dtype=np.float32))
        seg_part.append(input_data[pair_per_branch * i + 3])

    def sp_flatten(_in):
        ch = _in.shape[1]
        _in = _in.transpose(0, 2, 3, 1)
        return _in.reshape(-1, ch)

    boxes = [sp_flatten(_v) for _v in boxes]
    classes_conf = [sp_flatten(_v) for _v in classes_conf]
    scores = [sp_flatten(_v) for _v in scores]
    seg_part = [sp_flatten(_v) for _v in seg_part]

    boxes = np.concatenate(boxes)
    classes_conf = np.concatenate(classes_conf)
    scores = np.concatenate(scores)
    seg_part = np.concatenate(seg_part)

    # filter according to threshold
    boxes, classes, scores, seg_part = filter_boxes(boxes, scores, classes_conf, seg_part)

    zipped = zip(boxes, classes, scores, seg_part)
    sort_zipped = sorted(zipped, key=lambda x: (x[2]), reverse=True)
    result = zip(*sort_zipped)

    max_nms = 30000
    n = boxes.shape[0]  # number of boxes
    if not n:
        return None, None, None, None
    elif n > max_nms:  # excess boxes
        boxes, classes, scores, seg_part = [np.array(x[:max_nms]) for x in result]
    else:
        boxes, classes, scores, seg_part = [np.array(x) for x in result]

    nboxes, nclasses, nscores, nseg_part = [], [], [], []
    agnostic = 0
    max_wh = 7680
    c = classes * (0 if agnostic else max_wh)
    ids = torchvision.ops.nms(
        torch.tensor(boxes, dtype=torch.float32) +
        torch.tensor(c, dtype=torch.float32).unsqueeze(-1),
        torch.tensor(scores, dtype=torch.float32),
        NMS_THRESH)
    real_keeps = ids.tolist()[:MAX_DETECT]
    nboxes.append(boxes[real_keeps])
    nclasses.append(classes[real_keeps])
    nscores.append(scores[real_keeps])
    nseg_part.append(seg_part[real_keeps])

    if not nclasses and not nscores:
        return None, None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)
    seg_part = np.concatenate(nseg_part)

    ph, pw = proto.shape[-2:]
    proto = proto.reshape(seg_part.shape[-1], -1)
    seg_img = np.matmul(seg_part, proto)
    seg_img = sigmoid(seg_img)
    seg_img = seg_img.reshape(-1, ph, pw)

    seg_threadhold = 0.5

    # crop seg outside box
    seg_img = F.interpolate(torch.tensor(seg_img)[None], torch.Size([640, 640]),
                            mode='bilinear', align_corners=False)[0]
    seg_img_t = _crop_mask(seg_img, torch.tensor(boxes))
    seg_img = seg_img_t.numpy()
    seg_img = seg_img > seg_threadhold
    return boxes, classes, scores, seg_img
```
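For context on the "dfl" function discussed here: YOLOv8's head predicts each box edge as a distribution over 16 bins (assuming the standard reg_max = 16), and the DFL step decodes it as the softmax expectation over those bins. A minimal pure-Python sketch of that decoding for a single edge (illustrative only; the model-zoo code does this over whole tensors, which is why it is costly on the CPU):

```python
import math

def dfl_decode(logits):
    """Decode one box edge from a 16-bin DFL head:
    softmax over the bins, then the expected bin index."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return sum(i * p for i, p in enumerate(probs))

# A uniform distribution decodes to the mean bin index (7.5);
# a distribution peaked at bin 4 decodes to a distance near 4.
uniform = dfl_decode([0.0] * 16)
peaked = dfl_decode([0, 0, 0, 0, 10.0] + [0] * 11)
```

The expectation is differentiable and cheap per edge; the cost in the model-zoo script comes from doing it over tens of thousands of anchors per frame in Python/NumPy.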
@AaquibNiyama-k1s 1 day ago
Which camera can we use with the Orange Pi 5B? Can we use the RPi camera module?
@adityasaiakella1287 7 months ago
Does it work on the Rockchip RV1103?
@gabrielnilo6101 8 months ago
What do you recommend for computer vision (CV) with a $50 limit? And with a $150 limit?
@AntonMaltsev 8 months ago
It depends on the task and skills. For $50: for someone who is not familiar with embedded and doesn't need high performance, I can recommend the old RPi - thepihut.com/products/raspberry-pi-3-model-a-plus. A lot of guides and a small number of bugs, but pretty slow. For someone familiar with embedded and neural networks, the RockChip 3568 looks nice. There were ~$35 boards a year ago; I don't know if they are still in production. Also, there are a lot of Arduino-like boards. For $150 - a lot of boards are available. You should consider the resources and the task.
@AntonMaltsev 6 months ago
IMPORTANT!!! The video covers rknn-toolkit2 version 1.5.2 and below. They completely updated the structure starting from 1.6.0. I hope to record a new video soon.
@upsangelhk 6 months ago
Thank you for the heads-up, definitely looking forward to that.
@gundanium 8 months ago
Have you been able to get YOLO to run on the RockChip with C++ rather than through Python?
@AntonMaltsev 8 months ago
The model obtained with this approach should run under C++. The only problem is that you need to rewrite NMS and all the other post-processing functions on the model output.
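For reference, the NMS that has to be re-implemented is a greedy loop over score-sorted boxes; a minimal pure-Python sketch (boxes as [x1, y1, x2, y2]; function names are illustrative, and a C++ port follows the same logic):

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

For class-aware NMS (as in the model-zoo script above), offset each box by `class_id * max_wh` first so boxes of different classes never overlap.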
@ocamlmail 7 months ago
Thanks for the video, as usual. But there is no sound, and probably won't be. I mean, it's very quiet.
@benx1326 9 months ago
What is the performance of the RK3588 on YOLOv5s and YOLOv8n?
@AntonMaltsev 9 months ago
Didn't test it, sorry. Also, when you ask such questions, you need to specify the image size: 224x224 vs 640x640 gives you almost a 10x difference in speed. You can check my results for the RK3568 here: medium.com/@zlodeibaal/choosing-computer-vision-board-in-2022-b27eb4ca7a7c kzbin.info/www/bejne/iZLHnqpsh9edZ7s
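The "almost 10x" follows from inference cost scaling roughly with the number of input pixels (assuming the same network is simply run at both resolutions):

```python
# Conv-net inference cost scales roughly with the input pixel count
# (assumption: identical architecture, only the resolution changes).
small = 224 * 224
large = 640 * 640
ratio = large / small
print(f"640x640 has {ratio:.1f}x more pixels than 224x224")
```

In practice memory layout and NPU tiling push the real ratio around a bit, which is why measured speedups land "almost" rather than exactly at the pixel ratio.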
@mrriverhe4768 6 months ago
Can you provide a blog link?
@ARMENIA181 9 months ago
If the model is trained for 640, is it possible to run a 1280x736 camera? Thanks.
@AntonMaltsev 9 months ago
There are a few approaches to this: 1) Resize the input to 640. 2) Export the model at 1280x736. However, best practice is to use the model with the same parameters you used when training it.
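The first approach is usually a letterbox resize: scale the frame to fit the 640x640 input while keeping the aspect ratio, then pad the rest. A small sketch of just the geometry (pure Python; the function name is illustrative):

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Compute the scale and padding that fit a src_w x src_h frame
    into a dst_w x dst_h network input while preserving aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) // 2  # left/right padding
    pad_y = (dst_h - new_h) // 2  # top/bottom padding
    return scale, new_w, new_h, pad_x, pad_y

# A 1280x736 camera frame scaled into a 640x640 input:
geom = letterbox_geometry(1280, 736)
```

Detections must then be mapped back by subtracting the padding and dividing by the scale (the model-zoo helper `co_helper.get_real_box` quoted in another comment does exactly this).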
@halilozcan8 9 months ago
What would the performance be in real time?
@AntonMaltsev 9 months ago
I tested a few networks a year ago. All results are in my previous video or in the Medium article: medium.com/@zlodeibaal/choosing-computer-vision-board-in-2022-b27eb4ca7a7c kzbin.info/www/bejne/iZLHnqpsh9edZ7s
@LJC-zl4ye 7 months ago
Is there any way to increase the inference speed of RKNN models in Python? Currently NPU usage is a bit low; how can we increase it?
@AntonMaltsev 7 months ago
I did not experiment with this. There are some samples with multithreading - github.com/leafqycc/rknn-multi-threaded (Python), github.com/leafqycc/rknn-cpp-Multithreading (C++), and github.com/thanhtantran/rknn-multi-threaded-3588 (C++). Maybe a few parallel processes can speed this up.
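The key pattern in those multithreaded samples is keeping several frames in flight instead of waiting for each result right after submitting it. A minimal sketch with a stub `infer` function (on a real board each worker would hold its own RKNN runtime context; all names here are illustrative):

```python
import concurrent.futures

def infer(frame):
    # Stub standing in for model.run([frame]) on one NPU core.
    return frame * 2

def run_pipelined(frames, workers=3, depth=6):
    """Keep up to `depth` inferences in flight so workers stay busy,
    and yield results in frame order."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        pending = []
        for frame in frames:
            pending.append(pool.submit(infer, frame))
            if len(pending) >= depth:
                results.append(pending.pop(0).result())  # oldest first
        for fut in pending:  # drain the tail
            results.append(fut.result())
    return results
```

Note that submitting a frame and immediately calling `future.result()` on it (as in the threaded snippet quoted further down this thread) serializes the loop and gives no speedup; the queue of pending futures is what lets multiple inferences overlap.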
@LJC-zl4ye 7 months ago
I have tried using multiprocessing in the following way, with the code changed from the yolo_map_test_rknn.py file of rknn_model_zoo, but found no change in speed.

```python
import concurrent.futures

def process_frame(frame):
    img_src = frame
    img = co_helper.letter_box(im=img_src.copy(),
                               new_shape=(IMG_SIZE[1], IMG_SIZE[0]),
                               pad_color=(0, 0, 0))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    input_data = img
    outputs = model.run([input_data])
    boxes, classes, scores = post_process(outputs, anchors, args)
    img_p = img_src.copy()
    if boxes is not None:
        draw(img_p, co_helper.get_real_box(boxes), scores, classes)
    return img_p

cap = cv2.VideoCapture(0)
pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    future = pool.submit(process_frame, frame)
    cv2.imshow("full post process result", future.result())
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
pool.shutdown(wait=True)
cap.release()
cv2.destroyAllWindows()
```
@@AntonMaltsev
@AntonMaltsev 7 months ago
@@LJC-zl4ye, you may be limited by camera speed here: "cap = cv2.VideoCapture(0)". Just try inference on a fixed image, something like "frame = np.zeros((480, 640, 3), np.uint8)". Your camera may have a 30/60 FPS limit.
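Anton's suggestion amounts to benchmarking inference alone, with no camera read or display calls inside the timed loop. A minimal sketch with a stub `infer` (replace the stub with the actual model.run / RKNN inference call; the names are illustrative):

```python
import time

def infer(frame):
    # Stub standing in for model.run([frame]); replace with real inference.
    return len(frame)

def measure_fps(n_frames=100):
    # Fixed dummy frame: no cap.read(), no imshow, no waitKey in the loop.
    frame = bytes(480 * 640 * 3)
    start = time.perf_counter()
    for _ in range(n_frames):
        infer(frame)
    elapsed = time.perf_counter() - start
    return n_frames / max(elapsed, 1e-9)

fps = measure_fps()
```

If this loop still tops out at ~30 FPS, the bottleneck is inference or post-processing; if it runs much faster, the camera or the display loop was the limit.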
@LJC-zl4ye 7 months ago
I've tried this, but the average FPS is still only 30. @@AntonMaltsev
@AntonMaltsev 7 months ago
@@LJC-zl4ye Did you remove "ret, frame = cap.read()", "cv2.imshow", "cv2.waitKey(1)", etc. from the loop?
@armenkalaidjian4494 5 months ago
Anton, will all of this work on the radxa zero 3w?
@AntonMaltsev 5 months ago
In general, it should, especially if it's not the 1GB version. As for speed, I don't know. It seems a bit slower than the 3568.
@armenkalaidjian4494 5 months ago
@@AntonMaltsev My board arrives at the end of February, the 4GB version. I'll test the speed. How many frames per second did you manage to squeeze out with YOLOv8s?
@simpleded5454 9 months ago
Good afternoon, Anton! I've been watching your channel for a long time, but for the last year I've suffered because your videos come out in English. The audience hasn't grown much, and as far as I understand most viewers are still Russian, so I don't see much point in releasing videos in this format. For foreign listeners it would be easier to publish an article on Medium, since listening to broken English is not very pleasant. Sorry for the negativity, just a small cry from the heart. P.S. If you want to learn English, it's better to hire a tutor and sign up for group lessons; the process will go faster.
@AntonMaltsev 9 months ago
1) The audience has grown a lot. 2) Foreign clients have appeared. 3) I've met a bunch of CEOs of different companies. I consider the experiment a success and will continue :)
@AntonMaltsev 9 months ago
telegra.ph/Neskolko-slov-pro-blog-i-vokrug-09-16 - statistics from a year ago. Things are much better now, plus I've learned to work with this a bit more carefully.
@simpleded5454 9 months ago
@@AntonMaltsev If it really benefits you in this way, then I'm happy for you :) But then we'll be waiting for your language skills to level up 😏😏😏
@igormotskin 8 months ago
If you have unique content and people need this information, they will watch it in Chinese or in Russian. Again, if you can solve a CEO's problem, they will come to you regardless of your level of English. But the language definitely needs improving. Substantially. In any case, we wish you well and creative success 😇
@mrriverhe4768 6 months ago
Presenting it with text and images would be better.
@cvabds 4 months ago
Subscribe
@okay730 5 months ago
Should I change the classes in the yolo_map_test.py file?
@robotmovil 3 months ago
When I try to run the code (or other NPU-related code) I get:

E RKNN: [22:57:09.806] failed to open rknpu module, need to insmod rknpu dirver!
E RKNN: [22:57:09.806] failed to open rknn device!
E Catch exception when init runtime!

This is on an Orange Pi 5, with Ubuntu 22.04 downloaded from the OrangePi web site.
@AntonMaltsev 3 months ago
I don't have any idea about this. It seems the NPU or its drivers are not installed correctly. Are you fully on RKNN version 1.6?