If the PPU is not used to accelerate vsi_nn_Float32ToDtype(), the following CPU-based method can be used to speed up the conversion instead.

The VSI-generated code (for example, yolov5s_uint8_nbg_unify) needs a few modifications for the target.

This document applies to both the uint8 and int16 formats.

Steps

  1. Declare vnn_PreTableInit in vnn_pre_process.h.

vnn_pre_process_h_1.jpg
  2. Add vnn_PreTableInit and a uint8-to-dtype table named u2d in vnn_pre_process.c. This table serves as the preprocessing lookup table.

vnn_pre_process_c_1.jpg
vnn_pre_process_c_3.jpg
  3. Modify the original _float32_to_dtype() to use a table lookup instead of calling the VSI API directly.

vnn_pre_process_c_2.jpg
  4. Create the u2d table in main.c before opening the image.

main_c_1.jpg
  5. Add the OpenMP option to the Makefile.

image-20240425-024252.png
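
Because the actual changes are only shown in the screenshots, the idea behind the steps above can be illustrated with a minimal sketch. It is not the code from the screenshots. It assumes an asymmetric affine uint8 tensor, a single mean/scale pair for all channels, and round-half-away-from-zero rounding. The names quant_params_sketch_t, pre_table_init_sketch and convert_with_table_sketch are hypothetical stand-ins, and quantize_u8_sketch takes the place of the per-element vsi_nn_Float32ToDtype() call. Since an 8-bit pixel can take only 256 values, the conversion result for each value is computed once and then reused for every pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters; the real ones come from the generated model code. */
typedef struct {
    float   mean;        /* subtracted from the raw pixel value */
    float   norm_scale;  /* applied after mean subtraction */
    float   q_scale;     /* quantization scale of the input tensor */
    int32_t zero_point;  /* quantization zero point of the input tensor */
} quant_params_sketch_t;

/* Lookup table: raw uint8 pixel -> quantized tensor value. */
static uint8_t u2d[256];

/* Stand-in for the per-element float-to-dtype conversion (assumed rounding mode). */
static uint8_t quantize_u8_sketch(float v, const quant_params_sketch_t *p)
{
    float   x = v / p->q_scale;
    int32_t q = (int32_t)(x >= 0.0f ? x + 0.5f : x - 0.5f) + p->zero_point;
    if (q < 0)   q = 0;    /* saturate to the uint8 range */
    if (q > 255) q = 255;
    return (uint8_t)q;
}

/* Build the table once, before any image is converted. */
void pre_table_init_sketch(const quant_params_sketch_t *p)
{
    for (int i = 0; i < 256; i++) {
        float v = ((float)i - p->mean) * p->norm_scale;
        u2d[i] = quantize_u8_sketch(v, p);
    }
}

/* Convert n pixels by table lookup; the loop is split across threads
 * when the file is compiled with OpenMP enabled (e.g. -fopenmp on GCC). */
void convert_with_table_sketch(const uint8_t *src, uint8_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        dst[i] = u2d[src[i]];
    }
}
```

For the int16 format the same pattern would apply with an int16_t table and the corresponding quantization parameters. Without the OpenMP compiler option the pragma is ignored and the loop simply runs on one thread.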

Test result

The following figures show the performance measured after applying the modifications described in the steps above.

uint8

Creating the table took 0.05 ms, and converting the input by table lookup took 4.70 ms.

image-20240424-084000.png

int16

Creating the table took 0.05 ms, and converting the input by table lookup took 5.09 ms.

image-20240424-084032.png
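
The page does not say how these timings were taken. One way to collect comparable numbers on a POSIX target is sketched below, assuming clock_gettime() with CLOCK_MONOTONIC is available; elapsed_ms and time_call_ms are hypothetical helper names. A wall-clock timer is used because CPU-time counters such as clock() add up the time of all OpenMP threads and would overstate the duration of a parallel loop.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdio.h>
#include <time.h>

/* Milliseconds elapsed between two monotonic timestamps. */
static double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1000.0
         + (double)(end->tv_nsec - start->tv_nsec) / 1.0e6;
}

/* Hypothetical helper: run fn once, print and return its wall-clock time. */
static double time_call_ms(const char *label, void (*fn)(void))
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ms = elapsed_ms(&t0, &t1);
    printf("%s: %.2f ms\n", label, ms);
    return ms;
}
```

Wrapping the table creation and the lookup conversion in separate calls to such a helper would give one figure for each phase.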