Tips for converting models using the Acuity Toolkit
Quantization
When using the Acuity Toolkit to quantize a model, the default quantizer is "asymmetric affine", the default qtype is "uint8", and the default algorithm for determining the quantization parameters is "Min-Max". For models with significant outliers in their per-layer weight distributions, quantizing with "Min-Max" at 8-bit precision is likely to result in significant accuracy loss. For detailed information, please refer to Accuracy Degradation After Quantization.
When the accuracy loss is unacceptable, it is recommended to quantize the model with int16 precision or to use an algorithm other than "Min-Max".
The chart below shows the histogram of activations for a particular convolutional layer in RTMDet-s, using images from the COCO dataset as input. The red dashed line represents the clip range determined by the "Min-Max" algorithm, while the green one represents the clip range determined by the "Entropy (KL-Divergence)" algorithm. The "Entropy" algorithm selects the clip range by minimizing the information loss between the original and the quantized distributions. Intuitively, this should lead to better quantization and therefore less accuracy loss.
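The effect of the clip range on 8-bit quantization can be illustrated with a small NumPy sketch. This is a standalone illustration, not the Acuity implementation; the synthetic data and the percentile-based clip are assumptions made only for the example:
import numpy as np

# Synthetic activations: mostly small values plus a few large outliers
np.random.seed(0)
acts = np.concatenate([np.random.normal(0.0, 1.0, 10000),
                       np.array([40.0, -35.0, 50.0])])

def quantize_dequantize(x, lo, hi, bits=8):
    # Asymmetric affine quantization to the clip range [lo, hi]
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo

# Min-Max uses the full range, so the outliers stretch the quantization step
minmax = quantize_dequantize(acts, acts.min(), acts.max())
# A narrower clip range (a stand-in for what Entropy/KL-Divergence would choose)
clipped = quantize_dequantize(acts, np.percentile(acts, 0.1), np.percentile(acts, 99.9))

# Compare the reconstruction error on the bulk of the distribution
bulk = np.abs(acts) < 5
print("Min-Max clip   MSE:", np.mean((acts[bulk] - minmax[bulk]) ** 2))
print("Narrower clip  MSE:", np.mean((acts[bulk] - clipped[bulk]) ** 2))
With the full Min-Max range the quantization step is dominated by the outliers, so the error on typical activations is much larger than with the narrower clip range.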
For example, when quantizing RTMDet-s to 8-bit precision, most of the output bounding boxes become incorrect and hardly usable. In contrast, quantizing to 16-bit precision results in much more tolerable accuracy loss.
In addition, hybrid int16/int8 quantization is possible and can also reduce accuracy loss. However, the Acuity Toolkit does not currently support exporting a hybrid NBG.
Dynamic Shape
The model imported into Acuity must not contain dynamic shapes. In other words, given the shape of the input tensor, the shapes of all other tensors should be inferable from the operator information alone, without executing the graph.
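A quick way to check for dynamic shapes is to run ONNX shape inference and look for dimensions that carry a symbolic name (dim_param) instead of a fixed value. Below is a minimal sketch; the model file name is only a placeholder:
import onnx
from onnx import shape_inference

model = shape_inference.infer_shapes(onnx.load("model.onnx"))

def print_shape(value_info):
    # A dimension with dim_param (a symbolic name) instead of dim_value is dynamic
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in value_info.type.tensor_type.shape.dim]
    print(value_info.name, dims)

for tensor in list(model.graph.input) + list(model.graph.output):
    print_shape(tensor)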
Input dynamic shape
In some ONNX models, the input shape is float32 [p2o.DynamicDimension.0,2,50,17,1], which contains an undefined dimension. Based on an analysis of the model, this dimension needs to be fixed to a static numerical value.
Output dynamic shape
Similarly, an output shape such as [p2o.DynamicDimension.1,p2o.DynamicDimension.2] contains undefined dimensions that need to be fixed to static numerical values.
Non-Maximum Suppression (NMS) is very common in object detection tasks because the candidate bounding boxes need to be filtered down. Only after executing NMS can we determine how many bounding boxes remain, so the shape of the output it produces is a dynamic shape.
A common practice is to remove the post-processing part from the object detection model before importing it to Acuity.
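If the post-processing (including NMS) sits at the tail of the graph, one way to cut it off is onnx.utils.extract_model, which extracts the sub-graph between the given input and output tensor names. The file and tensor names below are hypothetical placeholders; look up the actual names in your model first (for example, with a graph viewer or the ONNX API):
from onnx.utils import extract_model

# Keep everything up to the raw detection head and drop the NMS post-processing.
# "images", "raw_boxes" and "raw_scores" are hypothetical tensor names.
extract_model(
    "detector_with_nms.onnx",
    "detector_without_postprocess.onnx",
    input_names=["images"],
    output_names=["raw_boxes", "raw_scores"],
)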
Modify the Model File
The Acuity Toolkit does not support operators with an unknown input size or dynamic shape tensors in ONNX model files. If this situation occurs, the ONNX model needs to be modified.
There are two ways:
Using the ONNX Python API to modify it with code.
Using the third-party tool ONNX Modifier.
The ONNX API
For example, the properties of the STGCN.onnx model show dynamic input and output dimensions that we need to fix.
Here is the sample code:
import onnx
from onnx import shape_inference

# Load the ONNX model
model = onnx.load("STGCN.onnx")

# Print the names of the input and output being modified
print("input:", model.graph.input[0].name)
print("output:", model.graph.output[0].name)

# Fix the dynamic dimensions of the input and output to static values
model.graph.input[0].type.tensor_type.shape.dim[0].dim_value = 1
model.graph.output[0].type.tensor_type.shape.dim[0].dim_value = 1
model.graph.output[0].type.tensor_type.shape.dim[1].dim_value = 2

# Re-run shape inference so the intermediate tensor shapes are updated
inferred_model = shape_inference.infer_shapes(model)

# Save the modified model
onnx.save(inferred_model, "modified_STGCN.onnx")
After saving the modified model, you can check that its input and output shapes are now static.
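For example, the shapes can be printed back from the saved file (a small verification sketch):
import onnx

model = onnx.load("modified_STGCN.onnx")
for tensor in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)  # every dimension should now be an integer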
The ONNX Modifier Tool
We can also modify the model with the third-party tool ONNX Modifier (https://github.com/ZhangGe6/onnx-modifier).
It can add and delete nodes, rename node or model inputs and outputs, add new model inputs and outputs, and edit the attributes of nodes and model initializers.