Common techniques for embedding data preprocessing into AI models

Author: Zhan Pengzhou, Intel IoT Industry Innovation Ambassador

This article introduces common techniques for embedding data preprocessing into AI models with the OpenVINO™ model optimizer or the preprocessing API, helping readers further improve the end-to-end performance of AI inference programs. All sample programs in this article are open source at https://gitee.com/ppov-nuc/resnet_ov_ppp.git and were tested on an AI developer kit based on a 12th-generation Intel® Core™ processor.

Taking YOLOv5 model conversion as an example, the model optimizer command:

mo --input_model yolov5s.onnx --data_type FP16

converts the yolov5s.onnx model to an IR-format model and converts the model precision from FP32 to FP16; this is the most common way to use the model optimizer. When writing an AI inference program against the above IR model, since the numerical precision and shape of the image data differ from those required by the model input node, the data must be preprocessed before being fed into the model.

Taking the YOLOv5 model as an example, use the zidane.jpg image that ships with the YOLOv5 code repository and print the numerical precision and shape of the image, as well as the numerical precision and shape of the model input node. The comparison is shown in Figure 1-1.

Figure 1-1 Image read by OpenCV vs. model input node
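A minimal sketch that reproduces this comparison, assuming zidane.jpg and the converted yolov5s.xml IR model are in the working directory:

import cv2
from openvino.runtime import Core

# Image as read by OpenCV: HWC, BGR, uint8
img = cv2.imread("zidane.jpg")
print(f"image shape: {img.shape}, dtype: {img.dtype}")  # e.g. (720, 1280, 3), uint8

# Model input node: NCHW, FP32
core = Core()
model = core.read_model("yolov5s.xml")
input_node = model.inputs[0]
print(f"input shape: {input_node.get_shape()}, type: {input_node.get_element_type()}")  # [1,3,640,640], f32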

As can be seen from the above figure, the image data read by OpenCV's imread() function differs from the requirements of the model input node in data shape, numerical precision, value range, color channel order, and data layout, as shown in the following table:

|                           | Image data read by OpenCV | YOLOv5 model input node |
|---------------------------|---------------------------|-------------------------|
| Data shape (Shape)        | [720, 1280, 3]            | [1, 3, 640, 640]        |
| Numeric precision (dtype) | UINT8                     | FP32                    |
| Value range               | 0 - 255                   | 0.0 - 1.0               |
| Color channel order       | BGR                       | RGB                     |
| Data layout (Layout)      | HWC                       | NCHW                    |

Due to the above differences, the data must be preprocessed before being passed to the model so that it meets the requirements of the model input node. Data preprocessing can be implemented in the inference code itself (see the sketch below), with the model optimizer, or with the OpenVINO™ preprocessing API; the latter two approaches are introduced in detail in this article.
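For reference, preprocessing implemented directly in the inference code typically looks like the following sketch (a hand-written baseline, assuming a plain resize rather than YOLOv5's letterbox):

import cv2
import numpy as np

img = cv2.imread("zidane.jpg")              # HWC, BGR, uint8, 0-255
img = cv2.resize(img, (640, 640))           # match the model's spatial size
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR -> RGB
blob = img.astype(np.float32) / 255.0       # uint8 0-255 -> FP32 0.0-1.0
blob = np.transpose(blob, (2, 0, 1))        # HWC -> CHW
blob = np.expand_dims(blob, 0)              # CHW -> NCHW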

1.1 Data preprocessing with model optimizer

1.1.1 Model optimizer preprocessing parameters

The model optimizer can embed preprocessing operations such as color channel reordering and image data normalization into the model (see "Interpretation of key points of OpenVINO™ model conversion technology") by specifying the following parameters:

  • --mean_values: mean_values is subtracted from all input data, i.e. input - mean_values
  • --scale_values: all input data is divided by scale_values; when both mean_values and scale_values are specified, the model optimizer computes (input - mean_values) ÷ scale_values
  • --reverse_input_channels: reverses the input channel order from RGB to BGR (or vice versa)

When the above three operations are specified at the same time, the preprocessing sequence is:

Input data → reverse_input_channels → mean_values → scale_values → original model
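As a worked example of this sequence, using for illustration the ImageNet normalization constants introduced in section 1.1.2 below:

import numpy as np

mean_values = np.array([123.675, 116.28, 103.53], dtype=np.float32)
scale_values = np.array([58.395, 57.12, 57.375], dtype=np.float32)

bgr_pixel = np.array([56, 87, 255], dtype=np.float32)  # one pixel as read by OpenCV (B, G, R)
rgb_pixel = bgr_pixel[::-1]                            # reverse_input_channels: BGR -> RGB
normalized = (rgb_pixel - mean_values) / scale_values  # subtract mean_values, then divide by scale_values
print(normalized)                                      # approx. [2.249, -0.513, -0.828]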

When converting the model, assuming the inference program uses the OpenCV library to read images, you can add the three parameters mean_values, scale_values, and reverse_input_channels to the model optimizer command, embedding the color channel reordering and image normalization operations into the model. If the inference program reads images with a non-OpenCV library, e.g. PIL.Image, there is no need to add the --reverse_input_channels parameter.
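The following sketch illustrates why this flag depends on the image-reading library: OpenCV returns BGR while PIL.Image returns RGB (assuming zidane.jpg is a standard three-channel image):

import cv2
import numpy as np
from PIL import Image

bgr = cv2.imread("zidane.jpg")              # OpenCV reads in BGR channel order
rgb = np.asarray(Image.open("zidane.jpg"))  # PIL reads in RGB channel order
print(np.array_equal(bgr[..., ::-1], rgb))  # True: same pixels, reversed channels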

The following section takes the ResNet model as an example to show the complete process of using the model optimizer to embed preprocessing into the model.

1.1.2 Embedding the preprocessing of the ResNet model into the model

ResNet is not only the champion of the 2015 ILSVRC competition, but also a convolutional neural network commonly used in industrial practice. PyTorch integrates ResNet into torchvision; the complete code for converting the PyTorch-format ResNet model to ONNX format is as follows:

from torchvision.models import resnet50, ResNet50_Weights
import torch

# https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet50.html
weights = ResNet50_Weights.IMAGENET1K_V2
model = resnet50(weights=weights, progress=False).cpu().eval()

# define input and output node
dummy_input = torch.randn(1, 3, 224, 224, device="cpu")
input_names, output_names = ["images"], ["output"]
torch.onnx.export(model,
                  dummy_input,
                  "resnet50.onnx",
                  verbose=True,
                  input_names=input_names,
                  output_names=output_names,
                  opset_version=13)

When exporting a PyTorch-format model to ONNX format, note that the operator set version (opset_version) should preferably be ≥ 11. In addition, OpenVINO™ 2022.2 supports ONNX 1.8.1, whose opset version is 13, so this article sets opset_version to 13.
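As an optional verification step (not part of the original sample; it assumes the onnx Python package is installed), the exported file and its opset can be checked as follows:

import onnx

onnx_model = onnx.load("resnet50.onnx")
onnx.checker.check_model(onnx_model)  # raises an exception if the model is malformed
print(onnx_model.opset_import)        # should report opset version 13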

The normalization parameters of the ResNet model trained on the ImageNet 1k dataset are:

  • mean_values = [123.675, 116.28, 103.53]
  • scale_values = [58.395, 57.12, 57.375]

The command to convert the ONNX model to the OpenVINO™ IR model is:

mo -m resnet50.onnx --mean_values=[123.675,116.28,103.53] --scale_values=[58.395,57.12,57.375]  --data_type FP16 --reverse_input_channels

After obtaining the IR model of ResNet50, you can use the following program to complete the inference calculation:

from openvino.runtime import Core
import cv2
import numpy as np

core = Core()
resnet50 = core.compile_model("resnet50.xml", "CPU")
output_node = resnet50.outputs[0]
# Resize to the model's expected spatial size
img = cv2.resize(cv2.imread("cat.jpg"), (224, 224))
# Layout: HWC -> NCHW
blob = np.expand_dims(np.transpose(img, (2, 0, 1)), 0)
result = resnet50(blob)[output_node]
print(np.argmax(result))

In the above inference code, operations such as image resizing and data layout conversion are still implemented in the inference code itself. Next, this article introduces the OpenVINO™ preprocessing API, which embeds more of these preprocessing operations into the model.

1.2 Data preprocessing with the OpenVINO™ preprocessing API

Starting from OpenVINO™ 2022.1, OpenVINO™ provides a set of preprocessing APIs to embed data preprocessing into the model (see "Use OpenVINO™ preprocessing API to further improve YOLOv5 inference performance"). The benefits of embedding data preprocessing into the model are:

  • Improved portability of AI models (the inference code does not need its own preprocessing routine)
  • Improved utilization of inference devices (e.g., Intel® integrated graphics/discrete graphics)
  • Improved end-to-end performance of AI programs

The complete sample program export_resnet_ov_ppp.py, which uses the OpenVINO™ preprocessing API to embed the preprocessing into the model, is as follows:

from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm
from openvino.runtime import Core, Layout, Type, serialize

# ======== Step 0: read original model ==========
core = Core()
model = core.read_model("resnet50.onnx")

# ======== Step 1: Preprocessing ================
ppp = PrePostProcessor(model)
# Declare section of desired application's input format
ppp.input("images").tensor() \
    .set_element_type(Type.u8) \
    .set_spatial_dynamic_shape() \
    .set_layout(Layout('NHWC')) \
    .set_color_format(ColorFormat.BGR)
# Specify actual model layout
ppp.input("images").model().set_layout(Layout('NCHW'))
# Explicit preprocessing steps. Layout conversion will be done automatically as the last step
ppp.input("images").preprocess() \
    .convert_element_type() \
    .convert_color(ColorFormat.RGB) \
    .resize(ResizeAlgorithm.RESIZE_LINEAR) \
    .mean([123.675, 116.28, 103.53]) \
    .scale([58.395, 57.12, 57.375])
# Dump preprocessor
print(f'Dump preprocessor: {ppp}')
model = ppp.build()

# ======== Step 2: Save the model with preprocessor ================
serialize(model, 'resnet50_ppp.xml', 'resnet50_ppp.bin')

Running export_resnet_ov_ppp.py prints the preprocessing steps that were embedded into the model (the dump of the PrePostProcessor object).

As can be seen from the above code, with the OpenVINO™ preprocessing API, image resizing, color channel conversion, data normalization, and data layout conversion can all be integrated into the model, and the result can be serialized to an IR model directly, without running the model optimizer on the ONNX model.
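A quick way to confirm the embedding (our own check, not from the original sample) is to read resnet50_ppp.xml back and inspect its input, which should now be u8 with a dynamic NHWC shape:

from openvino.runtime import Core

core = Core()
model = core.read_model("resnet50_ppp.xml")
input_node = model.inputs[0]
print(input_node.get_element_type())   # u8 - raw image bytes are accepted
print(input_node.get_partial_shape())  # e.g. [1,?,?,3] - dynamic spatial size, NHWC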

The complete inference program based on resnet50_ppp.xml is as follows:

from openvino.runtime import Core
import cv2
import numpy as np

core = Core()
resnet50_ppp = core.compile_model("resnet50_ppp.xml", "CPU")
output_node = resnet50_ppp.outputs[0]
# Preprocessing is embedded in the model: pass the raw BGR uint8 image, adding only the batch dimension
blob = np.expand_dims(cv2.imread("cat.jpg"), 0)
result = resnet50_ppp(blob)[output_node]
print(np.argmax(result))

As shown above, with the preprocessing embedded in the IR model, the OpenVINO™ inference program becomes simpler, clearer, and easier to read: five core lines of Python code implement ResNet inference with embedded preprocessing!

1.3 Using model caching to further shorten the first-inference latency

         exist" Implementing the OpenVINO asynchronous reasoning program of the YOLOv5 model on Viper Canyon "discusses the end-to-end performance of AI applications. For first inference latency, the loading and compilation time of the model can greatly increase the end-to-end runtime of first inference.

Using model caching greatly shortens the first-inference latency.
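One simple way to observe the effect is to time compile_model() across two runs of the program; the timing code below is our addition for illustration:

import time
from openvino.runtime import Core

core = Core()
core.set_property({'CACHE_DIR': './cache/ppp'})  # enable model caching

start = time.perf_counter()
compiled = core.compile_model("resnet50_ppp.xml", "CPU")
print(f"compile_model took {time.perf_counter() - start:.3f} s")
# 1st run: full compilation; 2nd run: loads the compiled blob from ./cache/ppp, much faster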

To use model caching, you only need to add one line of code: core.set_property({'CACHE_DIR': './cache/ppp'}). The complete sample code is as follows:

from openvino.runtime import Core
import cv2
import numpy as np

core = Core()
core.set_property({'CACHE_DIR': './cache/ppp'})  # use model caching
resnet50_ppp = core.compile_model("resnet50_ppp.xml", "CPU")
output_node = resnet50_ppp.outputs[0]
blob = np.expand_dims(cv2.imread("cat.jpg"), 0)
result = resnet50_ppp(blob)[output_node]
print(np.argmax(result))

When the inference program runs a second time, the OpenVINO™ runtime loads the compiled model directly from the cache folder, greatly reducing the first-inference latency.

1.4 Summary

This article introduced in detail the techniques for embedding data preprocessing into AI models via the model optimizer and the OpenVINO™ preprocessing API. Embedding data preprocessing into the model simplifies the inference program, improves the utilization of inference devices, and improves the end-to-end performance of AI programs. Finally, this article introduced model caching, which further optimizes the end-to-end first-inference latency of AI programs.
