Best Practice for Ryzen AI in Quark ONNX#
This topic outlines best practices for Post-Training Quantization (PTQ) in Quark ONNX. It provides guidance on fine-tuning your quantization strategy to meet the target quantization accuracy.

Figure 1. Best Practices for Quark ONNX Quantization#
Pip Requirements#
Install the necessary Python packages:
python -m pip install -r requirements.txt
Prepare Model#
Download the ONNX float model from the onnx/models repo directly:
wget -P models https://github.com/onnx/models/raw/new-models/vision/classification/resnet/model/resnet50-v1-12.onnx
Prepare Calibration Data#
You can provide a folder containing PNG or JPG files as the calibration data folder. For example, you can download images from microsoft/onnxruntime-inference-examples as a quick start. The preprocessing code is at line 63 of quantize_quark.py; adapt it to your data if needed.
mkdir calib_data
wget -O calib_data/daisy.jpg https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/test_images/daisy.jpg?raw=true
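As a reference for the preprocessing step, here is a minimal sketch of typical ImageNet-style preprocessing for resnet50-v1-12 (resize, center-crop, normalize, NCHW layout). The function name, crop size, and normalization constants are common conventions, not taken from quantize_quark.py; check them against your model's expected input.

```python
import numpy as np
from PIL import Image

# Common ImageNet normalization statistics (an assumption; verify for your model)
_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img: Image.Image, size: int = 224) -> np.ndarray:
    # Resize so the shorter side is 256, keeping the aspect ratio
    w, h = img.size
    scale = 256 / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)))
    # Center-crop to size x size
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    img = img.convert("RGB").crop((left, top, left + size, top + size))
    # HWC uint8 -> normalized NCHW float32 with a batch dimension
    arr = np.asarray(img, dtype=np.float32) / 255.0
    arr = (arr - _MEAN) / _STD
    return arr.transpose(2, 0, 1)[np.newaxis, ...]
```

Each calibration image would be loaded with `Image.open` and passed through this function before being fed to the calibration data reader.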
Auto Search for Ryzen AI Quantization#
- build search space
The search space is a set of parameters that defines the search items. It lists all possible combinations of the configuration. An example is shown below:
search_space_advanced: dict[str, Any] = {
    "calibrate_method": [CalibrationMethod.MinMax, CalibrationMethod.Percentile],
    "activation_type": [QuantType.QInt8, QuantType.QInt16],
    "weight_type": [QuantType.QInt8],
    "include_fast_ft": [False, True],
    "extra_options": {
        'ActivationSymmetric': [True, False],
        'WeightSymmetric': [True],
        'FastFinetune': {
            'DataSize': [200],
            'NumIterations': [1000],
            'OptimAlgorithm': ['adaround'],
            'LearningRate': [0.1],
            'OptimDevice': ['cuda:0'],
            'InferDevice': ['cuda:0'],
            'EarlyStop': [False],
        }
    }
}
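To illustrate how such a space expands, here is a simplified sketch (not the Quark API) of turning value lists into the cross product of concrete configs; nested dicts such as "extra_options" can be expanded the same way, recursively:

```python
from itertools import product

# Illustrative only: expand a search space of value lists into all
# concrete configurations. Nested dicts are expanded recursively.
def expand(space: dict) -> list[dict]:
    keys, value_lists = zip(*[
        (k, expand(v) if isinstance(v, dict) else v) for k, v in space.items()
    ])
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]

space = {"activation_type": ["QInt8", "QInt16"], "include_fast_ft": [False, True]}
configs = expand(space)  # 2 x 2 = 4 concrete configs
```

This shows why adding options multiplies the search cost: each new list multiplies the number of candidate configurations.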
If you need more than one search space, build each space according to your preference and concatenate them:
space1 = auto_search_ins.build_all_configs(auto_search_config.search_space_XINT8)
space2 = auto_search_ins.build_all_configs(auto_search_config.search_space)
auto_search_ins.all_configs = space1 + space2
- evaluator
The evaluator is a custom-defined function that takes the ONNX model as input and outputs a metric. Based on this metric and the metric drop tolerance, auto search decides whether to stop the search process. If set to None, auto search calls the built-in evaluator.
There are two ways to define the evaluator function:
- Define it in the auto_search_config as a static method:
class AutoSearchConfig_Default:
    # 1) define search space
    # 2) define search_metric, search_algo
    # 3) define search_metric_tolerance, search_cache_dir, etc.

    @staticmethod
    def customer_defined_evaluator(onnx_path, **args):
        # step 1) build onnx inference session
        # step 2) model post-processing if needed
        # step 3) build evaluation dataloader
        # step 4) calculate the metric
        # step 5) clean cache if needed
        # step 6) return the metric
        ...

    search_evaluator = customer_defined_evaluator
- Instantiate an auto_search_config and assign the evaluator function:
def customer_defined_evaluator(onnx_path, **args):
    # step 1) build onnx inference session
    # step 2) model post-processing if needed
    # step 3) build evaluation dataloader
    # step 4) calculate the metric
    # step 5) clean cache if needed
    # step 6) return the metric
    ...

auto_search_config = AutoSearchConfig_Default()
auto_search_config.search_evaluator = customer_defined_evaluator
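To make the six steps concrete, here is a minimal evaluator sketch computing top-1 accuracy. The `dataloader` and `run_fn` parameters are hypothetical additions for illustration; in practice you would build an `onnxruntime.InferenceSession` from `onnx_path` as shown in the comments.

```python
import numpy as np

# Hypothetical evaluator sketch: the dataloader yields (inputs, labels) pairs,
# and run_fn is injectable so the logic can be tested without a real model.
def customer_defined_evaluator(onnx_path, dataloader=None, run_fn=None, **args):
    # step 1) build the onnx inference session (when no run_fn is injected)
    if run_fn is None:
        import onnxruntime as ort
        session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
        input_name = session.get_inputs()[0].name
        run_fn = lambda x: session.run(None, {input_name: x})[0]
    # steps 2-4) run the evaluation data and accumulate the metric (top-1 accuracy)
    correct = total = 0
    for inputs, labels in dataloader:
        preds = np.argmax(run_fn(inputs), axis=-1)
        correct += int(np.sum(preds == labels))
        total += len(labels)
    # steps 5-6) clean cache if needed, then return the metric
    return correct / max(total, 1)
```

The returned value is what auto search compares against the float model's metric and the configured tolerance.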
- metric
If the evaluator is not None, the metric is defined inside the evaluator. If the evaluator is None, the supported metrics are "L2", "L1", "cos", "psnr", and "ssim". The default is "L2".
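As a rough sketch of what distance metrics like "L2" and "cos" measure between the float and quantized model outputs (the built-in evaluator's exact formulas may differ):

```python
import numpy as np

# Illustrative distance metrics between float-model and quantized-model outputs.
def l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Euclidean distance between the two output tensors
    return float(np.linalg.norm(a - b))

def cos_distance(a: np.ndarray, b: np.ndarray) -> float:
    # 1 - cosine similarity; 0 means identical direction, up to 2 for opposite
    a, b = a.ravel(), b.ravel()
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Lower values mean the quantized model's outputs stay closer to the float model's, which is why the target below is expressed as a maximum allowed distance.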
- target setting
The target is the acceptable drop of the metric. For example, you can set the search metric to "L2" and require that the L2 distance between the float model and the quantized model be within 0.1:
search_metric: str = "L2"
search_metric_tolerance: float = 0.1
- stop condition
When the target is met, the search process stops and saves the searched result.
- execution
Auto search execution command:
python quantize_quark.py --input_model_path models/resnet50-v1-12.onnx --calib_data_path calib_data --output_model_path models