ONNX model quantization#
Quark Quantization API for ONNX.
- class quark.onnx.quantization.api.ModelQuantizer(config: Config)[source]#
Provides an API for quantizing deep learning models using ONNX. This class handles the configuration and processing of the model for quantization based on user-defined parameters.
- Args:
config (Config): Configuration object containing settings for quantization.
- Note:
It is essential to ensure that the ‘config’ provided has all necessary quantization parameters defined. This class assumes that the model is compatible with the quantization settings specified in ‘config’.
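A minimal construction sketch. The import path of `Config`/`get_default_config` and the `"XINT8"` configuration name follow Quark's published examples and are assumptions that may differ between releases:

```python
# Hedged sketch: import paths and the "XINT8" built-in configuration
# name follow Quark's published examples and may vary between releases.
try:
    from quark.onnx import ModelQuantizer
    from quark.onnx.quantization.config import Config, get_default_config

    # Build a Config from a named default quantization configuration,
    # then hand it to the quantizer.
    quant_config = get_default_config("XINT8")
    config = Config(global_quant_config=quant_config)
    quantizer = ModelQuantizer(config)
except ImportError:
    quantizer = None  # Quark is not installed in this environment
```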
- quantize_model(model_input: str | Path | ModelProto, model_output: str | Path | None = None, calibration_data_reader: CalibrationDataReader | None = None, calibration_data_path: str | None = None) ModelProto | None [source]#
Quantizes the given ONNX model and saves the output to the specified path or returns a ModelProto.
- Parameters:
model_input (Union[str, Path, onnx.ModelProto]) – Path to the input ONNX model file, or a ModelProto.
model_output (Optional[Union[str, Path]]) – Path where the quantized ONNX model will be saved. Defaults to None, in which case the model is not saved and the function returns a ModelProto instead.
calibration_data_reader (Optional[CalibrationDataReader]) – Data reader for model calibration. Defaults to None.
calibration_data_path (Optional[str]) – Path to the calibration data. Defaults to None.
- Returns:
The quantized model as an onnx.ModelProto if model_output is None; otherwise None, with the quantized model written to model_output.
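Because quantize_model accepts a CalibrationDataReader, a minimal duck-typed reader can be sketched as below. The class name `SimpleDataReader` and the input name `"input"` are illustrative assumptions; the protocol itself (get_next() returning a feed dict per batch, then None) comes from ONNX Runtime:

```python
class SimpleDataReader:
    """Illustrative calibration reader following the ONNX Runtime
    CalibrationDataReader protocol: get_next() yields one feed dict
    per calibration batch, then None when the data is exhausted."""

    def __init__(self, input_name, batches):
        self._input_name = input_name
        self._iter = iter(batches)

    def get_next(self):
        batch = next(self._iter, None)
        if batch is None:
            return None
        return {self._input_name: batch}

# Two toy calibration batches; real use would supply preprocessed tensors
# matching the model's input shape and dtype.
reader = SimpleDataReader("input", [[1.0, 2.0], [3.0, 4.0]])

# Typical call (paths are placeholders):
#   quantizer.quantize_model("model.onnx", "model_quant.onnx",
#                            calibration_data_reader=reader)
```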