ONNX model quantization#
Quark Quantization API for ONNX.
- class quark.onnx.quantization.api.ModelQuantizer(config: Config)[source]#
Provides an API for quantizing deep learning models using ONNX. This class handles the configuration and processing of the model for quantization based on user-defined parameters.
- Args:
config (Config): Configuration object containing settings for quantization.
- Note:
It is essential to ensure that the ‘config’ provided has all necessary quantization parameters defined. This class assumes that the model is compatible with the quantization settings specified in ‘config’.
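A minimal construction sketch. The import path of `Config`/`get_default_config` and the `"XINT8"` configuration name follow Quark's published examples and are assumptions that may differ between releases:

```python
# Hedged sketch: import paths and the "XINT8" built-in configuration
# name follow Quark's published examples and may vary between releases.
try:
    from quark.onnx import ModelQuantizer
    from quark.onnx.quantization.config import Config, get_default_config

    # Build a Config from a named default quantization configuration,
    # then hand it to the quantizer.
    quant_config = get_default_config("XINT8")
    config = Config(global_quant_config=quant_config)
    quantizer = ModelQuantizer(config)
except ImportError:
    quantizer = None  # Quark is not installed in this environment
```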
- quantize_model(model_input: str | Path | ModelProto, model_output: str | Path | None = None, calibration_data_reader: CalibrationDataReader | None = None, calibration_data_path: str | None = None) ModelProto | None [source]#
Quantizes the given ONNX model and saves the output to the specified path or returns a ModelProto.
- Parameters:
model_input (Union[str, Path, onnx.ModelProto]) – Path to the input ONNX model file, or a ModelProto.
model_output (Optional[Union[str, Path]]) – Path where the quantized ONNX model will be saved. Defaults to None, in which case the model is not saved and the function returns a ModelProto instead.
calibration_data_reader (Optional[CalibrationDataReader]) – Data reader for model calibration. Defaults to None.
calibration_data_path (Optional[str]) – Path to the calibration data. Defaults to None.
- Returns:
The quantized model as an onnx.ModelProto if model_output is None; otherwise None, with the quantized model written to model_output.
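Because quantize_model accepts a CalibrationDataReader, a minimal duck-typed reader can be sketched as below. The class name `SimpleDataReader` and the input name `"input"` are illustrative assumptions; the protocol itself (get_next() returning a feed dict per batch, then None) comes from ONNX Runtime:

```python
class SimpleDataReader:
    """Illustrative calibration reader following the ONNX Runtime
    CalibrationDataReader protocol: get_next() yields one feed dict
    per calibration batch, then None when the data is exhausted."""

    def __init__(self, input_name, batches):
        self._input_name = input_name
        self._iter = iter(batches)

    def get_next(self):
        batch = next(self._iter, None)
        if batch is None:
            return None
        return {self._input_name: batch}

# Two toy calibration batches; real use would supply preprocessed tensors
# matching the model's input shape and dtype.
reader = SimpleDataReader("input", [[1.0, 2.0], [3.0, 4.0]])

# Typical call (paths are placeholders):
#   quantizer.quantize_model("model.onnx", "model_quant.onnx",
#                            calibration_data_reader=reader)
```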