ONNX model calibration

quark.onnx.calibration.interface.run_calibration(model_input: str | Path | ModelProto, data_reader: CalibrationDataReader, op_types_to_calibrate: Sequence[str] | None = None, activation_type: QuantType = QuantType.QInt8, calibrate_method: CalibrationMethod | LayerWiseMethod | PowerOfTwoMethod = CalibrationMethod.MinMax, use_external_data_format: bool = False, execution_providers: list[str] | None = ['CPUExecutionProvider'], quantized_tensor_type: dict[Any, Any] = {}, extra_options: dict[str, Any] = {}) → TensorsData

Interface function that runs calibration on an ONNX model and returns the data range of each tensor to be quantized.

Parameters:
  • model_input (Union[str, Path, onnx.ModelProto]) – ONNX model to calibrate.

  • data_reader (CalibrationDataReader) – Data reader for model calibration.

  • op_types_to_calibrate (Optional[Sequence[str]]) – List of operator types to calibrate. Defaults to None, which indicates that all float32/float16 tensors are calibrated.

  • activation_type (QuantType) – Quantization type for activations. Defaults to QuantType.QInt8.

  • calibrate_method (Union[CalibrationMethod, LayerWiseMethod, PowerOfTwoMethod]) – Calibration method to use (MinMax, Entropy, Percentile, Distribution, NonOverflow or MinMSE). Defaults to CalibrationMethod.MinMax.

  • use_external_data_format (bool) – Whether to use the external data format for large models. Defaults to False.

  • execution_providers (Union[List[str], None]) – List of execution providers for ONNX Runtime. Defaults to ['CPUExecutionProvider'].

  • extra_options (Dict[str, Any]) – Extra options for quantization, including additional options for calibrator configuration. Defaults to {}.

Returns:

Data range for each tensor to be quantized.
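The call described above can be sketched as follows. This is a minimal illustration, not an official example: ListDataReader and calibrate are hypothetical names, and the reader is assumed to follow the onnxruntime CalibrationDataReader protocol (a get_next() method that returns a dict of model inputs, or None when the data is exhausted). In real code, subclassing onnxruntime.quantization.CalibrationDataReader is the safer choice. Only parameters that appear in the signature above are passed.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


class ListDataReader:
    """Hypothetical reader that serves pre-built input batches one at a time."""

    def __init__(self, batches: Iterable[Dict[str, Any]]) -> None:
        self._batches: Iterator[Dict[str, Any]] = iter(batches)

    def get_next(self) -> Optional[Dict[str, Any]]:
        # Return the next {input_name: array} dict, or None once exhausted.
        return next(self._batches, None)


def calibrate(model_path: str, batches: Iterable[Dict[str, Any]]) -> Any:
    """Hypothetical wrapper: run calibration with the documented defaults."""
    # Imported lazily so this module can be loaded without Quark installed.
    from quark.onnx.calibration.interface import run_calibration

    return run_calibration(
        model_input=model_path,
        data_reader=ListDataReader(batches),
        execution_providers=["CPUExecutionProvider"],
    )
```

A caller would typically pass batches as dicts that map each model input name to a NumPy array of the expected shape and dtype, for example calibrate("model.onnx", batches), where the path and input names depend on the model at hand.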

quark.onnx.calibration.interface.fake_calibration(model_input: str | Path | ModelProto) → TensorsData

A calibration function that produces a fake tensor range of [0, 1]. It is intended for scenarios that do not need actual calibration, such as block FP quantization, where skipping real calibration speeds up the entire process.

Parameters:
  • model_input (Union[str, Path, onnx.ModelProto]) – ONNX model to calibrate.

Returns:

Data range for each tensor to be quantized.
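One way to choose between the two functions is sketched below. The helper name tensor_ranges is hypothetical and not part of Quark; it assumes that a caller with no calibration data (for example, a block FP workflow) can use the fake ranges, while everyone else runs real calibration with the default settings.

```python
from typing import Any, Optional


def tensor_ranges(model_path: str, data_reader: Optional[Any] = None) -> Any:
    """Hypothetical helper: fake ranges without a reader, real calibration with one."""
    # Imported lazily so this module can be loaded without Quark installed.
    from quark.onnx.calibration.interface import fake_calibration, run_calibration

    if data_reader is None:
        # No calibration data supplied: use the fake [0, 1] ranges.
        return fake_calibration(model_path)
    # Calibration data supplied: run real calibration with default options.
    return run_calibration(model_input=model_path, data_reader=data_reader)
```

Whether fake ranges are acceptable depends on the quantization scheme, so this fallback should be an explicit decision rather than a silent default.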