PyTorch model export configuration

Quark Exporting Config API for PyTorch

class quark.torch.export.config.config.ExporterConfig(json_export_config: JsonExporterConfig, onnx_export_config: OnnxExporterConfig | None = None)

A class that encapsulates the export configuration for a machine learning model, grouping the settings for each supported export format.

Parameters:

json_export_config (JsonExporterConfig) – Global configuration for json-safetensors exporting.
onnx_export_config (Optional[OnnxExporterConfig]) – Global configuration for ONNX exporting. Defaults to None.

class quark.torch.export.config.config.JsonExporterConfig(weight_merge_groups: List[List[str]] | None = None, kv_cache_group: List[str] = [], min_kv_scale: float = 0.0, weight_format: str = 'real_quantized', pack_method: str = 'reorder')

A data class that specifies configurations for json-safetensors exporting.

Parameters:

weight_merge_groups (Optional[List[List[str]]]) – A list of operator groups; the operators in each group share the same weight scaling factor. Operator names must correspond to the original module names in the model. Wildcards can be used to match a range of operators. Defaults to None.
kv_cache_group (List[str]) – A group of operators that should be merged into kv_cache. Operator names must correspond to the original module names in the model. Wildcards can be used to match a range of operators. Defaults to [].
min_kv_scale (float) – Minimum kv scale. Defaults to 0.0.
weight_format (str) – String option indicating whether to export the real quantized weights. Defaults to "real_quantized".
pack_method (str) – String option indicating whether to reorder the quantized tensors. Defaults to "reorder".
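
The wildcard behaviour described for weight_merge_groups and kv_cache_group can be illustrated with a small, Quark-independent sketch. It assumes shell-style glob semantics (Python's fnmatch); the matcher Quark actually uses is not stated here, so treat that as an assumption. The module names and the select helper are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatch

# Hypothetical module names, in the style returned by model.named_modules().
module_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
]

# Values in the shapes the parameters expect.
kv_cache_group = ["*k_proj", "*v_proj"]  # List[str]
weight_merge_groups = [                  # List[List[str]]
    ["*q_proj", "*k_proj", "*v_proj"],
    ["*gate_proj", "*up_proj"],
]


def select(patterns, names):
    """Return names matched by at least one pattern (glob semantics ASSUMED)."""
    return [n for n in names if any(fnmatch(n, p) for p in patterns)]


# Modules that would fall into the kv_cache group under this assumption.
kv_modules = select(kv_cache_group, module_names)

# One list of matched modules per merge group.
merged = [select(group, module_names) for group in weight_merge_groups]
```

Under these assumptions, each inner list of weight_merge_groups resolves to the set of modules that would share one weight scaling factor.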

class quark.torch.export.config.config.OnnxExporterConfig

A data class that specifies configurations for ONNX exporting.
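
How the three classes compose can be sketched as follows. To keep the example runnable without Quark installed, it defines local stand-in dataclasses that mirror the signatures documented above; they are not the Quark classes. With Quark available, the stand-in definitions would presumably be replaced by an import from quark.torch.export.config.config.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Stand-ins mirroring the documented signatures; NOT the real Quark classes.
@dataclass
class JsonExporterConfig:
    weight_merge_groups: Optional[List[List[str]]] = None
    kv_cache_group: List[str] = field(default_factory=list)
    min_kv_scale: float = 0.0
    weight_format: str = "real_quantized"
    pack_method: str = "reorder"


@dataclass
class OnnxExporterConfig:
    pass


@dataclass
class ExporterConfig:
    json_export_config: JsonExporterConfig
    onnx_export_config: Optional[OnnxExporterConfig] = None


# json-safetensors export only: onnx_export_config keeps its default of None.
json_only = ExporterConfig(json_export_config=JsonExporterConfig())

# Both formats configured, with a hypothetical kv_cache group.
both = ExporterConfig(
    json_export_config=JsonExporterConfig(kv_cache_group=["*k_proj", "*v_proj"]),
    onnx_export_config=OnnxExporterConfig(),
)
```

The json_export_config argument is required by the documented signature, while onnx_export_config may be omitted.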
