quark.onnx.quantizers.matmul_nbits_quantizer
Module Contents
Classes
- class quark.onnx.quantizers.matmul_nbits_quantizer.MatMulNBitsQuantizer(model: onnx.onnx_pb.ModelProto | str, block_size: int = 128, is_symmetric: bool = False, bits: int = 4, accuracy_level: int | None = None, nodes_to_exclude: List[str] | None = None, algo_config: WeightOnlyQuantConfig | None = None, extra_options: Dict[str, Any] = {})
Performs 4-bit quantization of constant MatMul weights (the `bits` parameter defaults to 4).
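
To make the `block_size`, `bits`, and `is_symmetric` parameters more concrete, the following is a minimal pure-Python sketch of block-wise weight quantization. It is an illustration of the general technique only, not Quark's implementation: the helper names `quantize_block`, `dequantize_block`, and `quantize_blockwise` are hypothetical, and the sketch assumes that each block of `block_size` weights gets its own scale and zero point, as the parameter names suggest.

```python
from typing import List, Tuple

# Hypothetical helpers for illustration; these are NOT part of the Quark API.
Block = Tuple[List[int], float, int]  # (quantized values, scale, zero point)


def quantize_block(values: List[float], bits: int = 4,
                   is_symmetric: bool = False) -> Block:
    """Quantize one block of weights to unsigned integers of width `bits`."""
    qmax = (1 << bits) - 1  # 15 for 4-bit
    if is_symmetric:
        # Symmetric: zero point fixed at mid-range, scale from the max magnitude.
        zero_point = 1 << (bits - 1)
        max_abs = max(abs(v) for v in values)
        scale = max_abs / zero_point if max_abs > 0 else 1.0
    else:
        # Asymmetric: stretch the range [lo, hi] (always containing 0) over 0..qmax.
        lo = min(min(values), 0.0)
        hi = max(max(values), 0.0)
        scale = (hi - lo) / qmax if hi > lo else 1.0
        zero_point = round(-lo / scale)
    # Round to the nearest level and clamp into the representable range.
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_block(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map quantized integers back to approximate float weights."""
    return [(x - zero_point) * scale for x in q]


def quantize_blockwise(weights: List[float], block_size: int = 128,
                       bits: int = 4, is_symmetric: bool = False) -> List[Block]:
    """Split a flat weight list into blocks and quantize each independently."""
    return [
        quantize_block(weights[i:i + block_size], bits, is_symmetric)
        for i in range(0, len(weights), block_size)
    ]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 0.5, 2.0]
    for q, scale, zp in quantize_blockwise(weights, block_size=3):
        print(q, scale, zp, dequantize_block(q, scale, zp))
```

A smaller `block_size` gives each scale fewer weights to cover, which typically lowers the reconstruction error at the cost of storing more scales and zero points.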