quark.onnx.quantizers.matmul_nbits_quantizer

`quark.onnx.quantizers.matmul_nbits_quantizer`

Module Contents

Classes

class quark.onnx.quantizers.matmul_nbits_quantizer.MatMulNBitsQuantizer(model: onnx.onnx_pb.ModelProto | str, block_size: int = 128, is_symmetric: bool = False, bits: int = 4, accuracy_level: int | None = None, nodes_to_exclude: List[str] | None = None, algo_config: WeightOnlyQuantConfig | None = None, extra_options: Dict[str, Any] = {})

Perform 4-bit quantization of constant MatMul weights.
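To give a feel for what the `block_size`, `bits`, and `is_symmetric` parameters could control, here is a minimal NumPy sketch of block-wise weight quantization. It is not Quark's implementation: the helper names `quantize_blockwise` and `dequantize_blockwise` are hypothetical, and the rounding and zero-point scheme is an assumption chosen for illustration.

```python
import numpy as np


def quantize_blockwise(weights, block_size=128, bits=4, is_symmetric=False):
    """Illustrative block-wise quantization of a 1-D weight vector.

    Hypothetical helper, NOT Quark's code. Returns (q, scale, zero_point),
    each with one row per block.
    """
    w = np.asarray(weights, dtype=np.float32).ravel()
    pad = (-w.size) % block_size              # zero-pad up to a whole number of blocks
    blocks = np.pad(w, (0, pad)).reshape(-1, block_size)
    qmax = 2 ** bits - 1                      # 15 for 4-bit codes

    if is_symmetric:
        # Assumed scheme: zero point fixed at the middle of the code range.
        max_abs = np.abs(blocks).max(axis=1, keepdims=True)
        scale = max_abs / (2 ** (bits - 1) - 1)
        scale = np.where(scale == 0, 1.0, scale)   # avoid dividing by zero
        zero_point = np.full_like(scale, 2 ** (bits - 1))
    else:
        # Assumed scheme: per-block range that always contains 0.
        lo = np.minimum(blocks.min(axis=1, keepdims=True), 0.0)
        hi = np.maximum(blocks.max(axis=1, keepdims=True), 0.0)
        scale = (hi - lo) / qmax
        scale = np.where(scale == 0, 1.0, scale)   # avoid dividing by zero
        zero_point = np.round(-lo / scale)

    q = np.clip(np.round(blocks / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point


def dequantize_blockwise(q, scale, zero_point, size):
    """Map integer codes back to floats and drop the padding."""
    return ((q.astype(np.float32) - zero_point) * scale).ravel()[:size]
```

In this sketch a smaller `block_size` gives each scale fewer weights to cover, which usually lowers the reconstruction error at the cost of storing more scales and zero points. Comparing `dequantize_blockwise(...)` against the original weights is a quick way to see that trade-off.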