quark.torch.export.nn.modules.realquantizer
Module Contents
Classes
- class quark.torch.export.nn.modules.realquantizer.RealQuantizerBase
Helper class that provides a standard way to create an ABC using inheritance.
- class quark.torch.export.nn.modules.realquantizer.RealQuantizer(qspec: quark.torch.quantization.config.config.QuantizationSpec, quantizer: quark.torch.quantization.tensor_quantize.FakeQuantizeBase | None, reorder: bool, real_quantized: bool, float_dtype: torch.dtype, device: torch.device | None = torch.device('cuda'), scale_shape: Tuple[int, Ellipsis] | None = None, zero_point_shape: Tuple[int, Ellipsis] | None = None)
On export, transposes the scale and packs the zero point. When called by the parent class, performs real quantization of the weight and bias. On import, dequantizes the weight and bias and applies fake quantization to the input and output via the forward method.
- to_real_quantize_params(param: torch.Tensor) → torch.Tensor
Quantize the weight and bias to low-bit precision data types, and pack them if required.
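The idea of quantizing to a low-bit type and then packing can be illustrated with a minimal, library-independent sketch. This is not Quark's implementation: the helper names (`quantize_int4`, `pack_int4`, `unpack_int4`, `dequantize`), the unsigned 4-bit range, and the low-nibble-first packing layout are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
from typing import List


def quantize_int4(values: List[float], scale: float, zero_point: int) -> List[int]:
    """Map floats to unsigned 4-bit levels in [0, 15] (hypothetical helper)."""
    levels = []
    for v in values:
        level = round(v / scale) + zero_point
        levels.append(max(0, min(15, level)))  # clamp to the 4-bit range
    return levels


def pack_int4(levels: List[int]) -> bytes:
    """Pack two 4-bit levels per byte, low nibble first (assumed layout)."""
    if len(levels) % 2:
        levels = levels + [0]  # pad so every byte holds two values
    return bytes(levels[i] | (levels[i + 1] << 4) for i in range(0, len(levels), 2))


def unpack_int4(packed: bytes, count: int) -> List[int]:
    """Reverse pack_int4 and drop any padding value."""
    levels = []
    for byte in packed:
        levels.append(byte & 0x0F)
        levels.append(byte >> 4)
    return levels[:count]


def dequantize(levels: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from integer levels."""
    return [(level - zero_point) * scale for level in levels]


weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, zero_point = 0.25, 8

q = quantize_int4(weights, scale, zero_point)
packed = pack_int4(q)
restored = dequantize(unpack_int4(packed, len(q)), scale, zero_point)
```

In this sketch five weights occupy three bytes after packing, and because each weight is an exact multiple of the scale, the round trip through `unpack_int4` and `dequantize` returns the original values. With arbitrary weights the restored values would differ by the rounding error.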