quark.torch.quantization.observer.tqt_observer
Module Contents#
Classes#
- class quark.torch.quantization.observer.tqt_observer.TQTObserver(qspec: quark.torch.quantization.config.config.QuantizationSpec, device: torch.device | None = None)#
Observer for uniform scaling quantizers, for example ‘int uniform quantizer’ or ‘fp8 uniform scaling’.
- get_fix_position() → int#
(1) TQT: qx = clip(round(fx / scale)) * scale, scale = 2^ceil(log2t) / 2^(b-1)
(2) NndctFixNeuron: qx = clip(round(fx * scale)) * (1 / scale), scale = 2^fp
Setting (1) equal to (2) means the scale in (1) must be the reciprocal of the scale in (2), which gives
(3): 2^(b-1) / 2^ceil(log2t) = 2^fp
=> fp = b - 1 - ceil(log2t)
For more details, see nndct/include/cuda/nndct_fix_kernels.cuh::_fix_neuron_v2_device
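The derivation above can be checked numerically with a short sketch. This is not the Quark or NNDCT implementation. The function names, the signed clip range [-2^(b-1), 2^(b-1) - 1], and the use of Python's built-in round() are assumptions made for illustration, and the real kernels may round differently. Here b is the bit width and log2t is the base-2 logarithm of the threshold t.

```python
import math


def fix_position(log2_t: float, bits: int) -> int:
    """Hypothetical helper: fp = b - 1 - ceil(log2t), as in equation (3)."""
    return bits - 1 - math.ceil(log2_t)


def _clip(q: int, bits: int) -> int:
    # Assumed signed integer range for a b-bit quantizer.
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return min(max(q, qmin), qmax)


def tqt_quantize(x: float, log2_t: float, bits: int) -> float:
    """Form (1): divide by scale = 2^ceil(log2t) / 2^(b-1)."""
    scale = 2.0 ** math.ceil(log2_t) / 2.0 ** (bits - 1)
    return _clip(round(x / scale), bits) * scale


def fix_neuron_quantize(x: float, fp: int, bits: int) -> float:
    """Form (2): multiply by scale = 2^fp."""
    scale = 2.0 ** fp
    return _clip(round(x * scale), bits) * (1.0 / scale)


if __name__ == "__main__":
    bits, log2_t = 8, 1.3
    fp = fix_position(log2_t, bits)
    for x in (-5.0, -0.7, 0.0, 0.7, 3.99, 10.0):
        a = tqt_quantize(x, log2_t, bits)
        b = fix_neuron_quantize(x, fp, bits)
        print(f"x={x:6.2f}  tqt={a:9.5f}  fix_neuron={b:9.5f}")
```

Because both scales are exact powers of two, dividing by the scale in (1) and multiplying by the scale in (2) produce identical floating-point values, so the two forms agree whenever fp is chosen by equation (3).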