
quark.onnx.finetuning.create_torch.quant_base_ops

Module Contents

Classes

class quark.onnx.finetuning.create_torch.quant_base_ops.QuantizationModule(quant_info: Tuple[numpy.typing.NDArray[numpy.float32], numpy.typing.NDArray[Any], numpy.typing.NDArray[Any], numpy.typing.NDArray[Any], int, bool, onnx.onnx_pb.TensorProto] | Dict[str, Any] | None)

A PyTorch module that mimics the behavior of ONNX quantization nodes.
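The core behavior such a module reproduces is the quantize-dequantize round trip of an ONNX QuantizeLinear/DequantizeLinear pair. The sketch below is a standalone NumPy illustration of that round trip, not the library's implementation; the `scale` and `zero_point` values are illustrative assumptions.

```python
import numpy as np

def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Quantize-dequantize round trip: x -> int grid -> float.

    Mirrors the effect of an ONNX QuantizeLinear followed by
    DequantizeLinear with a signed 8-bit range (illustrative only).
    """
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

x = np.array([0.1, -0.5, 1.0], dtype=np.float32)
# Values on the quantization grid survive the round trip unchanged.
print(fake_quantize(x, scale=0.01))
# Out-of-range values saturate at qmax * scale.
print(fake_quantize(np.array([5.0]), scale=0.01))  # clips to 1.27
```

A finetuning setup runs this fake-quantization in the forward pass so the straight-through gradients can adjust the float parameters around the quantization grid.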

class quark.onnx.finetuning.create_torch.quant_base_ops.QuantizeWrapper(w_alpha: float = 1.0, b_beta: float = 1.0, **kwargs: Dict[str, Any])

A wrapper for quantizing a torch layer's input, weight, and bias.
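The wrapping idea can be sketched as follows: fake-quantize the input, weight, and bias before applying the layer's computation. This NumPy sketch is illustrative only; the class name, the fixed `scale`, and the interpretation of `w_alpha`/`b_beta` as extra scale factors on the weight and bias are assumptions, not the library's actual API.

```python
import numpy as np

def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    # Quantize-dequantize round trip onto a signed 8-bit grid.
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

class LinearQuantWrapper:
    """Hypothetical wrapper around a linear layer (illustrative sketch).

    Fake-quantizes input, weight, and bias before the matmul, so the
    forward pass sees the same rounding error a quantized model would.
    """
    def __init__(self, weight, bias, w_alpha=1.0, b_beta=1.0):
        self.weight = weight
        self.bias = bias
        self.w_alpha = w_alpha  # assumed: extra scaling on the weight
        self.b_beta = b_beta    # assumed: extra scaling on the bias

    def forward(self, x):
        # A fixed scale of 0.05 is used purely for illustration.
        xq = fake_quantize(x, scale=0.05)
        wq = fake_quantize(self.weight * self.w_alpha, scale=0.05)
        bq = fake_quantize(self.bias * self.b_beta, scale=0.05)
        return xq @ wq.T + bq

layer = LinearQuantWrapper(
    weight=np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
    bias=np.array([0.05, -0.05]),
)
out = layer.forward(np.array([[1.0, 0.5, 0.25]]))
print(out.shape)  # (1, 2)
```

In a real finetuning loop the wrapped layer's float weight and bias remain trainable, while every forward pass routes them through the fake-quantization step.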