quark.torch.extensions.brevitas.api#

Module Contents#

Classes#

class quark.torch.extensions.brevitas.api.ModelQuantizer(config: quark.torch.extensions.brevitas.config.Config)#

Provides an API for quantizing deep learning models using Brevitas.

The way this class interacts with Brevitas is based on the brevitas ptq example found here: Xilinx/brevitas

Example usage:

weight_spec = QuantizationSpec() global_config = QuantizationConfig(weight=weight_spec) config = Config(global_quant_config=global_config) quantizer = ModelQuantizer(config) quant_model = quantizer.quantize_model(model, calib_dataloader)

quantize_model(model: torch.nn.Module, calib_loader: torch.utils.data.DataLoader | None = None) torch.nn.Module#

Quantizes the given model.

  • model: The model to be quantized.

  • calib_loader: A dataloader for calibration data, technically optional but required for most quantization processes.

class quark.torch.extensions.brevitas.api.ModelExporter(export_path: str)#

Provides an API for exporting pytorch models quantized with Brevitas. This class converts the quantized model to an onnx graph, and saves it to the specified export_path.

Example usage:

exporter = ModelExporter(“model.onnx”) exporter.export_onnx_model(quant_model, args=torch.ones(1, 1, 784))

export_onnx_model(model: torch.nn.Module, args: torch.Tensor | Tuple[torch.Tensor]) None#

Exports a model to onnx.

  • model: The pytorch model to export.

  • args: Representative tensor(s) in the same shape as the expected input(s) (can be zero, random, ones or even real data).