Exporting Quantized Models
Quark for PyTorch supports its own export format, Quark format (Json-Pth), and can also export to popular formats requested by downstream tools: ONNX, the format used by Hugging Face and vLLM (HF format), and GGUF.
For diffusion models, quark.torch.export_safetensors detects a
HuggingFace Diffusers ModelMixin and writes a checkpoint that reloads
directly through DiffusionPipeline.from_pretrained /
ModelMixin.from_pretrained. See
Using Quark-Quantized Diffusion Models with HuggingFace Diffusers.
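
The export and reload round trip described above might look roughly like the sketch below. It is an illustration under assumptions, not the documented API: only the name quark.torch.export_safetensors and the from_pretrained reload path come from the text, while the positional (model, output_dir) call and the helper names export_quantized_diffusion_model and reload_quantized_component are hypothetical. Check the export_safetensors reference for the real parameters before relying on it.

```python
from pathlib import Path


def export_quantized_diffusion_model(model, output_dir):
    """Write an already-quantized Diffusers model to disk with Quark (sketch).

    `model` is expected to be a quantized diffusers ModelMixin instance.
    """
    # Imported lazily so this file can be loaded without Quark installed.
    from quark.torch import export_safetensors

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Assumption: the model and the output directory are the first two
    # arguments; any further options are left at their defaults.
    export_safetensors(model, str(out))
    return out


def reload_quantized_component(model_cls, checkpoint_dir):
    """Reload an exported checkpoint through the class's from_pretrained.

    `model_cls` is any class exposing from_pretrained, for example a
    diffusers ModelMixin subclass or DiffusionPipeline.
    """
    return model_cls.from_pretrained(str(checkpoint_dir))
```

The export helper is meant to be called after quantization has finished; the reload helper is then given the same ModelMixin subclass (or DiffusionPipeline) and the directory that the export produced.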