Installation Guide

Prerequisites

  1. Python 3.9, 3.10, or 3.11 is required. Python 3.12 is currently unsupported.

  2. Install PyTorch for your compute platform (such as CUDA, ROCm, or CPU). The PyTorch version must be 2.2.0 or later.

  3. Install ONNX 1.16.0 or later, ONNX Runtime 1.17.0 or later but earlier than 1.20.0, and onnxruntime-extensions 0.4.2 or later.
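
The version constraints above can be checked before installing Quark. The following is a minimal sketch, not part of Quark: it assumes the packages are installed under the PyPI distribution names torch, onnx, onnxruntime, and onnxruntime-extensions (a GPU build such as onnxruntime-gpu would need its name added), and the helper names parse_version, in_range, and check_environment are illustrative.

```python
import sys
from importlib import metadata

# (distribution name, minimum version inclusive, upper bound exclusive or None).
# The distribution names are assumptions; adjust them to match your environment.
REQUIREMENTS = [
    ("torch", (2, 2, 0), None),
    ("onnx", (1, 16, 0), None),
    ("onnxruntime", (1, 17, 0), (1, 20, 0)),
    ("onnxruntime-extensions", (0, 4, 2), None),
]
SUPPORTED_PYTHON = {(3, 9), (3, 10), (3, 11)}


def parse_version(text):
    """Turn a string such as '2.2.0+cu121' into (2, 2, 0); suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(version, minimum, below=None):
    """True if minimum <= version, and version < below when an upper bound is given."""
    return version >= minimum and (below is None or version < below)


def check_environment():
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] not in SUPPORTED_PYTHON:
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} is not supported"
        )
    for name, minimum, below in REQUIREMENTS:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not in_range(parse_version(installed), minimum, below):
            problems.append(f"{name} {installed} is outside the required range")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("PROBLEM:", problem)
```

Running the script prints one line per unmet prerequisite and prints nothing when all of them are satisfied.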

Note

On Windows, Visual Studio 2022 or later is required. To make the compiler tools available during compilation, use one of the following methods:

  1. Use the Developer Command Prompt for Visual Studio. When installing Visual Studio, make sure to include the Developer Command Prompt. Then run your commands in the CMD window of the Developer Command Prompt for Visual Studio.

  2. Manually add the tool paths to the environment variables. Compilation uses cl.exe, MSBuild.exe, and link.exe from Visual Studio, so the folders containing them must be included in the PATH environment variable. These programs are located in the Visual Studio installation directory. In the Edit Environment Variables window, click New, and then paste the path of each folder that contains cl.exe, link.exe, or MSBuild.exe. Click OK in all the windows to apply the changes.
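
To confirm that the three tools are reachable, whichever method you choose, a small check with Python's standard shutil.which can help. This is a sketch rather than part of Quark, and the helper name find_build_tools is illustrative.

```python
import shutil

# Tool names taken from the note above.
REQUIRED_TOOLS = ("cl.exe", "link.exe", "MSBuild.exe")


def find_build_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_build_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

Run it from the same command window you plan to use for Quark. Inspect the printed paths as well: a link.exe that resolves outside the Visual Studio installation directory may belong to another toolchain.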

Installation

Install from ZIP

Step 1: Download and unzip 📥quark-*.zip, which contains a wheel package. Alternatively, you can download the wheel package 📥quark-*.whl directly.

📥quark.zip release_version (recommended)

📥quark.whl release_version

Directory Structure of the zip file:

+ quark.zip
   + quark.whl
   + examples    # Example code for Quark
   + docs        # Offline documentation for Quark
   + README.md

We strongly recommend downloading the zip file, because it includes examples that are compatible with the wheel package version.

Step 2: Install the quark wheel package by running the following command:

pip install [quark wheel package].whl
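
If you prefer to script this step, the sketch below locates the wheel in the unzipped folder and installs it with the pip of the current interpreter. It assumes the wheel file name matches quark-*.whl, as in Step 1; the folder name "quark" and the helper names find_wheels and install_wheel are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_wheels(directory):
    """Return the quark-*.whl files directly inside `directory`, sorted by name."""
    return sorted(Path(directory).glob("quark-*.whl"))


def install_wheel(wheel_path):
    """Run `pip install <wheel>` using the interpreter that runs this script."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", str(wheel_path)],
        check=True,  # raise CalledProcessError if pip fails
    )


if __name__ == "__main__":
    # "quark" is a hypothetical name for the unzipped folder; change it as needed.
    wheels = find_wheels("quark")
    if len(wheels) != 1:
        sys.exit(f"Expected exactly one wheel, found {len(wheels)}")
    install_wheel(wheels[0])
```

Calling pip through sys.executable installs the package into the same Python environment that runs the script, which avoids installing into a different interpreter by accident.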

Installation Verification

  1. (Optional) Verify the installation by running python -c "import quark". If no error is reported, the installation is successful.

  2. (Optional) Compile the fast quantization kernels. The first time you use Quark’s quantization APIs, Quark compiles the fast quantization kernels using your installed PyTorch and, if available, CUDA. This compilation might take a few minutes, but subsequent quantization calls are much faster. To trigger the compilation now and check that it succeeds, run the following command:

    python -c "import quark.torch.kernel"
    
  3. (Optional) Compile the custom operators library. The first time you use Quark-ONNX’s custom operators, Quark compiles the custom operators library using your local environment. To trigger the compilation now and check that it succeeds, run the following command:

    python -c "import quark.onnx.operators.custom_ops"
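
The three checks above can also be run in one pass. The following sketch imports each module named in the steps and reports the result; the helper name try_import is illustrative, and the script catches any exception because a failed compilation might not surface as an ImportError.

```python
import importlib

# Module names taken from the verification steps above.
MODULES = ("quark", "quark.torch.kernel", "quark.onnx.operators.custom_ops")


def try_import(module_name):
    """Import a module by name and return (ok, message)."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # compilation errors are not always ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    for name in MODULES:
        ok, message = try_import(name)
        print(f"{'PASS' if ok else 'FAIL'} {name}: {message}")
```

As with the individual commands, the first run might take a few minutes while the kernels and custom operators compile.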
    

Older Versions