Pruning configuration
Quark Pruning Config API for PyTorch
- class quark.torch.pruning.config.Config(algo_config: AlgoConfig | None = None, blockwise_tuning_config: AlgoConfig | None = None, log_severity_level: int | None = 1)
A class that encapsulates comprehensive pruning configurations for a machine learning model, allowing for detailed and hierarchical control over pruning parameters across different model components.
- Parameters:
algo_config (Optional[AlgoConfig]) – Optional configuration for the pruning algorithm, such as OSSCAR. Applying the algorithm reduces the number of model parameters. Default is None.
blockwise_tuning_config (Optional[AlgoConfig]) – Optional configuration for blockwise tuning, such as BlockwiseTuningConfig. Default is None.
log_severity_level (Optional[int]) – Logging severity: 0: DEBUG, 1: INFO, 2: WARNING, 3: ERROR, 4: CRITICAL/FATAL. Default is 1.
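The severity codes above do not match the numeric values used by Python's standard logging module (where INFO is 20, for example). The following is a minimal sketch of a translation table. The helper name to_logging_level is hypothetical and is not part of Quark, and how Quark itself consumes the value is not described here.

```python
import logging

# Mapping copied from the log_severity_level description above.
SEVERITY_TO_LOGGING = {
    0: logging.DEBUG,
    1: logging.INFO,
    2: logging.WARNING,
    3: logging.ERROR,
    4: logging.CRITICAL,
}


def to_logging_level(log_severity_level: int = 1) -> int:
    """Hypothetical helper: translate a documented severity code to a logging level."""
    try:
        return SEVERITY_TO_LOGGING[log_severity_level]
    except KeyError:
        raise ValueError(
            f"log_severity_level must be in 0..4, got {log_severity_level!r}"
        ) from None


level = to_logging_level(2)
print(logging.getLevelName(level))  # WARNING
```

Rejecting unknown codes with a ValueError is a choice made for this sketch, so that a typo surfaces early instead of silently falling back to a default.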
- class quark.torch.pruning.config.OSSCARConfig(name: str = 'osscar', damp_percent: float = 0.01, true_sequential: bool = True, inside_layer_modules: list[str] = [], mlp_pruning_modules: list[str] = [], mlp_scaling_layers: dict[str, str | None] = {}, mlp_pruning_ratio: float = 0.1, mlp_intermediate_size_name: str = '', model_decoder_layers: str = '')
- class quark.torch.pruning.config.LayerImportancePruneConfig(name: str = 'layer_importance_depth_pruning', delete_layer_num: int = 1, delete_layers_index: list[int] = [], save_gpu_memory: bool = False, layer_num_field: str = '', model_decoder_layers: str = '', layer_norm_field: str = '')
Configuration for the layer-importance depth-wise pruning algorithm (for LLMs).
- Parameters:
delete_layer_num (int) – Number of layers to delete (at least 1).
delete_layers_index (List[int]) – Specific indexes of layers to delete.
save_gpu_memory (bool) – Whether to reduce GPU memory usage at the cost of speed.
layer_num_field (str) – Field name for number of layers.
model_decoder_layers (str) – Field name for decoder layers.
layer_norm_field (str) – Field name for normalization layer.
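As an illustration of the parameters above, here is a minimal sketch built on a stand-in dataclass that mirrors the documented signature, so it runs without Quark installed. The class name LayerImportancePruneConfigSketch, the uniqueness and sign checks on the indexes, and the field values ("num_hidden_layers", "model.layers", "model.norm", which follow Llama-style Hugging Face naming) are assumptions made for this example and are not taken from the Quark source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LayerImportancePruneConfigSketch:
    """Stand-in mirroring the documented signature; not the Quark class."""

    name: str = "layer_importance_depth_pruning"
    delete_layer_num: int = 1
    # Mutable defaults need default_factory in a dataclass.
    delete_layers_index: List[int] = field(default_factory=list)
    save_gpu_memory: bool = False
    layer_num_field: str = ""
    model_decoder_layers: str = ""
    layer_norm_field: str = ""

    def __post_init__(self) -> None:
        # Documented constraint: at least one layer is deleted.
        if self.delete_layer_num < 1:
            raise ValueError("delete_layer_num must be at least 1")
        # Assumed sanity checks (not documented): unique, non-negative indexes.
        if len(set(self.delete_layers_index)) != len(self.delete_layers_index):
            raise ValueError("delete_layers_index contains duplicates")
        if any(i < 0 for i in self.delete_layers_index):
            raise ValueError("delete_layers_index must be non-negative")


# Hypothetical values for a Llama-style model.
cfg = LayerImportancePruneConfigSketch(
    delete_layer_num=2,
    delete_layers_index=[10, 11],
    save_gpu_memory=True,
    layer_num_field="num_hidden_layers",
    model_decoder_layers="model.layers",
    layer_norm_field="model.norm",
)
print(cfg.delete_layers_index)
```

With Quark installed, the same keyword arguments would be passed to quark.torch.pruning.config.LayerImportancePruneConfig. Whether the library performs comparable validation is not stated in this reference.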
- class quark.torch.pruning.config.BlockwiseTuningConfig(name: str = 'blockwise_tuning', epochs: int = 5, weight_lr: float = 0.0001, weight_decay: float = 0.0, min_lr_factor: float = 20.0, max_grad_norm: float = 0.3, model_decoder_layers: str = '', trainable_modules: list[str] = [])
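The classes on this page can be combined into a single Config, as sketched below using only the keyword arguments shown in the signatures above. The fallback dataclasses are stand-ins (not Quark code) that let the snippet run when Quark is not installed, and they carry only the fields used here. The value "model.layers" is a hypothetical attribute path for a Llama-style model.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

try:
    from quark.torch.pruning.config import (
        BlockwiseTuningConfig,
        Config,
        OSSCARConfig,
    )
except ImportError:
    # Stand-ins mirroring part of the documented signatures.
    @dataclass
    class OSSCARConfig:
        name: str = "osscar"
        mlp_pruning_ratio: float = 0.1
        model_decoder_layers: str = ""

    @dataclass
    class BlockwiseTuningConfig:
        name: str = "blockwise_tuning"
        epochs: int = 5
        weight_lr: float = 0.0001
        model_decoder_layers: str = ""
        trainable_modules: List[str] = field(default_factory=list)

    @dataclass
    class Config:
        algo_config: Optional[Any] = None
        blockwise_tuning_config: Optional[Any] = None
        log_severity_level: Optional[int] = 1


DECODER_LAYERS = "model.layers"  # hypothetical path, adjust to the model

config = Config(
    algo_config=OSSCARConfig(
        mlp_pruning_ratio=0.2,
        model_decoder_layers=DECODER_LAYERS,
    ),
    blockwise_tuning_config=BlockwiseTuningConfig(
        epochs=3,
        model_decoder_layers=DECODER_LAYERS,
    ),
    log_severity_level=2,  # WARNING
)
print(config.algo_config.name, config.blockwise_tuning_config.name)
```

Arguments left out keep the defaults listed in the signatures, for example damp_percent=0.01 for OSSCARConfig and weight_lr=0.0001 for BlockwiseTuningConfig.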