# Apertus-8B-Instruct-2509-W8A8

This is an INT8 dynamically quantized (W8A8) version of [swiss-ai/Apertus-8B-Instruct-2509](https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509), produced with [llm-compressor](https://github.com/vllm-project/llm-compressor).

Because activation quantization is dynamic (activation scales are computed at inference time), no calibration data was used.
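In W8A8 dynamic quantization, weight scales are fixed once per output channel, while activation scales are recomputed per token at runtime. A minimal numpy sketch of that arithmetic (illustrative only, not llm-compressor's actual kernels):

```python
import numpy as np

np.random.seed(0)

def quantize_symmetric_int8(x, axis):
    """Symmetric INT8 quantization: scale each slice by its max magnitude."""
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

# Weights: per-output-channel scales, computed once ("static")
W = np.random.randn(16, 8).astype(np.float32)
Wq, w_scale = quantize_symmetric_int8(W, axis=1)

# Activations: per-token scales, computed at inference time ("dynamic")
X = np.random.randn(4, 8).astype(np.float32)
Xq, x_scale = quantize_symmetric_int8(X, axis=1)

# INT8 matmul accumulated in INT32, then dequantized back to float
Y_int = Xq.astype(np.int32) @ Wq.astype(np.int32).T
Y = Y_int * x_scale * w_scale.T          # (4,1) and (1,16) scales broadcast

Y_ref = X @ W.T
err = np.max(np.abs(Y - Y_ref)) / np.max(np.abs(Y_ref))
print(f"max relative error: {err:.4f}")
```

The dequantized output closely tracks the full-precision reference, which is why W8A8 can skip calibration: no activation statistics need to be collected ahead of time.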

## Quantization Details

- **Quantization scheme:** W8A8
- **Method:** dynamic quantization of weights and activations to INT8 using `GPTQModifier`
- **Targets:** all `Linear` layers
- **Ignored layers:** `lm_head` (kept in higher precision for better output quality)
- **Tool:** [llm-compressor](https://github.com/vllm-project/llm-compressor)
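For reference, a recipe matching the details above might look like the following llm-compressor-style YAML. This is a sketch reconstructed from the listed settings, not the author's actual recipe file, and the stage/key names are assumptions:

```yaml
quant_stage:
  quant_modifiers:
    GPTQModifier:
      targets: ["Linear"]
      ignore: ["lm_head"]
      scheme: "W8A8"
```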
