Apertus-8B-Instruct-2509-W8A8-CALIBRATED

This is an INT8 quantized (W8A8) version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.

This variant was calibrated on the fineweb-edu-score-2 dataset.
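Because the checkpoint is stored in the compressed-tensors format that llm-compressor emits, it can be served with an inference engine that supports that format, such as vLLM. A minimal loading sketch (the repo id is taken from this card; sampling settings are illustrative, not part of the release):

```python
# Hypothetical usage sketch: serving the W8A8 checkpoint with vLLM,
# which understands compressed-tensors quantized weights.
from vllm import LLM, SamplingParams

llm = LLM(model="sevri/Apertus-8B-Instruct-2509-W8A8-CALIBRATED")

params = SamplingParams(temperature=0.7, max_tokens=128)  # illustrative values
outputs = llm.generate(["Explain INT8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Any engine without compressed-tensors support would need the weights decompressed back to BF16 first.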

Quantization Details

  • Quantization Scheme: W8A8
  • Method: Weights quantized to INT8 with GPTQ (GPTQModifier); activations quantized dynamically to INT8 at runtime (W8A8)
  • Targets: All Linear layers
  • Ignored Layers: lm_head (kept in higher precision for better output quality)
  • Tool: llm-compressor
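The settings above can be sketched as an llm-compressor one-shot recipe. This is a reconstruction from the bullet points, not the exact script used to produce the checkpoint; the calibration hyperparameters (sample count, sequence length) and the dataset identifier passed to `oneshot` are assumptions:

```python
# Hypothetical reproduction sketch of the recipe described in this card.
# Requires a GPU and the llm-compressor package.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(
    targets="Linear",    # all Linear layers, per the card
    scheme="W8A8",       # INT8 weights + dynamic INT8 activations
    ignore=["lm_head"],  # kept in higher precision for output quality
)

oneshot(
    model="swiss-ai/Apertus-8B-Instruct-2509",
    dataset="fineweb-edu-score-2",  # assumed dataset identifier
    recipe=recipe,
    max_seq_length=2048,            # assumed calibration settings
    num_calibration_samples=512,    # assumed calibration settings
    output_dir="Apertus-8B-Instruct-2509-W8A8-CALIBRATED",
)
```

The `ignore=["lm_head"]` choice matches the card: quantizing the output projection tends to hurt generation quality more than quantizing the inner Linear layers.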
