Apertus-8B-Instruct-2509-W8A8-CALIBRATED
This is an INT8 (W8A8) quantized version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.
This version was calibrated on the fineweb-edu-score-2 dataset.
Quantization Details
- Quantization Scheme: W8A8
- Method: Weights quantized to INT8 via GPTQModifier; activations quantized dynamically to INT8 (W8A8)
- Targets: All Linear layers
- Ignored Layers: lm_head (kept in higher precision for better output quality)
- Tool: llm-compressor
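The settings above correspond to a standard llm-compressor W8A8 recipe. As a minimal sketch, such a recipe could be expressed in YAML roughly as follows; the exact field names and values are assumptions based on typical llm-compressor W8A8 recipes, not the recipe actually used for this model:

```yaml
# Hypothetical llm-compressor recipe approximating the settings above:
# INT8 weights (GPTQ) plus dynamic INT8 activations, with lm_head skipped.
quant_stage:
  quant_modifiers:
    GPTQModifier:
      ignore: ["lm_head"]          # keep the output head in higher precision
      config_groups:
        group_0:
          targets: ["Linear"]      # quantize all Linear layers
          weights:
            num_bits: 8
            type: int
            symmetric: true
            strategy: channel      # per-channel weight scales
          input_activations:
            num_bits: 8
            type: int
            dynamic: true          # activation scales computed at runtime
            strategy: token        # per-token activation scales
```

A recipe like this would be passed to llm-compressor's oneshot entry point together with the base model and the calibration dataset.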
Model tree for sevri/Apertus-8B-Instruct-2509-W8A8-CALIBRATED
- Base model: swiss-ai/Apertus-8B-2509
- Finetuned: swiss-ai/Apertus-8B-Instruct-2509