Apertus-8B-Instruct-2509-W8A8-CALIBRATED

This is an INT8 quantized (W8A8) version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.

This variant was calibrated on the fineweb-edu-score-2 dataset.
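Because the checkpoint is stored in the compressed-tensors format that llm-compressor emits, it can be served with an inference engine that supports that format, such as vLLM. A minimal loading sketch (the repo id is taken from this card; sampling settings are illustrative, not part of the release):

```python
# Hypothetical usage sketch: serving the W8A8 checkpoint with vLLM,
# which understands compressed-tensors quantized weights.
from vllm import LLM, SamplingParams

llm = LLM(model="sevri/Apertus-8B-Instruct-2509-W8A8-CALIBRATED")

params = SamplingParams(temperature=0.7, max_tokens=128)  # illustrative values
outputs = llm.generate(["Explain INT8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Any engine without compressed-tensors support would need the weights decompressed back to BF16 first.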

Quantization Details

  • Quantization Scheme: W8A8
  • Method: Weights quantized to INT8 with GPTQ (GPTQModifier); activations quantized dynamically to INT8 at runtime (W8A8)
  • Targets: All Linear layers
  • Ignored Layers: lm_head (kept in higher precision for better output quality)
  • Tool: llm-compressor
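The settings above can be sketched as an llm-compressor one-shot recipe. This is a reconstruction from the bullet points, not the exact script used to produce the checkpoint; the calibration hyperparameters (sample count, sequence length) and the dataset identifier passed to `oneshot` are assumptions:

```python
# Hypothetical reproduction sketch of the recipe described in this card.
# Requires a GPU and the llm-compressor package.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(
    targets="Linear",    # all Linear layers, per the card
    scheme="W8A8",       # INT8 weights + dynamic INT8 activations
    ignore=["lm_head"],  # kept in higher precision for output quality
)

oneshot(
    model="swiss-ai/Apertus-8B-Instruct-2509",
    dataset="fineweb-edu-score-2",  # assumed dataset identifier
    recipe=recipe,
    max_seq_length=2048,            # assumed calibration settings
    num_calibration_samples=512,    # assumed calibration settings
    output_dir="Apertus-8B-Instruct-2509-W8A8-CALIBRATED",
)
```

The `ignore=["lm_head"]` choice matches the card: quantizing the output projection tends to hurt generation quality more than quantizing the inner Linear layers.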
