# Apertus-8B-Instruct-2509-W8A8

This is an INT8 dynamically quantized (W8A8) version of [swiss-ai/Apertus-8B-Instruct-2509](https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509), produced with [llm-compressor](https://github.com/vllm-project/llm-compressor).

Because activation quantization is dynamic (activation scales are computed at inference time), no calibration data was used.
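In W8A8 dynamic quantization, weight scales are fixed once per output channel, while activation scales are recomputed per token at runtime. A minimal numpy sketch of that arithmetic (illustrative only, not llm-compressor's actual kernels):

```python
import numpy as np

np.random.seed(0)

def quantize_symmetric_int8(x, axis):
    """Symmetric INT8 quantization: scale each slice by its max magnitude."""
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

# Weights: per-output-channel scales, computed once ("static")
W = np.random.randn(16, 8).astype(np.float32)
Wq, w_scale = quantize_symmetric_int8(W, axis=1)

# Activations: per-token scales, computed at inference time ("dynamic")
X = np.random.randn(4, 8).astype(np.float32)
Xq, x_scale = quantize_symmetric_int8(X, axis=1)

# INT8 matmul accumulated in INT32, then dequantized back to float
Y_int = Xq.astype(np.int32) @ Wq.astype(np.int32).T
Y = Y_int * x_scale * w_scale.T          # (4,1) and (1,16) scales broadcast

Y_ref = X @ W.T
err = np.max(np.abs(Y - Y_ref)) / np.max(np.abs(Y_ref))
print(f"max relative error: {err:.4f}")
```

The dequantized output closely tracks the full-precision reference, which is why W8A8 can skip calibration: no activation statistics need to be collected ahead of time.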

## Quantization Details

- **Quantization scheme:** W8A8
- **Method:** dynamic quantization of weights and activations to INT8 using `GPTQModifier`
- **Targets:** all `Linear` layers
- **Ignored layers:** `lm_head` (kept in higher precision for better output quality)
- **Tool:** [llm-compressor](https://github.com/vllm-project/llm-compressor)
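For reference, a recipe matching the details above might look like the following llm-compressor-style YAML. This is a sketch reconstructed from the listed settings, not the author's actual recipe file, and the stage/key names are assumptions:

```yaml
quant_stage:
  quant_modifiers:
    GPTQModifier:
      targets: ["Linear"]
      ignore: ["lm_head"]
      scheme: "W8A8"
```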
