
# Apertus-8B-Instruct-2509-FP8-BLOCK

This is an FP8 block-wise quantized version of swiss-ai/Apertus-8B-Instruct-2509 using llm-compressor.
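The card does not include the exact recipe used, but with llm-compressor a block-wise FP8 weight scheme of this kind is typically expressed as a one-shot YAML recipe along these lines (a sketch only; the stage name and the regex for the ignored gate layers are assumptions, not the recipe actually used):

```yaml
# Hypothetical llm-compressor recipe sketch; layer names/regexes are assumptions.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]
      scheme: "FP8_BLOCK"
      ignore: ["lm_head", "re:.*mlp\\.gate$"]
```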

## Quantization Details

- **Quantization Scheme:** FP8_BLOCK
- **Method:** block-wise static FP8 quantization of weights (one scale per weight block)
- **Targets:** all `Linear` layers
- **Ignored Layers:** `lm_head`, `mlp.gate` (kept in higher precision to preserve output quality)
- **Tool:** llm-compressor
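Conceptually, block-wise static FP8 assigns each weight tile its own scale so that the tile's largest magnitude maps onto the FP8 E4M3 range. A minimal NumPy sketch of the idea (illustrative only, not llm-compressor's implementation; a 128×128 block size is an assumption, and real FP8 additionally rounds mantissas to 3 bits, which clipping alone does not model):

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def blockwise_scales(weight, block=128):
    """One static scale per (block x block) tile: scale = amax / E4M3 max."""
    rows, cols = weight.shape
    r_blocks = -(-rows // block)  # ceil division
    c_blocks = -(-cols // block)
    scales = np.zeros((r_blocks, c_blocks), dtype=np.float32)
    for i in range(r_blocks):
        for j in range(c_blocks):
            tile = weight[i * block:(i + 1) * block, j * block:(j + 1) * block]
            amax = np.abs(tile).max()
            scales[i, j] = (amax / FP8_E4M3_MAX) if amax > 0 else 1.0
    return scales


def fake_quantize(weight, scales, block=128):
    """Scale each tile into the E4M3 range, clip, then rescale back.

    This captures the per-block dynamic-range handling; true FP8 casting
    would also round each value to the nearest e4m3-representable number.
    """
    out = np.empty_like(weight, dtype=np.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            rs, cs = i * block, j * block
            tile = weight[rs:rs + block, cs:cs + block]
            q = np.clip(tile / scales[i, j], -FP8_E4M3_MAX, FP8_E4M3_MAX)
            out[rs:rs + block, cs:cs + block] = q * scales[i, j]
    return out
```

Because each scale is derived from its own tile's maximum, an outlier in one block cannot crush the precision of the rest of the matrix, which is the main advantage of block-wise over per-tensor FP8.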
## Model Details

- **Model size:** 8B params
- **Tensor types:** BF16, F8_E4M3 (Safetensors)