
# Apertus-8B-Instruct-2509-FP8-BLOCK

This is an FP8 block-wise quantized version of swiss-ai/Apertus-8B-Instruct-2509 using llm-compressor.
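The card does not include the exact recipe used, but with llm-compressor a block-wise FP8 weight scheme of this kind is typically expressed as a one-shot YAML recipe along these lines (a sketch only; the stage name and the regex for the ignored gate layers are assumptions, not the recipe actually used):

```yaml
# Hypothetical llm-compressor recipe sketch; layer names/regexes are assumptions.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]
      scheme: "FP8_BLOCK"
      ignore: ["lm_head", "re:.*mlp\\.gate$"]
```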

## Quantization Details

- **Quantization Scheme:** FP8_BLOCK
- **Method:** block-wise static FP8 quantization of weights (one scale per weight block)
- **Targets:** all `Linear` layers
- **Ignored Layers:** `lm_head`, `mlp.gate` (kept in higher precision to preserve output quality)
- **Tool:** llm-compressor
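Conceptually, block-wise static FP8 assigns each weight tile its own scale so that the tile's largest magnitude maps onto the FP8 E4M3 range. A minimal NumPy sketch of the idea (illustrative only, not llm-compressor's implementation; a 128×128 block size is an assumption, and real FP8 additionally rounds mantissas to 3 bits, which clipping alone does not model):

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def blockwise_scales(weight, block=128):
    """One static scale per (block x block) tile: scale = amax / E4M3 max."""
    rows, cols = weight.shape
    r_blocks = -(-rows // block)  # ceil division
    c_blocks = -(-cols // block)
    scales = np.zeros((r_blocks, c_blocks), dtype=np.float32)
    for i in range(r_blocks):
        for j in range(c_blocks):
            tile = weight[i * block:(i + 1) * block, j * block:(j + 1) * block]
            amax = np.abs(tile).max()
            scales[i, j] = (amax / FP8_E4M3_MAX) if amax > 0 else 1.0
    return scales


def fake_quantize(weight, scales, block=128):
    """Scale each tile into the E4M3 range, clip, then rescale back.

    This captures the per-block dynamic-range handling; true FP8 casting
    would also round each value to the nearest e4m3-representable number.
    """
    out = np.empty_like(weight, dtype=np.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            rs, cs = i * block, j * block
            tile = weight[rs:rs + block, cs:cs + block]
            q = np.clip(tile / scales[i, j], -FP8_E4M3_MAX, FP8_E4M3_MAX)
            out[rs:rs + block, cs:cs + block] = q * scales[i, j]
    return out
```

Because each scale is derived from its own tile's maximum, an outlier in one block cannot crush the precision of the rest of the matrix, which is the main advantage of block-wise over per-tensor FP8.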
## Model Details

- **Model size:** 8B params
- **Tensor types:** BF16, F8_E4M3 (Safetensors)