# Apertus-8B-Instruct-2509-FP8-BLOCK
This is an FP8 block-wise quantized version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.
## Quantization Details
- Quantization Scheme: FP8_BLOCK
- Method: Block-wise FP8 quantization of weights (per-block, static)
- Targets: All Linear layers
- Ignored Layers: `lm_head`, `mlp.gate` (kept in higher precision for better output quality)
- Tool: llm-compressor
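The settings above can be sketched as a one-shot llm-compressor recipe. This is a minimal reproduction sketch, not the exact script used to produce this checkpoint; the `re:.*mlp.gate$` ignore pattern and output directory name are illustrative assumptions.

```python
# Reproduction sketch: block-wise static FP8 weight quantization with llm-compressor.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "swiss-ai/Apertus-8B-Instruct-2509"
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Quantize all Linear layers with the FP8_BLOCK scheme, keeping lm_head
# and the mlp.gate layers in higher precision (ignore pattern is illustrative).
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_BLOCK",
    ignore=["lm_head", "re:.*mlp.gate$"],
)

# Weight-only static quantization needs no calibration dataset.
oneshot(model=model, recipe=recipe)

SAVE_DIR = "Apertus-8B-Instruct-2509-FP8-BLOCK"
model.save_pretrained(SAVE_DIR)
tokenizer.save_pretrained(SAVE_DIR)
```

The saved checkpoint stores weights in compressed-tensors format and can be loaded by runtimes that support FP8 block quantization.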