ShowUI-2B 6bit

This is a 6-bit quantized MLX conversion of showlab/ShowUI-2B, optimized for Apple Silicon.

ShowUI is a lightweight 2B vision-language-action model designed for GUI agents. Upstream, it targets GUI grounding and UI navigation, producing point-style localizations and drawing on atomic action dictionaries over screenshots.

This artifact was derived from the validated local MLX bf16 reference conversion and then quantized with mlx-vlm. It was validated locally with both mlx_vlm prompt-packet checks and vllm-mlx OpenAI-compatible serve checks.

Conversion Details

Field                                Value
Upstream model                       showlab/ShowUI-2B
Artifact type                        6-bit quantized MLX conversion
Source artifact                      local validated bf16 MLX artifact
Repo action                          update existing mlx-community repo
Conversion tool                      mlx_vlm.convert via mlx-vlm 0.3.12
Python                               3.11.14
MLX                                  0.31.0
Transformers                         5.2.0
Validation backend                   vllm-mlx (phase/p1 @ 8a5d41b)
Quantization                         6-bit
Group size                           64
Quantization mode                    affine
Converter dtype note                 bfloat16
Reported effective bits per weight   9.088
Artifact size                        2.60 GB
Template repair                      tokenizer_config.json["chat_template"] re-injected after quantization
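The reported 9.088 effective bits per weight exceeds the raw 6-bit width because group-wise affine quantization stores per-group metadata, and some tensors are typically left unquantized. As a sanity check (a sketch, assuming a 16-bit scale and 16-bit bias per group, which is the usual MLX affine layout), a fully quantized tensor at 6 bits with group size 64 costs 6 + 32/64 = 6.5 bits per weight; the remaining gap up to 9.088 comes from tensors kept at higher precision:

```python
# Per-weight cost of group-wise affine quantization: each group of
# `group_size` weights stores `bits`-bit values plus one scale and
# one bias for the whole group (assumed 16 bits each here).
def effective_bits(bits: int, group_size: int,
                   scale_bits: int = 16, bias_bits: int = 16) -> float:
    return bits + (scale_bits + bias_bits) / group_size

print(effective_bits(6, 64))  # 6.5 bits for fully quantized tensors
```

Anything above that floor reflects layers the converter skipped or kept in bf16.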

Additional notes:

  • This quantized artifact inherits the fresh-source posture of the validated local bf16 base artifact.
  • chat_template.json, chat_template.jinja, and tokenizer_config.json["chat_template"] were kept aligned after quantization.
  • This family was validated on the Track B packet revision aligned to ShowUI's native point/action contract.

Validation

This artifact passed local validation in this workspace:

  • mlx_vlm prompt-packet validation: PASS
  • vllm-mlx OpenAI-compatible serve validation: PASS

Local validation notes:

  • All four Track B packet prompts matched the local bf16 outputs exactly.
  • The same coordinate drift between non-streamed and streamed serve outputs remained present.
  • No new regression appeared in packet shape, multimodal detection, or the serve path after quantization.

Performance

  • Artifact size on disk: 2.60 GB
  • Local fixed-packet mlx_vlm validation used about 4.35 GB peak memory
  • Local vllm-mlx serve validation completed in about 20.15 s non-streamed and 21.13 s streamed

These are local validation measurements, not a full benchmark suite.

Usage

Install

pip install -U mlx-vlm

CLI

python -m mlx_vlm.generate \
  --model mlx-community/ShowUI-2B-6bit-v2 \
  --image path/to/image.png \
  --prompt "Based on the screenshot, return the clickable location for the API Host field as [x, y] on a 0-1 scale." \
  --max-tokens 128 \
  --temperature 0.0
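The prompt above asks for a click point as [x, y] on a 0-1 scale, so the model's text output must be scaled to pixel coordinates before a click can be dispatched. A minimal, hypothetical post-processing helper (the regex and rounding are illustration choices, not part of the model contract):

```python
import re

def parse_point(text: str, width: int, height: int) -> tuple[int, int]:
    """Extract the first "[x, y]" pair (0-1 scale) and map it to pixels."""
    m = re.search(r"\[\s*([0-9]*\.?[0-9]+)\s*,\s*([0-9]*\.?[0-9]+)\s*\]", text)
    if m is None:
        raise ValueError(f"no [x, y] point found in: {text!r}")
    x, y = float(m.group(1)), float(m.group(2))
    return round(x * width), round(y * height)

print(parse_point("[0.42, 0.18]", 1920, 1080))  # (806, 194)
```

On a 1920x1080 screenshot, a normalized point of [0.42, 0.18] lands at pixel (806, 194).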

Python

from mlx_vlm import load, generate

model, processor = load("mlx-community/ShowUI-2B-6bit-v2")
result = generate(
    model,
    processor,
    prompt="Based on the screenshot, return the clickable location for the API Host field as [x, y] on a 0-1 scale.",
    image="path/to/image.png",
    max_tokens=128,
    temperature=0.0,
)
print(result.text)

vllm-mlx Serve

python -m vllm_mlx.cli serve mlx-community/ShowUI-2B-6bit-v2 --mllm --localhost --port 8000

Links

Other Quantizations

Planned sibling repos in this wave:

Notes and Limitations

  • This card reports local MLX conversion and validation results only.
  • Upstream benchmark claims belong to the original ShowUI model family and were not re-run here unless explicitly stated.
  • This family remains tied to the Track B point/action packet rather than the Track A bounding-box packet.
  • The original mlx-community/ShowUI-2B-bf16-6bit repo already existed, so this refreshed artifact is published under the -v2 repo id.

Citation

If you use this MLX conversion, please also cite the original ShowUI work:

@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465},
}

License

This repo follows the upstream model license: MIT. See the upstream model card for the authoritative license details: showlab/ShowUI-2B.

Base model: Qwen/Qwen2-VL-2B