Error when loading Qwen3-VL-30B-A3B-Instruct-AWQ with transformers

#3
by dfg543 - opened

I tried to load this model with transformers, but encountered an import error during initialization.

ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/activations.py)

Full log:

(Qwen3-VL) [root@Arc-AI /mnt/241hdd/wzr/Qwen3-VL]$ python demo_awq.py 
/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/__init__.py:21: DeprecationWarning: 
I have left this message as the final dev message to help you transition.

Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used Torch 2.6.0 and Transformers 4.51.3.
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.

Alternative:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor

For further inquiries, feel free to reach out:
- X: https://x.com/casper_hansen_
- LinkedIn: https://www.linkedin.com/in/casper-hansen-804005170/

  warnings.warn(_FINAL_DEV_MESSAGE, category=DeprecationWarning, stacklevel=1)
Traceback (most recent call last):
  File "/mnt/241hdd/wzr/Qwen3-VL/demo_awq.py", line 11, in <module>
    model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 277, in _wrapper
    return func(*args, **kwargs)
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 5001, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/quantizers/base.py", line 225, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_awq.py", line 119, in _process_model_before_weight_loading
    model, has_been_replaced = replace_with_awq_linear(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/integrations/awq.py", line 134, in replace_with_awq_linear
    from awq.modules.linear.gemm import WQLinear_GEMM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/__init__.py", line 24, in <module>
    from awq.models.auto import AutoAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/__init__.py", line 1, in <module>
    from .mpt import MptAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/mpt.py", line 1, in <module>
    from .base import BaseAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/base.py", line 49, in <module>
    from awq.quantize.quantizer import AwqQuantizer
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 11, in <module>
    from awq.quantize.scale import apply_scale, apply_clip
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/quantize/scale.py", line 12, in <module>
    from transformers.activations import NewGELUActivation, PytorchGELUTanh, GELUActivation
ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/activations.py)
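The failing import is `PytorchGELUTanh`, which newer transformers releases no longer export from `transformers.activations`. As an untested workaround sketch, one can recreate the class and patch it back before anything imports `awq`. The implementation below is an assumption based on the class name: a thin `nn.Module` wrapper around the tanh-approximated GELU.

```python
import torch


class PytorchGELUTanh(torch.nn.Module):
    """Stand-in for the class removed from transformers.activations.

    Assumed behavior: a thin wrapper around PyTorch's tanh-approximated
    GELU (this matches the name, but is not verified against the old source).
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.gelu(x, approximate="tanh")


# To make `from transformers.activations import PytorchGELUTanh` succeed
# again, patch the symbol back *before* awq is imported, e.g.:
#
#   import transformers.activations as ta
#   ta.PytorchGELUTanh = PytorchGELUTanh
```

Whether the rest of autoawq 0.2.9 then works against transformers 4.57.0 is a separate question; the deprecation notice above only guarantees 4.51.3.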

Here is my code:

import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
from transformers import Qwen3VLMoeForConditionalGeneration, AutoProcessor

model_path = "/mnt/241hdd/wzr/hub/QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ"
# model_path = "/mnt/241hdd/wzr/hub/cpatonn/Qwen3-VL-30B-A3B-Instruct-AWQ-4bit"

model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
    model_path,
    device_map="auto",
    dtype=torch.float16,
)

processor = AutoProcessor.from_pretrained(model_path)

prompt = """You are an expert in food and nutrition...(your original prompt content)"""

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "image_049_1024.jpg"},
            {"type": "text", "text": prompt},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)  # move inputs onto the model's device before generate()

generated_ids = model.generate(**inputs, max_new_tokens=2048)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text[0])

Dependencies (from my pyproject.toml):

dependencies = [
    "accelerate>=1.10.1",
    "autoawq==0.2.9",
    "av==15.1.0",
    "compressed-tensors>=0.12.2",
    "flash-attn>=2.8.3",
    "gguf>=0.17.1",
    "gradio==5.46.1",
    "gradio-client==1.13.1",
    "ipykernel>=7.0.0",
    "llama-cpp-python>=0.3.16",
    "mistral-common>=1.8.5",
    "modelscope>=1.30.0",
    "openpyxl>=3.1.5",
    "qwen-vl-utils",
    "sentencepiece>=0.2.1",
    "setuptools>=80.9.0",
    "torch==2.5.1",
    "torchaudio==2.5.1",
    "torchvision==0.20.1",
    "transformers==4.57.0",
    "transformers-stream-generator==0.0.5",
]

Confirmed: the latest transformers==4.57.0 fails with the same error message:
failed: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/opt/conda/lib/python3.11/site-packages/transformers/activations.py)
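A quick diagnostic (my sketch, not part of the original report) to check whether the installed transformers still exposes the symbols that `awq/quantize/scale.py` imports in the traceback above:

```python
import importlib

# Symbols that awq/quantize/scale.py tries to import (per the traceback).
NEEDED = ["NewGELUActivation", "PytorchGELUTanh", "GELUActivation"]


def missing_activation_symbols(module_name: str = "transformers.activations") -> list[str]:
    """Return the subset of NEEDED that the given module does not export."""
    mod = importlib.import_module(module_name)
    return [name for name in NEEDED if not hasattr(mod, name)]
```

If `missing_activation_symbols()` returns a non-empty list, the installed transformers is too new for autoawq 0.2.9.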

QuantTrio org

This repo is meant to be loaded with vLLM; loading it with raw transformers has not been tested.
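For reference, a minimal sketch of serving this checkpoint with vLLM's OpenAI-compatible server; the flag values are illustrative assumptions, not a tested command for this exact setup:

```shell
# Illustrative: serve the AWQ checkpoint with vLLM.
# --tensor-parallel-size 2 assumes the two GPUs used in the report above.
vllm serve QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ \
    --tensor-parallel-size 2
```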
