Error when loading Qwen3-VL-30B-A3B-Instruct-AWQ with transformers
#3
by dfg543 - opened
I tried to load this model with transformers, but encountered an import error during initialization:
ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/activations.py)
Full log:

```
(Qwen3-VL) [root@Arc-AI /mnt/241hdd/wzr/Qwen3-VL]$ python demo_awq.py
/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/__init__.py:21: DeprecationWarning:
I have left this message as the final dev message to help you transition.

Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used Torch 2.6.0 and Transformers 4.51.3.
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.

Alternative:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor

For further inquiries, feel free to reach out:
- X: https://x.com/casper_hansen_
- LinkedIn: https://www.linkedin.com/in/casper-hansen-804005170/

  warnings.warn(_FINAL_DEV_MESSAGE, category=DeprecationWarning, stacklevel=1)
Traceback (most recent call last):
  File "/mnt/241hdd/wzr/Qwen3-VL/demo_awq.py", line 11, in <module>
    model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 277, in _wrapper
    return func(*args, **kwargs)
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 5001, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/quantizers/base.py", line 225, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_awq.py", line 119, in _process_model_before_weight_loading
    model, has_been_replaced = replace_with_awq_linear(
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/integrations/awq.py", line 134, in replace_with_awq_linear
    from awq.modules.linear.gemm import WQLinear_GEMM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/__init__.py", line 24, in <module>
    from awq.models.auto import AutoAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/__init__.py", line 1, in <module>
    from .mpt import MptAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/mpt.py", line 1, in <module>
    from .base import BaseAWQForCausalLM
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/models/base.py", line 49, in <module>
    from awq.quantize.quantizer import AwqQuantizer
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 11, in <module>
    from awq.quantize.scale import apply_scale, apply_clip
  File "/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/awq/quantize/scale.py", line 12, in <module>
    from transformers.activations import NewGELUActivation, PytorchGELUTanh, GELUActivation
ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/mnt/241hdd/wzr/Qwen3-VL/.venv/lib/python3.10/site-packages/transformers/activations.py)
```
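The root cause is visible at the bottom of the traceback: AutoAWQ 0.2.9 imports `PytorchGELUTanh` from `transformers.activations` at import time, but newer transformers releases (including 4.57.0 here) removed that class; in older releases it was a thin wrapper around the tanh-approximated GELU. One possible workaround, sketched below and untested against this exact setup, is to re-add an equivalent symbol before anything imports `awq`:

```python
# Compatibility shim (sketch): newer transformers releases dropped
# PytorchGELUTanh from transformers.activations, which AutoAWQ 0.2.9
# still imports at module load. Re-adding an equivalent class before
# `awq` is imported (i.e. before from_pretrained triggers the AWQ
# integration) avoids the ImportError. Assumption: the removed class
# was equivalent to nn.GELU with the tanh approximation.
import torch
import torch.nn as nn
import transformers.activations as activations

if not hasattr(activations, "PytorchGELUTanh"):

    class PytorchGELUTanh(nn.Module):
        """Tanh-approximated GELU, matching the removed class."""

        def forward(self, input: torch.Tensor) -> torch.Tensor:
            return nn.functional.gelu(input, approximate="tanh")

    activations.PytorchGELUTanh = PytorchGELUTanh
```

Run this at the very top of the script, before the `from_pretrained` call; `import awq` afterwards then finds the symbol it expects.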
Here is my code:

```python
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
from transformers import Qwen3VLMoeForConditionalGeneration, AutoProcessor

model_path = "/mnt/241hdd/wzr/hub/QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ"
# model_path = "/mnt/241hdd/wzr/hub/cpatonn/Qwen3-VL-30B-A3B-Instruct-AWQ-4bit"

model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
    model_path,
    device_map="auto",
    dtype=torch.float16,
)
processor = AutoProcessor.from_pretrained(model_path)

prompt = """You are an expert in food and nutrition... (your original prompt content)"""

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "image_049_1024.jpg"},
            {"type": "text", "text": prompt},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
generated_ids = model.generate(**inputs, max_new_tokens=2048)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text[0])
```
Dependencies:

```toml
dependencies = [
    "accelerate>=1.10.1",
    "autoawq==0.2.9",
    "av==15.1.0",
    "compressed-tensors>=0.12.2",
    "flash-attn>=2.8.3",
    "gguf>=0.17.1",
    "gradio==5.46.1",
    "gradio-client==1.13.1",
    "ipykernel>=7.0.0",
    "llama-cpp-python>=0.3.16",
    "mistral-common>=1.8.5",
    "modelscope>=1.30.0",
    "openpyxl>=3.1.5",
    "qwen-vl-utils",
    "sentencepiece>=0.2.1",
    "setuptools>=80.9.0",
    "torch==2.5.1",
    "torchaudio==2.5.1",
    "torchvision==0.20.1",
    "transformers==4.57.0",
    "transformers-stream-generator==0.0.5",
]
```
Confirmed, also with the latest transformers==4.57.0, and with the same error message: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/opt/conda/lib/python3.11/site-packages/transformers/activations.py)
This repo is meant to be loaded with vLLM; loading it with raw transformers has not been tested.
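For reference, a vLLM invocation could look like the sketch below; the model id and flags are illustrative, not a tested configuration for this checkpoint:

```shell
# Serve the AWQ checkpoint with vLLM's OpenAI-compatible server instead
# of raw transformers. Adjust the model path, GPU count, and context
# length to your environment (flags shown are illustrative).
vllm serve QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ \
    --tensor-parallel-size 2 \
    --max-model-len 32768
```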