[Bug] Unable to create Qwen3 MoE model #444

@casper-hansen

How do I get ART to instantiate the model with FastModel instead of FastLanguageModel in Unsloth? The Unsloth docs say: "If you're fine-tuning the MOE models, please use FastModel and not FastLanguageModel". I am running into a model loading failure with the configuration below; the full traceback follows it.

import art

# `backend` is defined elsewhere in train.py (not shown in this report).
model = art.TrainableModel(
    name="001-script",
    project="testing",
    base_model="Qwen/Qwen3-30B-A3B-Thinking-2507",
    _internal_config=art.dev.InternalModelConfig(
        init_args=art.dev.InitArgs(
            load_in_4bit=False,
            max_seq_length=65536,
        ),
        engine_args=art.dev.EngineArgs(
            max_model_len=65536,
            tensor_parallel_size=8,
            gpu_memory_utilization=0.75,
        ),
    ),
)
# Called from inside the async train() function, as the traceback shows.
await model.register(backend)

Error:

Traceback (most recent call last):
  File "/root/openpipe/train.py", line 337, in <module>
    asyncio.run(train())
  File "/root/openpipe/.venv/lib/python3.11/site-packages/nest_asyncio.py", line 30, in run
    return loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
           ^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/futures.py", line 203, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.11/asyncio/tasks.py", line 277, in __step
    result = coro.send(None)
             ^^^^^^^^^^^^^^^
  File "/root/openpipe/train.py", line 296, in train
    await model.register(backend)
  File "/root/openpipe/.venv/lib/python3.11/site-packages/art/model.py", line 335, in register
    base_url, api_key = await backend._prepare_backend_for_training(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/art/local/backend.py", line 282, in _prepare_backend_for_training
    await service.start_openai_server(config=config)
  File "/root/openpipe/.venv/lib/python3.11/site-packages/mp_actors/traceback.py", line 26, in async_wrapper
    raise e.with_traceback(streamlined_traceback())
  File "/root/openpipe/.venv/lib/python3.11/site-packages/art/unsloth/service.py", line 60, in start_openai_server
    self.state.trainer.save_model(lora_path)
^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/art/unsloth/service.py", line 45, in state
    return ModelState(self.config)
  ^^^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/art/unsloth/state.py", line 82, in __init__
    unsloth.FastLanguageModel.from_pretrained(**config.get("init_args", {})),
^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/unsloth/models/loader.py", line 397, in from_pretrained
    return FastModel.from_pretrained(
^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/unsloth/models/loader.py", line 930, in from_pretrained
    model, tokenizer = FastBaseModel.from_pretrained(
  ^^^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/unsloth/models/vision.py", line 621, in from_pretrained
    _, quant_state_dict = get_vllm_state_dict(
^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/unsloth_zoo/vllm_utils.py", line 960, in get_vllm_state_dict
    proj = layer.mlp.gate_up_proj
  ^^^^^^^^^^^^^^^^^
  File "/root/openpipe/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1940, in __getattr__
    raise AttributeError(
  ^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen3MoeSparseMoeBlock' object has no attribute 'gate_up_proj'
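
A note on reading the traceback: unsloth/models/loader.py line 397 shows that FastLanguageModel.from_pretrained already forwards to FastModel.from_pretrained, so the failure appears to come later, when get_vllm_state_dict reads layer.mlp.gate_up_proj on a block that does not expose that attribute. The sketch below is only an illustration of that failure mode and of a defensive lookup. The classes DenseMLP, Expert, SparseMoeBlock and the helper collect_gate_up are hypothetical stand-ins invented for this example; they are not the real transformers, vLLM, or unsloth_zoo code, and the real MoE block's internal layout may differ.

```python
# Hypothetical stand-ins for illustration only; NOT the real library classes.
class DenseMLP:
    """Dense feed-forward block exposing one fused gate/up projection."""

    def __init__(self):
        self.gate_up_proj = "fused gate+up weights"


class Expert:
    """One expert of an MoE block; it owns its own projection."""

    def __init__(self, idx):
        self.gate_up_proj = f"expert {idx} gate+up weights"


class SparseMoeBlock:
    """MoE block: a router plus experts, with no top-level gate_up_proj."""

    def __init__(self, num_experts=2):
        self.gate = "router weights"
        self.experts = [Expert(i) for i in range(num_experts)]


def collect_gate_up(mlp):
    """Return every gate/up projection reachable from an MLP-like block."""
    if hasattr(mlp, "gate_up_proj"):  # dense layout
        return [mlp.gate_up_proj]
    if hasattr(mlp, "experts"):  # assumed MoE layout (hypothetical)
        return [expert.gate_up_proj for expert in mlp.experts]
    raise TypeError(f"unsupported MLP layout: {type(mlp).__name__}")


if __name__ == "__main__":
    print(collect_gate_up(DenseMLP()))
    print(collect_gate_up(SparseMoeBlock(num_experts=3)))
    try:
        # Mirrors the unconditional attribute access seen in the traceback.
        SparseMoeBlock().gate_up_proj
    except AttributeError as exc:
        print("AttributeError:", exc)
```

If this reading is right, switching the entry point from FastLanguageModel to FastModel inside ART would probably not change the outcome by itself, since both paths reach the same state-dict extraction step.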
