Qwen3-Omni Checkpoint Mapping #3096
base: main
Conversation
hengtaoguo
left a comment
I wonder if you have tried the full checkpoint conversion? Are all layers filled with weights? I'm not sure if it's ready to decode yet; it seems we are still pending two more PRs.
  if self.config.use_multimodal and encoder_images is not None:
-   image_embeddings = self.vision_encoder(input_images=encoder_images, deterministic=not enable_dropout)
+   # qwen3-omni-30b-a3b returns deep features from the vision encoder.
+   image_embeddings, _ = self.vision_encoder(input_images=encoder_images, deterministic=not enable_dropout)
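The change above switches the vision encoder from a single return value to a tuple, so every call site has to unpack both values. A minimal sketch of that update, using a stand-in encoder (all names here are hypothetical, not MaxText's actual API):

```python
# Hypothetical sketch: the encoder now returns a (embeddings, deep_feats)
# tuple instead of a single embeddings array, so call sites must unpack both.

def vision_encoder(input_images, deterministic=True):
    # Stand-in encoder: "embeddings" is a per-image sum, and deep_feats
    # stands in for the deep features qwen3-omni would return.
    embeddings = [sum(img) for img in input_images]
    deep_feats = None
    return embeddings, deep_feats

# Old call site (would now silently receive a 2-tuple):
#   image_embeddings = vision_encoder(input_images=images)
# Updated call site:
images = [[1, 2], [3, 4]]
image_embeddings, deep_feats = vision_encoder(input_images=images)
print(image_embeddings)  # [3, 7]
```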
Since you changed the encoder's output format, could you verify that all other references have been updated too?
Should we update the number of returned variables wherever vision_encoder is called, such as https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/layers/models.py#L462?
Also, should the returned deep_feats not be `_`?
@aireenmei it cannot be used right now; it requires the next PR in our list, #2729, so I'm conforming to that order.
- embeddings = encoder(input_images, deterministic=deterministic)
+ encoder_output = encoder(input_images, deterministic=deterministic)
+ deep_feats = None
@hengtaoguo instead of modifying all references, this is how it's done: return an encoder_output and deep_feats, where deep_feats will be None for the other models.
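The pattern described here can be sketched as a small wrapper that normalizes the encoder's return value, so call sites always receive a (encoder_output, deep_feats) pair and deep_feats is simply None for models without deep features. All names below are hypothetical stand-ins, not MaxText's actual API:

```python
# Hypothetical sketch: normalize encoder outputs to a uniform
# (encoder_output, deep_feats) pair regardless of the model family.

def call_encoder(encoder, input_images, deterministic=True):
    result = encoder(input_images, deterministic=deterministic)
    if isinstance(result, tuple):
        # e.g. qwen3-omni: encoder already returns (output, deep_feats).
        encoder_output, deep_feats = result
    else:
        # Other models: single output, no deep features.
        encoder_output, deep_feats = result, None
    return encoder_output, deep_feats

def plain_encoder(images, deterministic=True):
    # Stand-in for an encoder that returns only embeddings.
    return [len(img) for img in images]

out, feats = call_encoder(plain_encoder, [[1, 2, 3]])
print(out, feats)  # [3] None
```

This keeps the None-handling in one place instead of scattering tuple checks across every call site.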
Co-authored-by: Eitan Porat <eporat@lightricks.com>
Original author @eitanporat in #2728
Description
Checkpoint conversion mapping from Hugging Face.
Tests
Checklist
Before submitting this PR, please make sure (put an X in the square brackets):
gemini-review label.