Integrate mHC with DeepSeek custom model #3115
base: main
Codecov Report❌ Patch coverage is
🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
📋 Review Summary
This Pull Request integrates Manifold-Constrained Hyper Connections (mHC) with the DeepSeek custom model. The changes involve modifying decoder layers, deepseek layers, and the mhc implementation itself, along with updates to configuration and unit tests. The implementation appears sound, addressing the core objective of integrating mHC.
🔍 General Feedback
- The refactoring in `src/MaxText/layers/mhc.py` to explicitly use `self.dtype` and `self.matmul_precision` is a good improvement for clarity and consistency in precision handling.
- The updated unit tests in `tests/unit/mhc_test.py` adequately cover the new return values from the mHC module.
- The addition of an AOT compilation test for mHC integration is a positive step towards ensuring long-term stability and correct compilation.
Description
`decoders.py` when feature is enabled.

General pre-norm when mHC feature is disabled:
Pre-norm when mHC feature is enabled (ref):
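
For orientation, the two variants named above can be contrasted in a minimal JAX sketch. This is not the code from this PR or from `src/MaxText/layers/mhc.py`. Every function and parameter name here (`rms_norm`, `pre_norm_block`, `sinkhorn`, `hyper_connection_block`, `pre_w`, `post_w`, `res_logits`) is hypothetical. The sketch also assumes static mixing weights, a parameter-free RMS norm applied after the streams are mixed, and a Sinkhorn-style normalization as the manifold constraint on the residual mixing matrix.

```python
import jax
import jax.numpy as jnp


def rms_norm(x, eps=1e-6):
    # Parameter-free RMS normalization over the feature axis (assumed norm).
    return x * jax.lax.rsqrt(jnp.mean(x * x, axis=-1, keepdims=True) + eps)


def pre_norm_block(x, sublayer):
    # General pre-norm: normalize, apply the sublayer, add the residual.
    return x + sublayer(rms_norm(x))


def sinkhorn(logits, num_iters=20):
    # Alternately normalize rows and columns of a positive matrix so that it
    # approaches a doubly stochastic matrix.
    m = jnp.exp(logits)
    for _ in range(num_iters):
        m = m / jnp.sum(m, axis=1, keepdims=True)
        m = m / jnp.sum(m, axis=0, keepdims=True)
    return m


def hyper_connection_block(streams, sublayer, pre_w, post_w, res_logits):
    # streams:    (n, d) expanded residual streams.
    # pre_w:      (n,) weights mixing the streams into one sublayer input.
    # post_w:     (n,) weights writing the sublayer output back to each stream.
    # res_logits: (n, n) logits of the stream-to-stream residual mixing matrix.
    res = sinkhorn(res_logits)  # assumed constraint on the residual mixing
    layer_in = jnp.einsum("n,nd->d", pre_w, streams)
    layer_out = sublayer(rms_norm(layer_in))
    return res @ streams + post_w[:, None] * layer_out[None, :]
```

With a single stream (`n = 1`) and unit weights, the sketch reduces to the general pre-norm block, which is a convenient sanity check when toggling the feature.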
Tests
Checklist
Before submitting this PR, please make sure (put X in square brackets):
`gemini-review` label.