
[PyTorch] Make grouped weights opt-in #2678

Merged
ksivaman merged 2 commits into NVIDIA:main from ksivaman:opt_in_grouped_weights
Feb 13, 2026

Conversation

@ksivaman (Member)

Description

Follow-up to #2654 that makes grouped weights opt-in via an environment variable. The variable name is inaccurate for now, but upcoming work will convert this opt-in option to expose only a single parameter.
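
For reference, opting in looks like the sketch below (a minimal example; the variable name matches the final state of this PR after review, and the GroupedLinear constructor arguments are illustrative sizes, not a recommendation):

import os

# Opt in before constructing any GroupedLinear modules; the flag is
# read when reset_parameters() runs during module construction.
os.environ["NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS"] = "1"

import transformer_engine.pytorch as te

# num_gemms=4, in_features=128, out_features=128 (illustrative sizes)
layer = te.GroupedLinear(4, 128, 128)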

Type of change

  • Documentation change (change only to the documentation, either a fix or new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

  • Make grouped weights opt-in.

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
@ksivaman requested a review from zhongbozhu February 13, 2026 04:50
@ksivaman (Member, Author)

/te-ci pytorch

@greptile-apps (Contributor)

greptile-apps bot commented Feb 13, 2026

Greptile Summary

This PR makes the grouped weights feature opt-in by checking the environment variable NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS. When enabled, GroupedLinear will allocate weights in contiguous memory using GroupedTensor, otherwise it uses the default separate weight allocation.

Changes:

  • Modified reset_parameters() in GroupedLinear to conditionally call make_grouped_weights() based on the envvar
  • Added single_param test parameter to verify both grouped and non-grouped modes work correctly
  • Test now sets/unsets the envvar to test the opt-in behavior

Note: The PR description mentions the variable name is temporary and will be replaced with a single-parameter option in future work.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The changes are straightforward and well-tested. The implementation correctly gates the grouped weights feature behind an environment variable. The test coverage is comprehensive, with the new single_param parameter exercising both code paths. The only concern is the environment variable cleanup pattern in tests, which lacks robustness (already noted in a previous review thread).
  • No files require special attention

Important Files Changed

Filename: transformer_engine/pytorch/module/grouped_linear.py
Overview: Added an environment variable check to make grouped weights opt-in via NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS. The implementation is straightforward and correct.

Filename: tests/pytorch/test_sanity.py
Overview: Added a single_param test parameter to exercise both grouped and non-grouped weight modes. Environment variable cleanup lacks try-finally protection for robustness.
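
The cleanup concern above can be addressed with a try/finally guard. A sketch of the pattern (not the PR's actual test code; the function name is hypothetical):

import os

def run_grouped_linear_test(single_param: bool) -> None:
    # try/finally restores the environment even if an assertion fails,
    # so the flag cannot leak into later tests.
    if single_param:
        os.environ["NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS"] = "1"
    try:
        ...  # construct GroupedLinear, run forward/backward, assert
    finally:
        os.environ.pop("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS", None)

pytest's monkeypatch.setenv fixture achieves the same cleanup automatically.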

Sequence Diagram

sequenceDiagram
    participant Test as test_sanity_grouped_linear
    participant Env as Environment
    participant GL as GroupedLinear
    participant GT as GroupedTensor
    
    alt single_param=True
        Test->>Env: Set NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS=1
    end
    
    Test->>GL: Create GroupedLinear instance
    GL->>GL: __init__() registers weight parameters
    GL->>GL: reset_parameters()
    
    alt Environment variable set
        GL->>Env: Check NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS
        Env-->>GL: "1" (enabled)
        GL->>GL: make_grouped_weights()
        GL->>GT: Create contiguous GroupedTensor storage
        GT-->>GL: Return grouped tensors
        GL->>GL: Re-register weights as GroupedTensor views
    else Environment variable not set
        GL->>Env: Check NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS
        Env-->>GL: "0" (disabled)
        Note over GL: Skip make_grouped_weights(), use separate weight tensors
    end
    
    Test->>GL: Forward and backward pass
    GL-->>Test: Results
    
    alt single_param=True
        Test->>Env: Delete NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS
    end
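The "contiguous GroupedTensor storage" step in the diagram boils down to slicing views out of one flat buffer. A minimal, conceptual sketch in plain PyTorch (GroupedTensor's real implementation lives inside Transformer Engine and does more than this):

import torch

num_gemms, in_features, out_features = 4, 128, 256

# One contiguous allocation backing every per-GEMM weight.
flat = torch.empty(num_gemms * out_features * in_features)

# Each logical weight is a zero-copy view into the shared buffer.
step = out_features * in_features
weights = [
    flat[i * step : (i + 1) * step].view(out_features, in_features)
    for i in range(num_gemms)
]

# All views share the flat tensor's storage; no extra memory is used.
assert all(
    w.untyped_storage().data_ptr() == flat.untyped_storage().data_ptr()
    for w in weights
)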

Last reviewed commit: 197f2e6

@greptile-apps bot (Contributor) left a comment


2 files reviewed, 1 comment


Comment on lines 798 to 799
if bool(int(os.getenv("NVTE_GROUPED_LINEAR_SINGLE_PARAM", "0"))):
    self.make_grouped_weights(defer_init=defer_init)
@timmoon10 (Collaborator) Feb 13, 2026


We're not actually exposing a single param, but rather we're sneakily allocating the weights to be contiguous.

Suggested change:
- if bool(int(os.getenv("NVTE_GROUPED_LINEAR_SINGLE_PARAM", "0"))):
-     self.make_grouped_weights(defer_init=defer_init)
+ if bool(int(os.getenv("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS", "0"))):
+     self.make_grouped_weights(defer_init=defer_init)

Constructor kwargs are also much more Pythonic than these hidden envvars. It's not terrible for now since this is an internal implementation detail, but we should expose a proper kwarg when we actually add the option for a single param.
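
For comparison, a kwarg-based opt-in might look like the sketch below. The class and kwarg names are hypothetical; the PR deliberately defers this API to future work:

import torch
import torch.nn as nn

class GroupedLinearSketch(nn.Module):
    # Hypothetical: an explicit kwarg instead of a hidden envvar lookup.
    def __init__(self, num_gemms, in_features, out_features,
                 contiguous_weights=False):
        super().__init__()
        self.weights = nn.ParameterList(
            nn.Parameter(torch.empty(out_features, in_features))
            for _ in range(num_gemms)
        )
        if contiguous_weights:           # discoverable, documented opt-in
            self.make_grouped_weights()  # vs. os.getenv(...) inside

    def make_grouped_weights(self):
        ...  # would repack self.weights into one contiguous allocation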

@zhongbozhu previously approved these changes Feb 13, 2026
@zhongbozhu (Collaborator) left a comment


LGTM

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
@greptile-apps bot (Contributor) left a comment


2 files reviewed, 1 comment


  super().reset_parameters(defer_init=defer_init)
- self.make_grouped_weights(defer_init=defer_init)
+ # Grouped tensor weights is an opt-in feature.
+ if bool(int(os.getenv("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS", "0"))):
+     self.make_grouped_weights(defer_init=defer_init)
greptile-apps bot (Contributor)

This parsing will raise ValueError if the envvar is set to a non-numeric value like "true", "false", or an empty string. Consider more robust parsing if users might set this directly:

Suggested change:
- if bool(int(os.getenv("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS", "0"))):
+ if os.getenv("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS", "0").lower() in ("1", "true", "yes"):
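
Going one step further than the suggestion, a small helper would make the tolerant parsing reusable (a sketch; no such helper exists in this PR):

import os

_TRUTHY = frozenset({"1", "true", "yes", "on"})

def env_flag(name: str, default: str = "0") -> bool:
    # Accepts "1"/"true"/"YES"/"on" and treats anything else (including
    # an empty string) as False, instead of raising ValueError the way
    # bool(int(...)) does on non-numeric input.
    return os.getenv(name, default).strip().lower() in _TRUTHY

if env_flag("NVTE_ALLOC_CONTIGUOUS_GROUPED_LINEAR_WEIGHTS"):
    pass  # enable the contiguous grouped-weight path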

@ksivaman merged commit f844905 into NVIDIA:main Feb 13, 2026
10 of 12 checks passed
@ksivaman deleted the opt_in_grouped_weights branch February 13, 2026 06:11