
Better testing for muon #30

Open
segyges wants to merge 2 commits into microsoft:main from segyges:swap-tests-against-f64-in-cpu

Conversation

@segyges commented Mar 7, 2026

Test Newton-Schulz kernels against fp64 reference instead of cuBLAS

The previous tests only checked how close the Triton Newton-Schulz result was to cuBLAS, which broke on my machine. The new tests run both Triton and cuBLAS against a numpy fp64 reference; this reveals that the Triton kernels are at least as accurate as cuBLAS for bf16/f16, and only marginally worse for f32.
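The comparison pattern described above can be sketched in plain numpy. This is a minimal illustration, not the PR's actual test code: `err_triton`, `err_cublas`, and the `SLACK` multiplier are hypothetical names, and both "implementations" here are stand-in fp32 matmuls since the real kernels need a GPU.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))

ref = a @ b  # fp64 ground truth

def max_err(result):
    """Max absolute error against the fp64 reference, computed in fp64."""
    return np.abs(result.astype(np.float64) - ref).max()

# Stand-ins for the two implementations under test. In the real tests these
# would be the Triton kernel and cuBLAS outputs copied back to the host.
err_triton = max_err(a.astype(np.float32) @ b.astype(np.float32))
err_cublas = max_err(a.astype(np.float32) @ b.astype(np.float32))

# Hypothetical multiplier accounting for a reduction-tree gap; for bf16/f16
# the PR's rule is "match or beat", i.e. a multiplier of 1.
SLACK = 1.1
assert err_triton <= SLACK * err_cublas
```

Testing both implementations against an independent fp64 reference means a regression in either one is caught, rather than only a divergence between the two.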

Notably, the Triton output is bit-exact with torch.bmm on batched inputs, suggesting the two use the same reduction order, while unbatched torch takes a different reduction path.
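Why bit-exactness implies a shared reduction order: floating-point addition is not associative, so two matmuls that accumulate partial products in different orders will generally round differently. A minimal demonstration with three constants:

```python
# Summing the same three values in two different orders gives two
# different floating-point results.
left_to_right = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right_to_left = 0.1 + (0.2 + 0.3)   # 0.6
print(left_to_right == right_to_left)  # False
```

The same effect, scaled up to thousands of partial products per output element, is why matching bits across implementations is strong evidence of a matching reduction tree.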

Changes

  • Triton kernels (ns_line_1, ns_line_2): Add an INPUT_PRECISION parameter: "ieee" for f32 inputs, "tf32" otherwise. This avoids silent mantissa truncation when the kernels are used standalone with f32 data. It is a no-op for bf16, where these compilation paths are simply never hit.
  • Tests: Rewritten to compare Triton and cuBLAS against a numpy fp64 ground truth. For bf16/f16 the Triton kernels must match or beat cuBLAS. For f32, empirically-determined multipliers account for the reduction-tree gap. f16 coverage added.
  • End-to-end test: Tolerance tightened from 0.1 to 0.02 (empirically observed max error ~7.8e-3).
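The motivation for the "ieee" setting in the first bullet can be shown without a GPU by simulating TF32's reduced mantissa on the CPU. This is an illustrative sketch, not the kernel code: `to_tf32` is a hypothetical helper that truncates fp32 values to TF32's 10 explicit mantissa bits (fp32 has 23).

```python
import numpy as np

def to_tf32(x):
    # Simulate TF32 input handling by zeroing the low 13 mantissa bits of
    # each fp32 value, leaving 10 explicit mantissa bits.
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32)

rng = np.random.default_rng(0)
a = rng.standard_normal((128, 128)).astype(np.float32)
b = rng.standard_normal((128, 128)).astype(np.float32)

ref = a.astype(np.float64) @ b.astype(np.float64)  # fp64 ground truth

err_ieee = np.abs((a @ b) - ref).max()
err_tf32 = np.abs((to_tf32(a) @ to_tf32(b)) - ref).max()

print(err_tf32 > err_ieee)  # tf32-truncated inputs lose accuracy silently
```

For bf16/f16 inputs this distinction never arises, since those dtypes already carry fewer mantissa bits than TF32; that is why the parameter is a no-op on the bf16 paths.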

@segyges (Author) commented Mar 7, 2026

@microsoft-github-policy-service agree company="Overworld"

