Skip to content

Fix race condition in shader validation tests#4597

Open
diegodelatoba wants to merge 1 commit intogpuweb:mainfrom
diegodelatoba:fix-shader-validation-race-condition
Open

Fix race condition in shader validation tests#4597
diegodelatoba wants to merge 1 commit intogpuweb:mainfrom
diegodelatoba:fix-shader-validation-race-condition

Conversation

@diegodelatoba
Copy link

@diegodelatoba diegodelatoba commented Feb 17, 2026

Changes:

  • Modified both expectCompileResult implementations to use a single
    eventualAsyncExpectation that executes GPU validation and compilation
    checks sequentially
  • Changed let shaderModule to const shaderModule for better code quality
  • Removed unnecessary non-null assertion operators

Testing:

Verified in WebKit with 500 consecutive test runs with no failures (previously
failed around iteration 15-304).

Note: This is a test infrastructure bug fix, not related to a spec change.
No CTS project tracker issue is required.


Requirements for PR author:

  • All missing test coverage is tracked with "TODO" or .unimplemented().
    • N/A - This is a test infrastructure fix, not adding new test coverage
  • New helpers are /** documented */ and new helper files are found in helper_index.txt.
    • N/A - No new helpers added
  • Test behaves as expected in a WebGPU implementation. (If not passing, explain above.)
    • Yes - Tested extensively in WebKit with 500+ consecutive successful runs
  • Test have be tested with compatibility mode validation enabled and behave as expected. (If not passing, explain above.)
    • Yes - All affected tests pass in WebKit's test suite

Requirements for reviewer sign-off:

  • Tests are properly located.
  • Test descriptions are accurate and complete.
  • Tests provide complete coverage (including validation control cases). Missing coverage MUST be covered by TODOs.
  • Tests avoid over-parameterization (see case count report).

When landing this PR, be sure to make any necessary issue status updates.

  Combines GPU error scope validation and compilation info checks into a single
  sequential async operation to prevent non-deterministic error message ordering.

  This fixes intermittent test failures where error messages would appear in
  different orders across test runs due to parallel async execution.
@mwyrzykowski
Copy link
Contributor

@greggman is there a race in the current logic? We see errors print in different orders based on promise resolution and the spec says:

https://www.w3.org/TR/webgpu/#dom-gpudevice-poperrorscope says:

There is no guarantee of the ordering of promise resolution.

or am I missing something and is the test currently guaranteed to be deterministic?

@greggman
Copy link
Contributor

I'm not seeing the race but I'm terrible at seeing races 😅

The first check just expects that eventually the compilation validation succeeds or fails. Regardless, it will immediately and synchronously return a shader module.

The 2nd check is that you can call shaderModule.getComplationInfo and eventually get a result, and, if you expected an error at least one message is an error message.

I don't see a race about which order those are expected happen. The test seems like it allows them happening in any order

@diegodelatoba where do you see the race?

@diegodelatoba
Copy link
Author

I'm not seeing the race but I'm terrible at seeing races 😅

The first check just expects that eventually the compilation validation succeeds or fails. Regardless, it will immediately and synchronously return a shader module.

The 2nd check is that you can call shaderModule.getComplationInfo and eventually get a result, and, if you expected an error at least one message is an error message.

I don't see a race about which order those are expected happen. The test seems like it allows them happening in any order

@diegodelatoba where do you see the race?

@greggman From my understanding the race isn't about which check happens first since those can happen in any order. Perhaps "race" was not the correct wording here but the issue here is that currently there is non-deterministic error message printing in the test output.

What I found is that expectGPUError and eventualAsyncExpectation both write to the test recorder (this.rec.debug() / this.rec.validationFailed()). When they run in parallel, their output messages can interleave non-deterministically. This causes issues in WebKit's test suite which relies on deterministic error logging.

@greggman
Copy link
Contributor

I'm not seeing the race but I'm terrible at seeing races 😅
The first check just expects that eventually the compilation validation succeeds or fails. Regardless, it will immediately and synchronously return a shader module.
The 2nd check is that you can call shaderModule.getComplationInfo and eventually get a result, and, if you expected an error at least one message is an error message.
I don't see a race about which order those are expected happen. The test seems like it allows them happening in any order
@diegodelatoba where do you see the race?

@greggman From my understanding the race isn't about which check happens first since those can happen in any order. Perhaps "race" was not the correct wording here but the issue here is that currently there is non-deterministic error message printing in the test output.

What I found is that expectGPUError and eventualAsyncExpectation both write to the test recorder (this.rec.debug() / this.rec.validationFailed()). When they run in parallel, their output messages can interleave non-deterministically. This causes issues in WebKit's test suite which relies on deterministic error logging.

ok, that makes more sense to me. I wouldn't expect to see logs get interleaved, there's still only one JS thread. But, the order individual tests get logged might be different.

I wonder if it would be better to order the outputs at a lower level. As it is, it looks like the code calls this.rec.validationFailed and similar things and those can get called in any order. Maybe we can do something there to handle this in a generic way everywhere? I don't know how many other places this might come up.

throwing out ideas

  1. remove this.rec and require calling t.getRec which you'd have to call outside a callback

    // OLD
    this.eventualAsyncExpectation(async () => {
      ...
      this.rec.validationFailed(error);
    );
    // new
    const rec = this.getRec();
    this.eventualAsyncExpectation(async () => {
      ...
      rec.validationFailed(error);
    );

    This would allow getRec to return things that output in order.

    Or along the same lines, make eventualAsyncExpectation pass in a rec

    // new
    this.eventualAsyncExpectation(rec, async () => {
      ...
      rec.validationFailed(error);
    );
  2. change eventualAsyncExpectation so it just adds functions to a list of things to do later and then calls them in order and awaits each one. Currently it calls the function immediately and puts the promise in an array to wait on. But maybe it could be changed put the function on an array then call each one in order.

I'm leaning toward something more like 1. There aren't that many calls to this.rec so maybe it would be a relatively small change to add internal order ids.

Or maybe we can just make this change for this test. I'd be nice to get @kainino0x's feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments