
Fix flaky test_idle_heartbeat and CI workflow YAML syntax#699

Open
dkropachev wants to merge 2 commits into master from
fix/idle-heartbeat-and-ci-yaml

dkropachev commented Feb 13, 2026

Summary

  • Fix a flaky KeyError failure in test_idle_heartbeat, caused by shard-aware reconnection replacing connections during the sleep interval. The test now skips connections that were not in the original snapshot.
  • Fix a trailing whitespace/dash in the paths-ignore list of integration-tests.yml that caused a YAML syntax issue.

Test plan

  • Verify test_idle_heartbeat no longer fails with KeyError in CI
  • Verify integration test workflow triggers correctly on relevant file changes

The test_idle_heartbeat test could fail with a KeyError when
shard-aware reconnection replaced connections during the sleep
interval. Skip connections not present in the original snapshot.
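
The skip described above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `check_against_snapshot`, the plain-dict representation of connections, and the `check` callback are all hypothetical stand-ins for whatever the test actually uses.

```python
def check_against_snapshot(snapshot, current, check):
    """Run `check` only for connections that were present in the snapshot.

    snapshot: dict mapping a connection key to its recorded request ids
              (hypothetical shape, taken before the sleep interval).
    current:  dict of the same shape, read after the sleep interval.
    check:    callable(old_ids, new_ids) performing the real assertion.
    Returns the keys that were checked.
    """
    checked = []
    for conn_key, new_ids in current.items():
        if conn_key not in snapshot:
            # Connection was opened (or replaced) after the snapshot was
            # taken, so there is nothing to compare it against. Indexing
            # snapshot[conn_key] here is what would raise KeyError.
            continue
        check(snapshot[conn_key], new_ids)
        checked.append(conn_key)
    return checked
```

One trade-off of this approach is that a replaced connection is silently left unverified, so the test checks fewer connections whenever a reconnection happens mid-run.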

Also fix a trailing whitespace/dash in integration-tests.yml
that caused a YAML syntax issue in the paths-ignore list.

dkropachev force-pushed the fix/idle-heartbeat-and-ci-yaml branch from 6357f5e to d1ae102 on February 13, 2026 21:52

…rtion

Two issues caused this test to be flaky:

1. wait_for_all_pools only waits for the first connection per host.
   Shard-aware connections to remaining shards are opened asynchronously.
   When these complete during the test's sleep interval, they replace
   existing connections, causing a KeyError on the request_ids snapshot.
   Fix: add a helper that polls until all shard connections are
   established, called after connect().

2. execute_concurrent sent only len(hosts) queries, but with shard-aware
   routing each query hits one specific shard, leaving other shards'
   connections idle. The assertion that ALL connections are non-idle
   then fails.
   Fix: send more queries (2x num_connections) and relax assertion to
   check that at least some non-control connections became non-idle.
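
Both fixes can be sketched in a few lines. This is an illustration under stated assumptions, not the helper added in this PR: `wait_until`, `all_shards_connected`, and `any_non_control_active` are hypothetical names, and connection state is modeled with plain dicts rather than the driver's real objects.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def all_shards_connected(open_counts, expected_counts):
    """True when every host has at least its expected number of connections.

    Both arguments are hypothetical dicts mapping host -> connection count.
    """
    return all(
        open_counts.get(host, 0) >= expected
        for host, expected in expected_counts.items()
    )


def any_non_control_active(is_idle_by_conn, control_conn):
    """Relaxed check: at least one non-control connection is not idle.

    is_idle_by_conn is a hypothetical dict mapping connection key -> bool.
    """
    return any(
        not is_idle
        for conn, is_idle in is_idle_by_conn.items()
        if conn != control_conn
    )
```

In a test, the polling call would sit right after connect(), for example `assert wait_until(lambda: all_shards_connected(read_counts(), expected))`, where `read_counts` is whatever the test uses to inspect the pools. Polling with a deadline keeps the wait bounded, so a cluster that never opens all shard connections fails quickly instead of hanging.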

dkropachev force-pushed the fix/idle-heartbeat-and-ci-yaml branch from d1ae102 to be5d3f0 on February 13, 2026 23:12