[GLUTEN-10933][VL] feat: Support cached the batches in cpu cache by jinchengchenghh · Pull Request #11758 · apache/gluten

jinchengchenghh · 2026-03-13T11:47:16Z

Cache the batches in a CPU-side cache and let the join threads fetch them one by one. The build threads start fetching as soon as possible, but the probe thread has to wait until the build has finished.
For now, the buffer size is controlled by spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes; later, the size may be adjusted based on the memory remaining on the server.
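
To make the behavior described above concrete, here is a minimal sketch, not the code in this PR. It assumes the cache fills until a byte budget is reached and that the probe side is gated on a "build finished" signal. The names `prefetch_until_full` and `run_join` are hypothetical, and `bytes` objects stand in for columnar batches.

```python
import threading
from collections import deque
from typing import Deque, Iterator, List, Tuple


def prefetch_until_full(
    batches: Iterator[bytes], max_prefetch_bytes: int
) -> Tuple[Deque[bytes], int]:
    """Copy batches into a CPU-side queue until the byte budget is reached.

    The last batch may push the total past the budget, so the overshoot is
    bounded by the size of one batch.
    """
    cache: Deque[bytes] = deque()
    cached_bytes = 0
    for batch in batches:
        cache.append(batch)
        cached_bytes += len(batch)
        if cached_bytes >= max_prefetch_bytes:
            break
    return cache, cached_bytes


def run_join(
    build_side: Iterator[bytes],
    probe_side: Iterator[bytes],
    max_prefetch_bytes: int,
) -> List[str]:
    """Hypothetical driver: build prefetches at once, probe waits for build."""
    log: List[str] = []
    build_finished = threading.Event()

    def build() -> None:
        cache, cached = prefetch_until_full(build_side, max_prefetch_bytes)
        log.append(f"build: prefetched {len(cache)} batches ({cached} bytes)")
        build_finished.set()

    def probe() -> None:
        # The probe side does not fetch anything before the build is done.
        build_finished.wait()
        cache, cached = prefetch_until_full(probe_side, max_prefetch_bytes)
        log.append(f"probe: prefetched {len(cache)} batches ({cached} bytes)")

    # Start the probe thread first to show that the event enforces the order.
    threads = [threading.Thread(target=probe), threading.Thread(target=build)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


build_side = iter([b"x" * 400 for _ in range(10)])
probe_side = iter([b"y" * 300 for _ in range(2)])
log = run_join(build_side, probe_side, max_prefetch_bytes=1000)
print("\n".join(log))
```

With 400-byte build batches and a 1000-byte budget, the build side stops after three batches (1200 bytes). The probe side has only 600 bytes of input, so it is drained completely without reaching the budget.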

Test:
Tested locally on SF100, with the config adjusted to enable batch caching:

--conf spark.gluten.sql.columnar.backend.velox.cudf.batchSize=10000 \
--conf spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes=1024MB

The log prints: "Prefetched 171 batches (24057900 bytes) before blocking on GPU lock"
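
For a sense of scale, the numbers in that log line can be checked with a few lines of arithmetic. The sketch below parses the message as quoted and assumes that 1024MB means 1024 MiB; the regular expression is only an illustration and is not taken from the Gluten code base.

```python
import re

LOG_LINE = "Prefetched 171 batches (24057900 bytes) before blocking on GPU lock"

# Extract the batch count and the byte count from the message.
match = re.search(r"Prefetched (\d+) batches \((\d+) bytes\)", LOG_LINE)
assert match is not None
batches, total_bytes = int(match.group(1)), int(match.group(2))

avg_bytes = total_bytes / batches
total_mib = total_bytes / (1024 * 1024)
budget_bytes = 1024 * 1024 * 1024  # assumption: 1024MB is read as 1024 MiB

print(f"{avg_bytes:.0f} bytes per batch on average")
print(f"{total_mib:.2f} MiB cached")
print(f"{100 * total_bytes / budget_bytes:.2f}% of the configured budget")
```

This works out to roughly 140,689 bytes per batch and about 22.94 MiB in total, a little over 2% of the configured limit, so in this run the prefetch ended well below the byte cap.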

Next step:
Prefetch the probe-side batches when the build starts.

Related issue: #10933

github-actions bot added VELOX DOCS labels Mar 13, 2026

jinchengchenghh requested a review from marin-ma March 13, 2026 13:06

[GLUTEN-10933][VL] feat: Support cached the batches in cpu cache

a2e1099

jinchengchenghh force-pushed the cudf_shuffle_up branch from 7963eb4 to a2e1099 Compare March 13, 2026 14:16

marin-ma approved these changes Mar 13, 2026


jinchengchenghh merged commit 8e8a81e into apache:main Mar 13, 2026
61 of 62 checks passed

jinchengchenghh merged 1 commit into apache:main from jinchengchenghh:cudf_shuffle_up