perf(reader): Add Parquet metadata size hint option to ArrowReaderBuilder #2173

Open
mbutrovich wants to merge 2 commits into apache:main from mbutrovich:metadata_size_hint

@mbutrovich
Collaborator

Which issue does this PR close?

What changes are included in this PR?

Add with_metadata_size_hint to ArrowReaderBuilder, allowing callers to configure the number of bytes to prefetch when reading Parquet footer metadata. No default is set; callers opt in via the builder. When unset, behavior is unchanged.
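
For readers unfamiliar with why a size hint helps, here is a minimal, self-contained sketch of the idea. It does not use the iceberg-rust or parquet crates. `CountingFile`, `fake_parquet`, and `read_metadata` are hypothetical names made up for illustration, and the fetch logic is an assumption about how a prefetch hint is typically used, not the code in this PR. The only format fact it relies on is that a Parquet file ends with the serialized metadata, followed by a 4-byte little-endian metadata length and the `PAR1` magic.

```rust
/// A Parquet file ends with: [metadata][4-byte LE metadata length]["PAR1"].
const FOOTER_LEN: usize = 8;
const MAGIC: [u8; 4] = *b"PAR1";

/// Hypothetical stand-in for a remote object that counts range requests.
struct CountingFile {
    bytes: Vec<u8>,
    requests: usize,
}

impl CountingFile {
    fn new(bytes: Vec<u8>) -> Self {
        Self { bytes, requests: 0 }
    }

    /// Fetch the last `n` bytes (clamped to the file size) in one request.
    fn read_suffix(&mut self, n: usize) -> Vec<u8> {
        self.requests += 1;
        let n = n.min(self.bytes.len());
        self.bytes[self.bytes.len() - n..].to_vec()
    }
}

/// Build a fake file: `data_len` zero bytes, `metadata_len` marker bytes, footer.
fn fake_parquet(data_len: usize, metadata_len: usize) -> Vec<u8> {
    let mut file = vec![0u8; data_len];
    file.resize(data_len + metadata_len, 0xAB);
    file.extend_from_slice(&(metadata_len as u32).to_le_bytes());
    file.extend_from_slice(&MAGIC);
    file
}

/// Return the raw metadata bytes. With no hint, only the 8-byte footer is
/// fetched first, so a second request is needed for the metadata itself.
/// With a large enough hint, one request covers both.
fn read_metadata(file: &mut CountingFile, size_hint: Option<usize>) -> Result<Vec<u8>, String> {
    let prefetch = size_hint.unwrap_or(FOOTER_LEN).max(FOOTER_LEN);
    let suffix = file.read_suffix(prefetch);
    if suffix.len() < FOOTER_LEN {
        return Err("file too small to contain a footer".to_string());
    }

    let footer = &suffix[suffix.len() - FOOTER_LEN..];
    if footer[4..] != MAGIC {
        return Err("missing PAR1 magic".to_string());
    }
    let metadata_len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as usize;
    let needed = metadata_len + FOOTER_LEN;

    // Reuse the prefetched bytes when they already cover the metadata;
    // otherwise fall back to a second, exactly sized request.
    let suffix = if suffix.len() >= needed {
        suffix
    } else {
        file.read_suffix(needed)
    };
    if suffix.len() < needed {
        return Err("metadata length exceeds file size".to_string());
    }

    let start = suffix.len() - needed;
    Ok(suffix[start..start + metadata_len].to_vec())
}
```

In this model, calling `read_metadata(&mut file, None)` on a file with 100 bytes of metadata issues two requests, while `read_metadata(&mut file, Some(512))` issues one. A hint smaller than the real metadata still works but falls back to the second request, which is why an opt-in hint can only help when the caller has a reasonable estimate of the metadata size.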

Are these changes tested?

Existing tests.

@mbutrovich mbutrovich changed the title from "perf(reader): Add Parquet metadata prefetch hint" to "perf(reader): Add Parquet metadata size hint option to ArrowReaderBuilder" Feb 24, 2026
@mbutrovich mbutrovich self-assigned this Feb 24, 2026
Contributor

@kevinjqliu kevinjqliu left a comment

LGTM,

so we are exposing metadata_size_hint in ArrowReader so that it can be passed to ArrowFileReader?

@mbutrovich
Collaborator Author

Thanks for the quick review @kevinjqliu! Hopefully someone can hook this up to the front-end in the presence of information from elsewhere like Puffin files, or a config for a default value. Comet will make use of this quickly!


2 participants