Use serialized metadata size to calculate the cache entry cell #1484

Merged
zvonand merged 2 commits into antalya-26.1 from fix_parquet_metadata_cache_max_size
Mar 8, 2026
@arthurpassos (Collaborator) commented Mar 6, 2026

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Quick workaround for Altinity/clickhouse-regression#114

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • Tiered Storage (2h)

@arthurpassos added the port-antalya (PRs to be ported to all new Antalya releases) and antalya-26.1 labels Mar 6, 2026
github-actions bot commented Mar 6, 2026

Workflow [PR], commit [20b1d70]

@ianton-ru ianton-ru left a comment

LGTM


size_t ParquetFileMetaDataWeightFunction::operator()(const parquet::FileMetaData & metadata) const
{
    /// TODO fix-me: using the size on disk is not ideal, but it is the simplest and best we can do for now.

Why is this not ideal? As the comment in CacheBase says, the functor must return the "approximate size" of the data.
Or do you mean that the file size is not equal to the disk space actually used, so it would need to be rounded up to the file system block size?

Collaborator Author

I think the size matches the size on disk. I say "not ideal" because, IMO, the ideal weight would be the size of the in-memory data structures.
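
For readers following the thread, here is a minimal sketch of how a weight functor that returns a serialized size can bound a cache by total bytes. It is an illustration only, not the code from this PR: `FakeFileMetaData`, `SerializedSizeWeight`, and `WeightedLRUCache` are hypothetical names, and the cache is a simplified stand-in for ClickHouse's CacheBase, whose real interface is not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for parquet::FileMetaData: only the serialized size is modelled.
struct FakeFileMetaData
{
    std::size_t serialized_size = 0; // bytes the metadata occupies when serialized
};

// Weight functor: approximates an entry's cost by its serialized size,
// which may differ from its real in-memory footprint.
struct SerializedSizeWeight
{
    std::size_t operator()(const FakeFileMetaData & metadata) const
    {
        return metadata.serialized_size;
    }
};

// Simplified weight-bounded LRU cache (an assumption for illustration, not CacheBase).
template <typename Value, typename WeightFunction>
class WeightedLRUCache
{
public:
    explicit WeightedLRUCache(std::size_t max_weight_) : max_weight(max_weight_) {}

    void set(const std::string & key, std::shared_ptr<Value> value)
    {
        assert(value != nullptr);
        erase(key); // replace any previous entry stored under the same key
        const std::size_t weight = weight_function(*value);
        queue.push_front(Entry{key, std::move(value), weight});
        index[key] = queue.begin();
        current_weight += weight;
        // Evict least recently used entries until the total weight fits.
        // An entry heavier than max_weight ends up evicting itself.
        while (current_weight > max_weight && !queue.empty())
        {
            const std::string victim = queue.back().key; // copy: the entry is about to be destroyed
            erase(victim);
        }
    }

    std::shared_ptr<Value> get(const std::string & key)
    {
        auto it = index.find(key);
        if (it == index.end())
            return nullptr;
        queue.splice(queue.begin(), queue, it->second); // mark as most recently used
        return it->second->value;
    }

    std::size_t weight() const { return current_weight; }
    std::size_t count() const { return queue.size(); }

private:
    struct Entry
    {
        std::string key;
        std::shared_ptr<Value> value;
        std::size_t weight;
    };

    void erase(const std::string & key)
    {
        auto it = index.find(key);
        if (it == index.end())
            return;
        current_weight -= it->second->weight;
        queue.erase(it->second);
        index.erase(it);
    }

    std::size_t max_weight;
    std::size_t current_weight = 0;
    WeightFunction weight_function;
    std::list<Entry> queue; // front = most recently used
    std::unordered_map<std::string, typename std::list<Entry>::iterator> index;
};
```

With a 100-byte limit, inserting entries of 60, 30, and then 50 bytes evicts the oldest (60-byte) entry, leaving a total weight of 80. The limit is only as accurate as the weight functor: if the in-memory structures are larger than their serialized form, real memory use will exceed the configured maximum, which is the trade-off discussed above.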

@zvonand zvonand merged commit bbabcaa into antalya-26.1 Mar 8, 2026
558 of 594 checks passed
Labels

antalya-26.1 port-antalya PRs to be ported to all new Antalya releases

3 participants