Description
Summary
The audit review of PR #1453 detected a defect in Iceberg sort-key expression generation.
When iceberg_partition_timezone is set and the sort metadata uses a temporal Iceberg transform (day, month, year, hour), getSortingKeyDescriptionFromMetadata() appends the timezone as a raw token instead of a SQL string literal.
Simplified example of generated expression:
toRelativeDayNum(column, UTC)
Expected:
toRelativeDayNum(column, 'UTC')
Without quoting, the timezone token is interpreted by the SQL parser as an identifier instead of a string literal, which can break KeyDescription::parse(...).
Affected area
- src/Storages/ObjectStorage/DataLakes/Iceberg/Utils.cpp
- Function: getSortingKeyDescriptionFromMetadata()
- Discovered during audit review of PR "26.1 Antalya port - Timezone for partitioning" (#1453)
Impact
Medium (correctness/reliability in planning path):
- Queries reading Iceberg tables with transformed sort-order metadata may fail during sort-key parsing/planning when timezone override is enabled.
- Affects read/planning path for Iceberg metadata consumers using transformed sort keys.
Code evidence
full_argument = clickhouse_transform_name->transform_name + "(";
if (clickhouse_transform_name->argument)
{
full_argument += std::to_string(*clickhouse_transform_name->argument) + ", ";
}
full_argument += column_name;
if (clickhouse_transform_name->time_zone)
full_argument += ", " + *clickhouse_transform_name->time_zone;
full_argument += ")";

order_by_str.pop_back();
return KeyDescription::parse(order_by_str, column_description, local_context, true);
Reproduction sketch
- Configure iceberg_partition_timezone='UTC'.
- Use/read an Iceberg table with temporal transformed sort metadata (e.g. day(ts), month(ts), etc.).
- Execute a query that triggers Iceberg sort-key planning.
- Observe parse/planning failure due to unquoted timezone argument in generated expression.
Expected behavior
Timezone argument should be serialized as a SQL string literal (properly quoted/escaped), or generated via AST without raw string concatenation.
Suggested fix direction
- Quote/escape timezone before appending to full_argument, or
- Build the expression as AST instead of SQL string concatenation.
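The quoting option could look roughly like the standalone sketch below. It is not the actual patch: quoteSqlStringLiteral, TransformSketch and buildSortKeyArgument are hypothetical names made up for illustration, and the struct only mirrors the three fields visible in the code evidence. The codebase may well already provide a string-quoting helper, which should be preferred over a hand-written one. The sketch assumes that backslash-escaping of single quotes and backslashes is sufficient for a string literal here.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: wrap a value in single quotes and backslash-escape
// embedded backslashes and single quotes so it parses as one string literal.
std::string quoteSqlStringLiteral(const std::string & value)
{
    std::string out;
    out.reserve(value.size() + 2);
    out += '\'';
    for (char c : value)
    {
        if (c == '\\' || c == '\'')
            out += '\\';
        out += c;
    }
    out += '\'';
    return out;
}

// Hypothetical stand-in for the transform description used in the issue;
// only the fields that appear in the code evidence are modelled.
struct TransformSketch
{
    std::string transform_name;
    std::optional<std::size_t> argument;
    std::optional<std::string> time_zone;
};

// Same concatenation as in the code evidence, except that the timezone
// is emitted as a quoted literal rather than a raw token.
std::string buildSortKeyArgument(const TransformSketch & transform, const std::string & column_name)
{
    std::string full_argument = transform.transform_name + "(";
    if (transform.argument)
        full_argument += std::to_string(*transform.argument) + ", ";
    full_argument += column_name;
    if (transform.time_zone)
        full_argument += ", " + quoteSqlStringLiteral(*transform.time_zone);
    full_argument += ")";
    return full_argument;
}
```

With transform_name set to toRelativeDayNum and time_zone set to UTC, buildSortKeyArgument yields toRelativeDayNum(column, 'UTC'), matching the expected expression in the Summary. Building the expression as an AST would avoid escaping concerns entirely, at the cost of a larger change.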
Suggested regression test
Add a test for Iceberg transformed sort-order + non-empty iceberg_partition_timezone and assert sort-key description parsing/planning succeeds.