Iceberg sort-key transform uses unquoted timezone, causing parse/planning failures with iceberg_partition_timezone #1487

@CarlosFelipeOR

Summary

The audit review of PR #1453 detected a defect in Iceberg sort-key expression generation.

When iceberg_partition_timezone is set and the sort metadata uses a temporal Iceberg transform (day, month, year, hour), getSortingKeyDescriptionFromMetadata() appends the timezone to the generated expression as a raw token instead of a SQL string literal.

Simplified example of generated expression:
toRelativeDayNum(column, UTC)

Expected:
toRelativeDayNum(column, 'UTC')

Without quoting, the timezone token is interpreted by the SQL parser as an identifier instead of a string literal, which can break KeyDescription::parse(...).

Affected area

getSortingKeyDescriptionFromMetadata(), in the Iceberg sort-key expression generation path.

Impact

Medium (correctness/reliability in planning path):

  • Queries reading Iceberg tables with transformed sort-order metadata may fail during sort-key parsing/planning when iceberg_partition_timezone is set.
  • Affects read/planning path for Iceberg metadata consumers using transformed sort keys.

Code evidence

full_argument = clickhouse_transform_name->transform_name + "(";
if (clickhouse_transform_name->argument)
{
    full_argument += std::to_string(*clickhouse_transform_name->argument) +  ", ";
}
full_argument += column_name;
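if (clickhouse_transform_name->time_zone)
    // Defect: the timezone is appended unquoted, e.g. ", UTC" instead of ", 'UTC'".
    full_argument += ", " + *clickhouse_transform_name->time_zone;
full_argument += ")";
// ... (intermediate lines omitted from this excerpt)
order_by_str.pop_back();
return KeyDescription::parse(order_by_str, column_description, local_context, true);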

Reproduction sketch

  1. Configure iceberg_partition_timezone='UTC'.
  2. Use/read an Iceberg table with temporal transformed sort metadata (e.g. day(ts), month(ts), etc.).
  3. Execute a query that triggers Iceberg sort-key planning.
  4. Observe parse/planning failure due to unquoted timezone argument in generated expression.

Expected behavior

The timezone argument should be serialized as a SQL string literal (properly quoted and escaped), or the expression should be built as an AST rather than by raw string concatenation.

Suggested fix direction

  • Quote/escape timezone before appending to full_argument, or
  • Build the expression as AST instead of SQL string concatenation.
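
A minimal sketch of the first option (quote/escape before appending), written as standalone C++ so it can be read without the ClickHouse tree. The names quoteSqlStringLiteral, TransformSketch and buildSortKeyArgument are hypothetical stand-ins, not the actual ClickHouse types; a real patch would more likely reuse an existing quoting utility from the codebase. The escaping rule assumed here is backslash-escaping of single quotes and backslashes.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: renders a value as a single-quoted SQL string literal,
// backslash-escaping any backslash or single quote inside it.
std::string quoteSqlStringLiteral(std::string_view value)
{
    std::string out;
    out.reserve(value.size() + 2);
    out.push_back('\'');
    for (char c : value)
    {
        if (c == '\\' || c == '\'')
            out.push_back('\\');
        out.push_back(c);
    }
    out.push_back('\'');
    return out;
}

// Hypothetical stand-in for the transform description used in the excerpt above.
struct TransformSketch
{
    std::string transform_name;
    std::optional<long> argument;
    std::optional<std::string> time_zone;
};

// Mirrors the concatenation from the excerpt, but quotes the timezone.
std::string buildSortKeyArgument(const TransformSketch & transform, const std::string & column_name)
{
    std::string full_argument = transform.transform_name + "(";
    if (transform.argument)
        full_argument += std::to_string(*transform.argument) + ", ";
    full_argument += column_name;
    if (transform.time_zone)
        full_argument += ", " + quoteSqlStringLiteral(*transform.time_zone);
    full_argument += ")";
    return full_argument;
}
```

With transform_name "toRelativeDayNum", no numeric argument, time_zone "UTC" and column "ts", this yields toRelativeDayNum(ts, 'UTC'), matching the expected expression in the summary. The AST-based option avoids the escaping question entirely and is probably the more robust direction if the surrounding code allows it.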

Suggested regression test

Add a test that reads an Iceberg table with a transformed sort order while iceberg_partition_timezone is non-empty, and assert that sort-key description parsing/planning succeeds.
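
Alongside the end-to-end test, a string-level assertion on the generated expression could catch the regression earlier. The sketch below is a hypothetical, standalone predicate (lastArgumentIsQuoted is not an existing ClickHouse function). It assumes a single, non-nested call whose last argument contains no commas, which holds for timezone names such as UTC or Europe/Berlin.

```cpp
#include <string_view>

// Hypothetical check: returns true when the last argument of a generated call
// expression such as "f(col, 'UTC')" is a single-quoted string literal.
// Assumes one non-nested call and a last argument without commas.
bool lastArgumentIsQuoted(std::string_view expression)
{
    if (expression.empty() || expression.back() != ')')
        return false;
    expression.remove_suffix(1); // drop the closing ")"

    const auto comma = expression.rfind(',');
    if (comma == std::string_view::npos)
        return false; // only one argument, so no timezone was appended

    std::string_view arg = expression.substr(comma + 1);
    while (!arg.empty() && arg.front() == ' ')
        arg.remove_prefix(1);
    while (!arg.empty() && arg.back() == ' ')
        arg.remove_suffix(1);

    return arg.size() >= 2 && arg.front() == '\'' && arg.back() == '\'';
}
```

Applied to the two expressions from the summary, the predicate accepts toRelativeDayNum(column, 'UTC') and rejects toRelativeDayNum(column, UTC).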
