Allow to read iceberg table data from any location#1461
Allow to read iceberg table data from any location#1461zvonand merged 8 commits intoantalya-26.1from
Conversation
…ckport/antalya-26.1/90740
66845e0 to
c424634
Compare
|
AI audit note: This review comment was generated by AI ( Audit update for PR #1461 (Iceberg external location support): Confirmed defects:
Coverage summary:
|
…ckport/antalya-26.1/90740
…ckport/antalya-26.1/90740
PR #1461 CI Verification ReportSummarySupport Iceberg tables that have files outside table location or on different storage. Changed 40 files, primarily in PR's Own Integration TestsAll 18
MasterCI Failures
All MasterCI failures are unrelated to the PR. They are either known flaky tests (verified via cidb), pre-existing on the parent commit ( Regression Tests (clickhouse-regression)All regression failures were pre-existing on the parent commit (
Manual VerificationThis PR was additionally verified manually using the ConclusionPR #1461 is verified. All 18 PR-specific integration tests passed. All CI failures are pre-existing or known flaky tests, none related to the PR changes. The PR was also manually verified using the |
Audit Report: PR #1461Possible defects (High/Medium/Low)Medium
+#if USE_AVRO
+ if (auto iceberg_info = std::dynamic_pointer_cast<IcebergDataObjectInfo>(object))
+ {
+ if (auto abs_path = iceberg_info->getAbsolutePath())
+ {
+ auto [storage_to_use, key] = resolveObjectStorageForPath(
+ table_location, *abs_path, object_storage, secondary_storages, getContext());
+ if (!key.empty())
+ {
+ iceberg_info->setResolvedStorage(storage_to_use);
+ iceberg_info->relative_path_with_metadata.relative_path = key;
+ }
+ }
+ }
...
+ if (auto abs_path = iceberg_info->getAbsolutePath())
+ {
+ auto [storage_to_use, key] = resolveObjectStorageForPath(
+ table_location, *abs_path, object_storage, secondary_storages, getContext());
+ if (!key.empty() && storage_to_use != object_storage)
+ {
+ iceberg_info->setResolvedStorage(storage_to_use);
+ iceberg_info->relative_path_with_metadata.relative_path = key;
+ }
+ }Low
+std::pair<ObjectStoragePtr, std::string> getOrCreateStorageAndKey(
+ const std::string & cache_key,
+ const std::string & key_to_use,
+ const std::string & storage_type,
+ SecondaryStorages & secondary_storages,
+ const ContextPtr & context,
+ std::function<void(Poco::Util::MapConfiguration &, const std::string &)> configure_fn)
+{
+ std::lock_guard lock(secondary_storages.mutex);
+ if (auto it = secondary_storages.storages.find(cache_key); it != secondary_storages.storages.end())
+ return {it->second, key_to_use};
+ ...
+ /// Create under lock to avoid duplicate creation and wasted work
+ ObjectStoragePtr storage = ObjectStorageFactory::instance().create(cache_key, *cfg, config_prefix, context, /*skip_access_check*/ true);
+ secondary_storages.storages.emplace(cache_key, storage);
+ return {storage, key_to_use};
+} |
Supersedes #1092, #1163, #1212
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Support Iceberg tables that have files outside table location or on different storage.
CI/CD Options
Exclude tests:
Regression jobs to run: