diff --git a/content/en/altinity-kb-setup-and-maintenance/keeper-dependent-features.md b/content/en/altinity-kb-setup-and-maintenance/keeper-dependent-features.md
new file mode 100644
index 0000000000..c4a0597e11
--- /dev/null
+++ b/content/en/altinity-kb-setup-and-maintenance/keeper-dependent-features.md
@@ -0,0 +1,49 @@
+---
+title: "Keeper-Dependent Features in ClickHouse"
+linkTitle: "Keeper-Dependent Features in ClickHouse"
+weight: 100
+description: >-
+  Keeper-Dependent Features in ClickHouse
+---
+
+# Keeper-Dependent Features in ClickHouse
+
+This is a consolidated list of features that depend on ClickHouse Keeper (or a ZooKeeper-compatible API).
+
+`Keeper` below means either ClickHouse Keeper or Apache ZooKeeper, depending on deployment.
+
+| # | Feature | How configured | Default path / example | Keeper structure (brief) |
+|---|---|---|---|---|
+| 1 | Replicated tables (`ReplicatedMergeTree` family) | Use `ENGINE = ReplicatedMergeTree(...)`, or omit args and rely on server defaults `default_replica_path`, `default_replica_name`. | Default path: `/clickhouse/tables/{uuid}/{shard}`; replica: `{replica}`. | `/metadata`, `columns`, `log`, `blocks`, `async_blocks`, `deduplication_hashes`, `block_numbers`, `leader_election`, `replicas//queue, parts, flags, ...`, `mutations`, `quorum`. |
+| 2 | `S3Queue` | Table setting `keeper_path`; if omitted, ClickHouse builds the path from `s3queue_default_zookeeper_path` + DB UUID + table UUID. | Default prefix: `/clickhouse/s3queue/`. | `/metadata`, `processed`, `failed`, `processing`, `persistent_processing`, `registry`; ordered mode may also use `buckets/...` subtrees. |
+| 3 | `Kafka` Keeper-offset mode (`StorageKafka2`, experimental) | Enable setting `allow_experimental_kafka_offsets_storage_in_keeper=1` and set both `kafka_keeper_path` and `kafka_replica_name`. | No default Keeper path (must be set); docs example: `/clickhouse/{database}/{uuid}`. | `/topics//partitions`, `topic_partition_locks`, `replicas/`, temporary `dropped` for drop coordination. |
+| 4 | Distributed DDL queue (`ON CLUSTER`) | Server settings: `distributed_ddl.path`, `distributed_ddl.replicas_path`. | Defaults: `/clickhouse/task_queue/ddl/` and `/clickhouse/task_queue/replicas/`. | Queue entries as `query-XXXXXXXXXX` nodes. Each entry has status dirs: `active/`, `finished/`, optional `synced/`, `shards//...`. Replica liveness is tracked under `//active` (ephemeral). |
+| 5 | `KeeperMap` table engine | Server config must define `keeper_map_path_prefix`; table uses engine arg `root_path`. | No built-in default (disabled if prefix is absent). Common example: `/keeper_map_tables`. | `//metadata`, `metadata/tables/`, `data/`, drop/cleanup coordination nodes under `metadata/...`. |
+| 6 | Replicated databases (`ENGINE=Replicated`) | `ENGINE = Replicated(zoo_path, shard, replica)` or omit args and use DB-replicated defaults. | Default DB path: `/clickhouse/databases/{uuid}`. | `/log/query-*`, `replicas//log_ptr, digest, replica_group, ...`, `metadata/`, `counter/cnt-*`, `max_log_ptr`, `logs_to_keep`. |
+| 7 | Replicated access entities (users/roles/grants/quotas/policies) | Configure `user_directories` with `...`. | No mandatory default; common example: `/clickhouse/access`. | `/uuid/` stores entity payload. Type maps: `U` (users), `R` (roles), `S` (settings profiles), `P` (row policies), `Q` (quotas), `M` (masking policies), each mapping name -> UUID. |
+| 8 | Replicated SQL UDFs (`CREATE FUNCTION`) | Set server config `user_defined_zookeeper_path` (otherwise disk storage is used). | No default Keeper path; common example: `/clickhouse/udf`. | Root node plus one znode per function, e.g. `function_.sql` containing the CREATE FUNCTION text. |
+| 9 | Named collections in Keeper | Configure `named_collections_storage.type` as `keeper` or `zookeeper` (or their encrypted variants) and set `named_collections_storage.path`. | Default storage type is `local`; no default Keeper path when Keeper mode is selected. | Root path with one znode per collection: `/.sql` containing the CREATE NAMED COLLECTION statement. |
+| 10 | Workload scheduler definitions in Keeper (`CREATE WORKLOAD`, `CREATE RESOURCE`) | Set `workload_zookeeper_path` (if absent, disk `workload_path` is used). | No default Keeper path; docs example: `/clickhouse/workload/definitions.sql`. | Single watched znode at the configured path; its content is a serialized list of workload/resource CREATE statements. |
+| 11 | Cluster discovery (experimental) | Enable `allow_experimental_cluster_discovery=1`; configure `......`. | No default path; examples use `/clickhouse/discovery/`. | Discovery root contains `shards/` ephemeral nodes with JSON payload (`address`, `shard_id`, version). |
+| 12 | `BACKUP/RESTORE ... ON CLUSTER` coordination | Server config `backups.zookeeper_path`. | Default: `/clickhouse/backups`. | Operation roots like `backup-` / `restore-` with coordination subtrees: stage sync, replicated objects acquisition, file mapping, keeper-map coordination, etc. |
+| 13 | `AzureQueue` | Same object-storage queue Keeper model as `S3Queue`, with `keeper_path` setting. | Uses the same queue metadata path logic (`s3queue_default_zookeeper_path` prefix if no explicit path). | Same pattern as `S3Queue`: `metadata`, `processed/failed/processing`, `registry`, optional `buckets` in ordered mode. |
+| 14 | `generateSerialID()` function | Server setting `series_keeper_path`. | Default: `/clickhouse/series`. | One node per series: `/`, value is the current counter. |
+| 15 | Experimental transactions | Configure `transaction_log.zookeeper_path` (and enable related experimental transaction settings). | Default: `/clickhouse/txn`. | `/tail_ptr` and `/log/csn-*` sequential nodes storing commit sequence and transaction IDs. |
+| 16 | `Shared` database engine (Cloud) | ClickHouse Cloud managed behavior (not typically user-configured in self-managed OSS). | Internal/cloud-managed. | Shared catalog is Keeper-backed; low-level path layout is internal and not documented as a stable public contract. |
+
+## ON CLUSTER relation (important)
+
+`ON CLUSTER` itself uses the Keeper-backed distributed DDL queue (row 4).
+Some features above already replicate through Keeper, so `ON CLUSTER` can be redundant:
+
+- Replicated database DDLs (row 6).
+- Replicated access entities (row 7).
+- Replicated UDFs (row 8).
+- Keeper-backed named collections (row 9).
+
+ClickHouse has dedicated settings like `ignore_on_cluster_for_replicated_*` to control this behavior.
+
+## Notes
+
+- For many rows, path names can be redirected to auxiliary Keeper clusters using `` and `cluster_name:/path` notation where supported.
+- Node names shown above are the stable conceptual layout from the current source tree; some minor subnodes are version-specific.
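
The `cluster_name:/path` notation mentioned in the Notes can be illustrated with a minimal Python sketch. It is not ClickHouse's own parser: the helper name `split_keeper_path`, the `"default"` label for the main Keeper cluster, and the rule that a colon only counts as a separator when it precedes the first slash are assumptions made for illustration.

```python
from typing import Tuple

# Assumption: label used in this sketch for the main (non-auxiliary) Keeper cluster.
DEFAULT_KEEPER_NAME = "default"


def split_keeper_path(value: str) -> Tuple[str, str]:
    """Split 'cluster_name:/path' into (cluster_name, path).

    Hypothetical helper: a plain '/path' is attributed to the default cluster.
    """
    colon = value.find(":")
    slash = value.find("/")
    # Treat the colon as a cluster separator only if it comes before the first slash.
    if colon != -1 and (slash == -1 or colon < slash):
        name, path = value[:colon], value[colon + 1:]
        if not name:
            raise ValueError(f"empty cluster name in {value!r}")
    else:
        name, path = DEFAULT_KEEPER_NAME, value
    if not path.startswith("/"):
        raise ValueError(f"Keeper path must be absolute: {value!r}")
    return name, path


if __name__ == "__main__":
    # 'aux1' is an illustrative auxiliary cluster name, not taken from any real config.
    for raw in ("aux1:/clickhouse/tables/{uuid}/{shard}", "/clickhouse/task_queue/ddl"):
        print(raw, "->", split_keeper_path(raw))
```

A helper like this can be handy when auditing which Keeper cluster each configured path from the table actually lands on.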