forked from ClickHouse/ClickHouse
ClusterDiscovery segfault (causes cascade failures)
After the node failure tests restart ClickHouse, clickhouse1 crashes on startup with a segfault in ClusterDiscovery::getNodeNames, caused by dereferencing a null shared_ptr<ZooKeeper>. The crash was reported at ClusterDiscovery.cpp:302; the symbolized trace below points at ClusterDiscovery.cpp:295.
The crashing code comes from PR #1414, not from #1390: PR #1390 does not modify ClusterDiscovery.cpp. The bug was not caught during #1414 verification because the node failure tests could not run at that time (the object_storage_cluster setting that #1390 adds was missing).
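For illustration only, here is a minimal sketch of the kind of guard that avoids dereferencing an empty ZooKeeper pointer. `FakeZooKeeper`, `FakeZooKeeperPtr`, and `getNodeNamesGuarded` are hypothetical stand-ins, not the real `zkutil::ZooKeeper` or `ClusterDiscovery` API. The actual fix may look quite different, for example re-acquiring the session before the call.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for zkutil::ZooKeeper; not the real class.
struct FakeZooKeeper
{
    std::vector<std::string> children;

    // Returns the child node names stored for any path (simplified on purpose).
    std::vector<std::string> getChildren(const std::string & /*path*/) const
    {
        return children;
    }
};

using FakeZooKeeperPtr = std::shared_ptr<FakeZooKeeper>;

// Hypothetical guarded lookup: returns std::nullopt instead of dereferencing
// the pointer when no session is available.
std::optional<std::vector<std::string>> getNodeNamesGuarded(
    const FakeZooKeeperPtr & zk, const std::string & path)
{
    if (!zk)
        return std::nullopt; // caller can retry later instead of crashing

    return zk->getChildren(path);
}
```

With this shape, a caller that receives `std::nullopt` can skip the update and retry on the next iteration rather than taking down the server.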
Trace:
2026.03.05 14:12:40.736043 [ 21401 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2026.03.05 14:12:40.736056 [ 21401 ] {} <Fatal> BaseDaemon: (version 26.1.3.20001.altinityantalya, build id: 4AF3CE21FCDA3C93566C5D78D4F05C303ED0F81E, git hash: 72cad568a405344cbc3ad21e9de26bf5aec976e3, architecture: x86_64) (from thread 22078) Received signal 11
2026.03.05 14:12:40.736059 [ 21401 ] {} <Fatal> BaseDaemon: Signal description: Segmentation fault
2026.03.05 14:12:40.736061 [ 21401 ] {} <Fatal> BaseDaemon: Address: 0xffffffffffffffe8. Access: read. Address not mapped to object.
2026.03.05 14:12:40.736065 [ 21401 ] {} <Fatal> BaseDaemon: Stack trace: 0x000058027572ef79 0x0000580275736b16 0x0000580275733a8e 0x000058027573dc85 0x0000580275740658 0x000058026fc060f5 0x000058026fc0b97b 0x00007d11c0133ac3 0x00007d11c01c5850
2026.03.05 14:12:40.736067 [ 21401 ] {} <Fatal> BaseDaemon: ########################################
2026.03.05 14:12:40.736070 [ 21401 ] {} <Fatal> BaseDaemon: (version 26.1.3.20001.altinityantalya, build id: 4AF3CE21FCDA3C93566C5D78D4F05C303ED0F81E, git hash: 72cad568a405344cbc3ad21e9de26bf5aec976e3) (from thread 22078) (no query) Received signal Segmentation fault (11)
2026.03.05 14:12:40.736072 [ 21401 ] {} <Fatal> BaseDaemon: Address: 0xffffffffffffffe8. Access: read. Address not mapped to object.
2026.03.05 14:12:40.736073 [ 21401 ] {} <Fatal> BaseDaemon: Stack trace: 0x000058027572ef79 0x0000580275736b16 0x0000580275733a8e 0x000058027573dc85 0x0000580275740658 0x000058026fc060f5 0x000058026fc0b97b 0x00007d11c0133ac3 0x00007d11c01c5850
2026.03.05 14:12:40.775878 [ 21401 ] {} <Fatal> BaseDaemon: 3.0. inlined from ./contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:476: shared_ptr
2026.03.05 14:12:40.775900 [ 21401 ] {} <Fatal> BaseDaemon: 3. ./ci/tmp/build/./src/Interpreters/ClusterDiscovery.cpp:295: DB::ClusterDiscovery::getNodeNames(std::shared_ptr<zkutil::ZooKeeper>&, String const&, String const&, int*, bool, unsigned long) @ 0x00000000191e2f79
2026.03.05 14:12:40.810625 [ 21401 ] {} <Fatal> BaseDaemon: 4. ./ci/tmp/build/./src/Interpreters/ClusterDiscovery.cpp:432: DB::ClusterDiscovery::upsertCluster(DB::ClusterDiscovery::ClusterInfo&)::$_0::operator()() const @ 0x00000000191eab16
2026.03.05 14:12:40.844554 [ 21401 ] {} <Fatal> BaseDaemon: 5. ./ci/tmp/build/./src/Interpreters/ClusterDiscovery.cpp:465: DB::ClusterDiscovery::upsertCluster(DB::ClusterDiscovery::ClusterInfo&) @ 0x00000000191e7a8e
2026.03.05 14:12:40.883035 [ 21401 ] {} <Fatal> BaseDaemon: 6. ./ci/tmp/build/./src/Interpreters/ClusterDiscovery.cpp:740: DB::ClusterDiscovery::runMainThread(std::function<void ()>) @ 0x00000000191f1c85
2026.03.05 14:12:40.925740 [ 21401 ] {} <Fatal> BaseDaemon: 7.0. inlined from ./ci/tmp/build/./src/Interpreters/ClusterDiscovery.cpp:660: operator()
2026.03.05 14:12:40.925769 [ 21401 ] {} <Fatal> BaseDaemon: 7.1. inlined from ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:87: std::__invoke_result_impl<void, DB::ClusterDiscovery::start()::$_0&>::type std::__invoke[abi:ne210105]<DB::ClusterDiscovery::start()::$_0&>(DB::ClusterDiscovery::start()::$_0&)
2026.03.05 14:12:40.925777 [ 21401 ] {} <Fatal> BaseDaemon: 7.2. inlined from ./contrib/llvm-project/libcxx/include/tuple:1380: decltype(auto) std::__apply_tuple_impl[abi:ne210105]<DB::ClusterDiscovery::start()::$_0&, std::tuple<>&>(DB::ClusterDiscovery::start()::$_0&, std::tuple<>&, std::__tuple_indices<...>)
2026.03.05 14:12:40.925781 [ 21401 ] {} <Fatal> BaseDaemon: 7.3. inlined from ./contrib/llvm-project/libcxx/include/tuple:1384: decltype(auto) std::apply[abi:ne210105]<DB::ClusterDiscovery::start()::$_0&, std::tuple<>&>(DB::ClusterDiscovery::start()::$_0&, std::tuple<>&)
2026.03.05 14:12:40.925783 [ 21401 ] {} <Fatal> BaseDaemon: 7.4. inlined from ./src/Common/ThreadPool.h:312: operator()
2026.03.05 14:12:40.925790 [ 21401 ] {} <Fatal> BaseDaemon: 7.5. inlined from ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:87: std::__invoke_result_impl<void, ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&>::type std::__invoke[abi:ne210105]<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&>(ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&)
2026.03.05 14:12:40.925796 [ 21401 ] {} <Fatal> BaseDaemon: 7.6. inlined from ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:342: void std::__invoke_void_return_wrapper<void, true>::__call[abi:ne210105]<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&>(ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&)
2026.03.05 14:12:40.925800 [ 21401 ] {} <Fatal> BaseDaemon: 7.7. inlined from ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:348: DB::ClusterDiscovery::start()::$_0 std::__invoke_r[abi:ne210105]<void, ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ClusterDiscovery::start()::$_0>(DB::ClusterDiscovery::start()::$_0&&)::'lambda'()&>()
2026.03.05 14:12:40.925802 [ 21401 ] {} <Fatal> BaseDaemon: 7. ./contrib/llvm-project/libcxx/include/__functional/function.h:450: ? @ 0x00000000191f4658
2026.03.05 14:12:40.934183 [ 21401 ] {} <Fatal> BaseDaemon: 8.0. inlined from ./contrib/llvm-project/libcxx/include/__functional/function.h:508: ?
2026.03.05 14:12:40.934202 [ 21401 ] {} <Fatal> BaseDaemon: 8.1. inlined from ./contrib/llvm-project/libcxx/include/__functional/function.h:772: ?
2026.03.05 14:12:40.934205 [ 21401 ] {} <Fatal> BaseDaemon: 8. ./ci/tmp/build/./src/Common/ThreadPool.cpp:811: ThreadPoolImpl<std::thread>::ThreadFromThreadPool::worker() @ 0x00000000136ba0f5
2026.03.05 14:12:40.947716 [ 21401 ] {} <Fatal> BaseDaemon: 9.0. inlined from ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:0: std::__invoke_result_impl<void, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>::type std::__invoke[abi:ne210105]<void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>(void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*&&)
2026.03.05 14:12:40.947739 [ 21401 ] {} <Fatal> BaseDaemon: 9.1. inlined from ./contrib/llvm-project/libcxx/include/__thread/thread.h:159: void std::__thread_execute[abi:ne210105]<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*, 2ul>(std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>&, std::__tuple_indices<2ul>)
2026.03.05 14:12:40.947742 [ 21401 ] {} <Fatal> BaseDaemon: 9. ./contrib/llvm-project/libcxx/include/__thread/thread.h:168: void* std::__thread_proxy[abi:ne210105]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x00000000136bf97b
2026.03.05 14:12:40.947771 [ 21401 ] {} <Fatal> BaseDaemon: 10. ? @ 0x0000000000094ac3
2026.03.05 14:12:40.947777 [ 21401 ] {} <Fatal> BaseDaemon: 11. ? @ 0x0000000000126850
2026.03.05 14:12:40.947781 [ 21401 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read.
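The faulting address in the trace, 0xffffffffffffffe8, is easier to read as a signed offset. The small sketch below (the helper name `asSignedOffset` is hypothetical) reinterprets it as a 64-bit signed value:

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret a faulting address as a signed 64-bit offset.
constexpr std::int64_t asSignedOffset(std::uint64_t address)
{
    return static_cast<std::int64_t>(address);
}

// 0xffffffffffffffe8 is 24 bytes below zero in two's complement.
static_assert(asSignedOffset(0xffffffffffffffe8ULL) == -24);
```

An address that wraps around to just below zero is consistent with pointer arithmetic on a null pointer, which fits the null shared_ptr diagnosis above, although it does not prove it on its own.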