How are Log Shipping and Delayed Log Shipping implemented in ClickHouse?
Log shipping is the mechanism ClickHouse uses to replicate data between replicas to ensure high availability, fault tolerance, and disaster recovery. Delayed log shipping extends this concept by introducing a deliberate lag on one replica, providing a window for recovery from operational mishaps such as accidental data deletions or corruption. Here's how both are implemented in ClickHouse:
Log Shipping in ClickHouse
Log shipping in ClickHouse is primarily achieved through the use of ReplicatedMergeTree table engines, which automatically handle data replication between replicas within a cluster. This process involves several key components and steps:
ZooKeeper: ClickHouse uses Apache ZooKeeper (or its built-in ClickHouse Keeper) to coordinate replication, maintain a consistent state across replicas, and store metadata about data parts and their replication status.
Table Engine: To enable log shipping, tables must be created with the ReplicatedMergeTree engine. This engine requires specifying a ZooKeeper path and a replica name as part of the table's creation statement.
Data Parts: Data in ReplicatedMergeTree tables is divided into parts, each a set of rows stored together. When data is inserted into a table, the new part is written locally and an entry describing it is appended to a replication log (a queue of changes) in ZooKeeper.
Replication: The other replicas continuously watch the replication log for new entries. When a new entry is detected, each replica fetches the corresponding data part directly from a replica that already has it, so all replicas eventually hold the same data.
CREATE TABLE replicated_table
(
    date Date,
    event_name String,
    event_count Int32
)
ENGINE = ReplicatedMergeTree('/path/to/zookeeper/tables/replicated_table', '{replica}')
PARTITION BY toYYYYMM(date)
ORDER BY (date, event_name);
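Once the replicated table exists, the shipped log entries and each replica's lag can be observed through ClickHouse's system tables. The queries below are a small illustration against the example table above; system.replicas and system.replication_queue are standard system tables.

-- How far this replica lags behind the shared replication log.
SELECT database, table, replica_name, queue_size, absolute_delay
FROM system.replicas
WHERE table = 'replicated_table';

-- Log entries (part fetches, merges, mutations) this replica still has to apply.
SELECT node_name, type, create_time, source_replica
FROM system.replication_queue
WHERE table = 'replicated_table';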
Delayed Log Shipping in ClickHouse
Delayed log shipping is not natively labeled as such in the ClickHouse documentation, but it can be implemented by controlling when a replica applies the shipped log entries. Here's how you can achieve delayed log shipping:
Replica Delay Configuration: ClickHouse does not currently document a setting that makes a replica apply fetched log entries only after a fixed interval, so the delay cannot be switched on with a single configuration parameter; it has to be introduced operationally.
Manual Delay: The practical approach is to manage the replication process on the replica that should lag: temporarily suspend its fetching of data parts from other replicas, then resume it after the desired window, giving a controlled delay (see the first sketch after this list).
Use of Buffer Tables: By using ClickHouse's Buffer engine, you can temporarily store incoming data before it is flushed to the ReplicatedMergeTree table. This can mimic delayed replication by providing a buffer period during which newly written data has not yet reached the replicated table and therefore has not been shipped to other replicas (see the second sketch after this list).
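For the manual-delay approach, the usual levers are the SYSTEM statements that pause and resume a replica's fetching of new parts; the sketch below assumes it is run on the replica that should lag, against the example table from earlier, and the delay window is simply the time between the two statements.

-- On the replica that should lag: stop fetching parts shipped by other replicas.
SYSTEM STOP FETCHES replicated_table;

-- ...the replica now accumulates unapplied log entries for the desired window...

-- Resume fetching; the replica replays the accumulated log entries and catches up.
SYSTEM START FETCHES replicated_table;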
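The Buffer-table variant can be sketched as follows, assuming the table lives in the default database and using illustrative flush thresholds (Buffer's arguments are num_layers, min/max time in seconds, min/max rows, and min/max bytes). Rows sit in the buffer until a flush condition is met and only then reach the replicated table, at which point normal log shipping takes over.

-- Writes go to the buffer; rows are flushed to default.replicated_table when all
-- min_* thresholds are met or any max_* threshold is reached.
CREATE TABLE replicated_table_buffer AS replicated_table
ENGINE = Buffer(default, replicated_table, 16,
                60, 600,              -- min_time, max_time (seconds)
                10000, 1000000,       -- min_rows, max_rows
                10000000, 100000000); -- min_bytes, max_bytes

-- Applications insert into the buffer instead of the replicated table.
INSERT INTO replicated_table_buffer VALUES ('2024-01-01', 'page_view', 1);

Keep in mind that rows still sitting in the buffer are held in memory and can be lost on an abnormal server restart, so this variant trades some durability for the delay.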
Considerations for Delayed Log Shipping
Data Recovery Window: The delay provides a window to recover from accidental data deletions or corruptions before the erroneous operation reaches the delayed replica.
Operational Overhead: Implementing and managing delayed log shipping can introduce additional complexity in managing replication settings and monitoring the replication delay.
Consistency Impact: The delayed replica is, by design, temporarily inconsistent with the others. Applications requiring strict consistency should account for this, and queries can be kept away from the lagging replica as shown in the example below.
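When one replica is deliberately kept behind, reads that go through a Distributed table can be steered away from it with ClickHouse's standard staleness settings; the table name below is illustrative, while the two settings are regular query-level settings.

-- Only replicas lagging by 120 seconds or less may serve this query; with the
-- fallback disabled, the query fails if every replica exceeds the threshold.
SELECT count()
FROM distributed_events
SETTINGS
    max_replica_delay_for_distributed_queries = 120,
    fallback_to_stale_replicas_for_distributed_queries = 0;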
While ClickHouse's replication features natively support log shipping through the ReplicatedMergeTree family of table engines, implementing delayed log shipping requires careful configuration and operational management to ensure it meets your data recovery and availability needs.
Written by Shiv Iyer
Over two decades of experience as a Database Architect and Database Engineer with core expertise in Database Systems Architecture/Internals, Performance Engineering, Scalability, Distributed Database Systems, SQL Tuning, Index Optimization, Cloud Database Infrastructure Optimization, Disk I/O Optimization, Data Migration and Database Security. Founder and CEO of MinervaDB Inc. and ChistaDATA Inc.