How to Tune Linux Thread Operations for Optimal ClickHouse Performance

Shiv Iyer

Tuning Linux thread operations is critical for optimizing ClickHouse performance. Here are detailed steps and best practices to achieve this:

1. CPU and Scheduler Settings

CPU Affinity

  • Bind Threads to Specific CPUs: Use taskset or cset to bind ClickHouse threads to specific CPUs. Pinning reduces CPU migrations and the cache misses that follow them; it does not remove context switches between threads that share a CPU.

      taskset -c 0-15 /path/to/clickhouse-server
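
To confirm that the binding took effect, the kernel reports the CPUs a process may run on in the Cpus_allowed_list field of /proc/<pid>/status (taskset -cp <pid> shows the same list). A minimal sketch follows; the helper name allowed_cpus is illustrative, and the demo inspects the current shell, not ClickHouse.

```shell
# Print the CPU list a process may run on, as reported by the kernel.
# $1 is a status file such as /proc/<pid>/status (helper name is illustrative).
allowed_cpus() {
    awk -F':[ \t]*' '$1 == "Cpus_allowed_list" { print $2 }' "$1"
}

# Demo: inspect the current shell; replace $$ with the ClickHouse server PID.
if [ -r "/proc/$$/status" ]; then
    echo "CPUs allowed for PID $$: $(allowed_cpus "/proc/$$/status")"
fi
```

Each thread has its own status file under /proc/<pid>/task/<tid>/status, so the same helper can be looped over threads when only some of them are pinned.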
    

Scheduler Tuning

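  • Real-Time Scheduling: Assign a real-time scheduling policy to ClickHouse so that its threads preempt normal (SCHED_OTHER) tasks. Keep the priority well below 99: kernel threads such as the migration and watchdog threads run at the top priority, and a CPU-bound real-time process can starve everything else on its CPUs.

      chrt -r -a -p 10 $(pgrep -n -f clickhouse-server)

    • Scheduler Policy: In chrt, -r selects Round-Robin (SCHED_RR) and -f selects First-In-First-Out (SCHED_FIFO); -a applies the change to every thread of the process, not only the main thread. The process is looked up with pgrep -f because plain pgrep compares against the process name, which the kernel truncates to 15 characters; -n picks the most recently started match.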
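
To check which policy a thread actually ended up with, chrt -p <pid> prints it, and on kernels built with scheduler debug information the same value appears as a number in /proc/<pid>/sched. The sketch below decodes that number; the helper names policy_name and policy_of are illustrative, and the mapping follows the Linux SCHED_* constants.

```shell
# Translate the numeric "policy" field of /proc/<pid>/sched into a name.
# Numbers follow the Linux SCHED_* constants; helper names are illustrative.
policy_name() {
    case "$1" in
        0) echo SCHED_OTHER ;;
        1) echo SCHED_FIFO ;;
        2) echo SCHED_RR ;;
        3) echo SCHED_BATCH ;;
        5) echo SCHED_IDLE ;;
        6) echo SCHED_DEADLINE ;;
        *) echo UNKNOWN ;;
    esac
}

# $1 is a sched file such as /proc/<pid>/task/<tid>/sched.
policy_of() {
    policy_name "$(awk '$1 == "policy" { print $3 }' "$1")"
}

# Demo on the current shell; the file is absent on kernels without scheduler debug info.
if [ -r "/proc/$$/sched" ]; then
    echo "PID $$ runs under $(policy_of "/proc/$$/sched")"
fi
```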

Isolating CPUs

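  • Isolate CPUs for ClickHouse: Use the isolcpus kernel parameter to keep the general scheduler off specific CPUs, then pin ClickHouse to them explicitly (for example with taskset); nothing runs on isolated CPUs unless it is bound there. Add the parameters to /etc/default/grub, regenerate the GRUB configuration, and reboot. Note that the scheduler does not load-balance across isolated CPUs, so threads have to be placed on individual CPUs.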

      GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3"
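
Boot parameters only apply after a reboot, so it helps to verify what the running kernel was started with. The sketch below reads a parameter from /proc/cmdline and prints the kernel's own list of isolated CPUs; the helper name cmdline_param is illustrative.

```shell
# Extract the value of a key=value parameter from a kernel command line.
# $1 = parameter name, $2 = file holding the command line (default /proc/cmdline).
cmdline_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" |
        awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2) }'
}

# Demo: show what the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    echo "isolcpus on this boot: $(cmdline_param isolcpus)"
fi

# The kernel also lists isolated CPUs here (an empty line when none are isolated).
if [ -r /sys/devices/system/cpu/isolated ]; then
    echo "isolated CPUs: $(cat /sys/devices/system/cpu/isolated)"
fi
```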
    

2. Memory and Huge Pages

Memory Allocation

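  • Allocate Sufficient Memory: Ensure that ClickHouse has enough memory to avoid swapping. Monitor memory usage and adjust max_memory_usage, the per-query limit, which is set in bytes in a settings profile (users.xml); the server-wide cap is max_server_memory_usage in config.xml.

      <max_memory_usage>34359738368</max_memory_usage>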
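
Memory limits like this one are usually written as plain byte counts, which are easy to get wrong by hand. The sketch below converts GiB to bytes and derives a limit as a share of installed RAM; both helper names are illustrative, and the 50% figure in the comment is only an example, not a recommendation from this article.

```shell
# Convert GiB to bytes using shell integer arithmetic.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print a percentage of MemTotal in bytes (MemTotal is reported in kB).
# $1 = percent, $2 = meminfo file (default /proc/meminfo).
mem_percent_bytes() {
    awk -v pct="$1" '$1 == "MemTotal:" { printf "%.0f\n", $2 * 1024 * pct / 100 }' \
        "${2:-/proc/meminfo}"
}

gib_to_bytes 32          # prints 34359738368
# mem_percent_bytes 50   # half of this machine's RAM, in bytes
```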
    

Transparent Huge Pages (THP)

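  • Disable Transparent Huge Pages: THP can cause latency spikes when the kernel compacts memory to assemble huge pages. Disable it for consistent performance. Run these commands as root; the setting lasts only until the next reboot.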

      echo never > /sys/kernel/mm/transparent_hugepage/enabled
      echo never > /sys/kernel/mm/transparent_hugepage/defrag
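
These sysfs files list every allowed value and mark the active one with square brackets, for example "always madvise [never]". A small sketch for reading back the active value after the change; the helper name active_choice is illustrative.

```shell
# Print the active value from a sysfs file that brackets its current choice,
# e.g. "always madvise [never]" -> "never".
active_choice() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Demo: report the current THP mode if this kernel exposes it.
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
    if [ -r "$f" ]; then
        echo "$f: $(active_choice "$f")"
    fi
done
```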
    

3. Network Optimization

TCP Settings

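  • Tune TCP Parameters: Adjust TCP settings for a server that handles many client and inter-server connections: a shorter FIN timeout, faster dead-peer detection through keepalives, and larger accept and device backlogs. Values set with sysctl -w are lost at reboot.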

      sysctl -w net.ipv4.tcp_fin_timeout=15
      sysctl -w net.ipv4.tcp_keepalive_time=300
      sysctl -w net.ipv4.tcp_keepalive_intvl=30
      sysctl -w net.ipv4.tcp_keepalive_probes=5
      sysctl -w net.core.netdev_max_backlog=250000
      sysctl -w net.core.somaxconn=65535
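
To keep these values across reboots, they can be placed in a drop-in file under /etc/sysctl.d and loaded with sysctl --system. The sketch below writes the same six settings to a file; the function name and the file name 90-clickhouse-net.conf are arbitrary choices, and the demo writes to a scratch directory so it can run without root.

```shell
# Write the TCP settings from this section to a sysctl drop-in file.
# $1 = destination path (function and file names are illustrative).
write_tcp_sysctl() {
    cat > "$1" <<'EOF'
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 5
net.core.netdev_max_backlog = 250000
net.core.somaxconn = 65535
EOF
}

# Demo: write to a scratch directory. On a real host (as root) the target would be
# something like /etc/sysctl.d/90-clickhouse-net.conf, followed by: sysctl --system
demo_dir=$(mktemp -d)
write_tcp_sysctl "$demo_dir/90-clickhouse-net.conf"
cat "$demo_dir/90-clickhouse-net.conf"
```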
    

4. Disk I/O Optimization

I/O Scheduler

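  • Set I/O Scheduler: Use the none or mq-deadline I/O scheduler (called noop and deadline on kernels before 5.0) to reduce I/O latency. Check which schedulers the device offers before choosing one.

      cat /sys/block/sda/queue/scheduler
      echo none > /sys/block/sda/queue/scheduler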
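
The name of the pass-through scheduler depends on the kernel: multi-queue kernels call it none, while older single-queue kernels call it noop. The sketch below picks whichever name a device offers, so a script does not fail on one kernel generation or the other; the helper name passthrough_scheduler is illustrative.

```shell
# Choose the pass-through scheduler name this kernel offers for a device.
# $1 = scheduler file such as /sys/block/sda/queue/scheduler.
# Prints "none" (multi-queue kernels), "noop" (legacy kernels), or nothing.
passthrough_scheduler() {
    sed 's/[][]//g' "$1" | tr ' ' '\n' | grep -x -e none -e noop | head -n 1
}

# Demo: list what each block device offers and which name would be used.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    echo "$f: available=$(cat "$f") passthrough=$(passthrough_scheduler "$f")"
done
```

Writing the chosen name back to the scheduler file requires root and, like the other sysfs changes here, does not persist across reboots.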
    

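Direct I/O

  • Enable Direct I/O for Large Reads: The min_bytes_to_use_direct_io setting makes ClickHouse read with O_DIRECT, bypassing the page cache, when a query has to read at least this many bytes (10 MiB below). It is a query-level setting, so it belongs in a settings profile; 0 disables it.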

      <min_bytes_to_use_direct_io>10485760</min_bytes_to_use_direct_io>
    

5. File System and Storage

File System Choice

  • Use XFS or ext4: Both work well with ClickHouse. XFS is often chosen for large volumes because of its scalability; ext4 is the more conservative choice. Note that mkfs destroys any data already on the device.

      mkfs.xfs /dev/sdX
    

Mount Options

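  • Optimize Mount Options: Mount the data volume with noatime so that reads do not trigger access-time writes; noatime already implies nodiratime. Avoid nobarrier: it risks data loss on power failure unless the storage has a battery-backed write cache, and XFS rejects the option since kernel 4.19.

      mount -o noatime /dev/sdX /data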
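
After mounting, the options that are actually in effect can be read from /proc/mounts. The sketch below looks up a mount point and tests for a single option; both helper names are illustrative.

```shell
# Print the option string for a mount point from a /proc/mounts-style file.
# $1 = mount point, $2 = mounts file (default /proc/mounts).
mount_options() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' \
        "${2:-/proc/mounts}"
}

# Succeed when the mount point carries the given option.
# $1 = mount point, $2 = option, $3 = mounts file (optional).
has_mount_option() {
    mount_options "$1" "$3" | tr ',' '\n' | grep -q -x -F "$2"
}

# Demo: show the options on the root file system.
if [ -r /proc/mounts ]; then
    echo "Options on /: $(mount_options /)"
fi
```

For example, has_mount_option /data noatime && echo ok prints ok once the data volume is mounted as shown above.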
    

6. Kernel Parameters

Kernel Tuning

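  • Adjust Kernel Parameters: Lengthen scheduler time slices and limit swapping and dirty-page buildup. The two kernel.sched_* keys exist as sysctls only on kernels before 5.13; later kernels moved them to debugfs, and the EEVDF scheduler in 6.6 removed them.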

      sysctl -w kernel.sched_min_granularity_ns=10000000
      sysctl -w kernel.sched_wakeup_granularity_ns=15000000
      sysctl -w vm.swappiness=10
      sysctl -w vm.dirty_ratio=15
      sysctl -w vm.dirty_background_ratio=5
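
Not every kernel exposes all of these keys, so it can be useful to check before applying them. Each sysctl key maps to a file under /proc/sys with the dots replaced by slashes. The sketch below reports which keys exist on the running kernel; the helper name sysctl_file is illustrative.

```shell
# Map a sysctl key to its file under a /proc/sys-style root.
# $1 = key, $2 = root directory (default /proc/sys).
sysctl_file() {
    echo "${2:-/proc/sys}/$(echo "$1" | tr '.' '/')"
}

# Report which of the keys from this section the running kernel exposes.
for key in kernel.sched_min_granularity_ns kernel.sched_wakeup_granularity_ns \
           vm.swappiness vm.dirty_ratio vm.dirty_background_ratio; do
    f=$(sysctl_file "$key")
    if [ -e "$f" ]; then
        echo "$key = $(cat "$f")"
    else
        echo "$key is not available on this kernel; skipping"
    fi
done
```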
    

7. ClickHouse-Specific Settings

Thread Configuration

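  • Configure Thread Pool: Adjust the ClickHouse thread settings to match the hardware. max_threads caps the threads a single query may use and defaults to the number of physical CPU cores; max_insert_threads caps the threads used by INSERT SELECT. Both are query-level settings and go in a settings profile.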

      <max_threads>64</max_threads>
      <max_insert_threads>16</max_insert_threads>
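
A reasonable starting point for max_threads is the number of physical cores, which differs from the logical CPU count on machines with hyper-threading. The sketch below counts distinct core and socket pairs from lscpu output; the helper name count_physical_cores is illustrative, and using the result as max_threads is a starting assumption to validate against the workload.

```shell
# Count distinct (core, socket) pairs from `lscpu -p=CORE,SOCKET` output on stdin.
# Lines starting with '#' are header comments.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Demo: report the physical core count when lscpu is installed.
if command -v lscpu >/dev/null 2>&1; then
    echo "Physical cores: $(lscpu -p=CORE,SOCKET | count_physical_cores)"
fi
```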
    

Conclusion

By tuning these Linux and ClickHouse-specific settings, you can significantly improve the thread performance and overall efficiency of your ClickHouse deployment. Regularly monitor performance metrics and adjust configurations as needed to ensure optimal performance under varying workloads.

Written by

Shiv Iyer

Over two decades of experience as a Database Architect and Database Engineer with core expertise in Database Systems Architecture/Internals, Performance Engineering, Scalability, Distributed Database Systems, SQL Tuning, Index Optimization, Cloud Database Infrastructure Optimization, Disk I/O Optimization, Data Migration and Database Security. I am the founder and CEO of MinervaDB Inc. and ChistaDATA Inc.