Tigris S3 Storage + qryn
Meet Tigris
Tigris is a new distributed, S3-compatible object storage service operated by Fly.io. It offers global bucket replication, low pricing, and a generous free tier:
5GB of data storage per month
10,000 PUT, COPY, POST, LIST requests per month
100,000 GET, SELECT and all other requests per month
Example
Let's say you have a bucket with 100GB of data and you make 100,000 PUT requests and 1,000,000 GET requests to the objects in the bucket. You would be charged as follows:
Data Storage: 5GB x $0 + 95GB x $0.02/GB/month = $1.90
PUT Requests: 10,000 x $0 + 90,000 x $0.005/1000 requests = $0.45
GET Requests: 100,000 x $0 + 900,000 x $0.0005/1000 requests = $0.45
Data Transfer: $0
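The breakdown above can be reproduced with a small calculator. This is only a sketch: the rates and free-tier allowances are copied from this example and may not match current Tigris pricing, the 100,000 PUT requests are inferred from the PUT line of the breakdown, and the helper names `billable` and `monthly_cost` are made up for illustration.

```python
# Rates and free-tier allowances copied from the example above; treat them as
# illustrative, since actual Tigris pricing may change.
FREE_STORAGE_GB = 5
FREE_PUT_REQUESTS = 10_000
FREE_GET_REQUESTS = 100_000
STORAGE_USD_PER_GB_MONTH = 0.02
PUT_USD_PER_1000 = 0.005
GET_USD_PER_1000 = 0.0005


def billable(used, free):
    """Return the usage that exceeds the free allowance (never negative)."""
    return max(used - free, 0)


def monthly_cost(storage_gb, put_requests, get_requests):
    """Return a per-item cost breakdown in USD, rounded to cents."""
    storage = billable(storage_gb, FREE_STORAGE_GB) * STORAGE_USD_PER_GB_MONTH
    puts = billable(put_requests, FREE_PUT_REQUESTS) / 1000 * PUT_USD_PER_1000
    gets = billable(get_requests, FREE_GET_REQUESTS) / 1000 * GET_USD_PER_1000
    return {
        "storage": round(storage, 2),
        "put": round(puts, 2),
        "get": round(gets, 2),
        "total": round(storage + puts + gets, 2),
    }


# Same inputs as the example: 100GB stored, 100,000 PUTs, 1,000,000 GETs.
print(monthly_cost(storage_gb=100, put_requests=100_000, get_requests=1_000_000))
# {'storage': 1.9, 'put': 0.45, 'get': 0.45, 'total': 2.8}
```

Usage that stays within the free tier comes out as zero for every item, since `billable` never goes negative.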
There’s more! Storage costs are calculated using GB/month, determined by averaging the daily peak storage over a monthly period. For example:
Storing 1 GB constantly for a whole month = 1 GB/month
Storing 10 GB for 12 days + 20 GB for 18 days = 16 GB/month
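The GB/month figure is just the mean of the daily peaks, which a few lines can confirm. A minimal sketch, assuming a 30-day month and one peak sample per day; the function name `gb_month` is hypothetical.

```python
def gb_month(daily_peaks_gb):
    """Average the daily peak storage (in GB) over the billing period."""
    if not daily_peaks_gb:
        raise ValueError("need at least one daily sample")
    return sum(daily_peaks_gb) / len(daily_peaks_gb)


# 10 GB for 12 days, then 20 GB for 18 days, in a 30-day month.
peaks = [10] * 12 + [20] * 18
print(gb_month(peaks))  # 16.0
```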
🚀 Sounds interesting? Get ready! This example shows how to use Tigris buckets as a cold storage disk with the ClickHouse S3 Table engine and qryn. Let’s do this.
Setup Instructions
Get Tigris
- Sign in to your Fly.io/Tigris account and create a new bucket, e.g.:
https://yourbucket.fly.storage.tigris.dev
- Generate a token pair with write permissions to the bucket, e.g.:
Access Key ID = XXXXXXXXXXXXXXXXXXXXXXXX
Secret Access Key = YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
ClickHouse
Before we proceed, let’s validate our bucket and practice some simple queries.
Configure an S3 table in ClickHouse using Parquet format
Configure the S3 Engine with your Tigris bucket and tokens
Configure max_threads, max_insert_threads based on your CPU cores
CREATE TABLE s3_tigris (name String, value UInt32)
ENGINE=S3('https://yourbucket.fly.storage.tigris.dev/somefolder/sometable.parquet', 'XXXXXXXXXXXXXXXXXXXXXXXX', 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY', 'Parquet')
SETTINGS max_threads=8, max_insert_threads=8, input_format_parallel_parsing=0, input_format_with_names_use_header=0;
INSERT & SELECT data using the Tigris storage table
INSERT INTO s3_tigris VALUES ('one', 1), ('two', 2), ('three', 3);
SELECT * FROM s3_tigris LIMIT 2;
Alright! If everything works as expected, we’re ready to steam right ahead.
Tigris Storage for qryn
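Manual queries are fun. Next, let's configure Tigris as a ClickHouse storage disk for our qryn instance to store our Logs, Metrics, Traces and Profiling data.
Here’s a deliberately simple configuration. The external policy uses S3 as the only storage for our data, while the tiered policy keeps hot data on a local disk and moves it to Tigris as space runs low. The tiered policy references a disk named ssd, which is assumed to be defined elsewhere in your server configuration.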
Configure an S3 disk with data_cache_enabled
Configure a storage policy to automatically manage our cold storage
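Configure data_cache_max_size (in bytes; 8589934592 is 8 GiB) based on your storage configuration
Configure move_factor, the fraction of free space on a volume below which ClickHouse starts moving data to the next volume (0.05 means moves begin when less than 5% is free)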
<yandex>
<storage_configuration>
<disks>
<tigris>
<type>s3</type>
<endpoint>https://yourbucket.fly.storage.tigris.dev/fakekey</endpoint>
<access_key_id>XXXXXXXXXXXXXXXXXXXXXXXX</access_key_id>
<secret_access_key>YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY</secret_access_key>
<data_cache_enabled>1</data_cache_enabled>
<data_cache_max_size>8589934592</data_cache_max_size>
</tigris>
</disks>
<policies>
<external>
<volumes>
<s3>
<disk>tigris</disk>
</s3>
</volumes>
</external>
<tiered>
<move_factor>0.05</move_factor>
<volumes>
<hot>
<disk>ssd</disk>
</hot>
<s3>
<disk>tigris</disk>
<prefer_not_to_merge>true</prefer_not_to_merge>
</s3>
</volumes>
</tiered>
</policies>
</storage_configuration>
</yandex>
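Before restarting ClickHouse, it can help to check that every disk named in a policy is actually defined. The sketch below does that statically with Python's standard XML parser. It is not a ClickHouse tool: the helper `undefined_disks` is hypothetical, the embedded config is a trimmed copy of the one above, and the check only sees one file, so disks defined in other config files will be reported as missing. The built-in `default` disk is treated as always available.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the storage configuration above (credentials and cache
# settings omitted, since only disk names matter for this check).
CONFIG = """
<yandex>
  <storage_configuration>
    <disks>
      <tigris><type>s3</type></tigris>
    </disks>
    <policies>
      <external>
        <volumes><s3><disk>tigris</disk></s3></volumes>
      </external>
      <tiered>
        <move_factor>0.05</move_factor>
        <volumes>
          <hot><disk>ssd</disk></hot>
          <s3><disk>tigris</disk></s3>
        </volumes>
      </tiered>
    </policies>
  </storage_configuration>
</yandex>
"""


def undefined_disks(config_xml, builtin=("default",)):
    """Map each policy name to the disks it references but this file does not define."""
    root = ET.fromstring(config_xml)
    storage = root.find("storage_configuration")
    # Disk names are the tag names directly under <disks>.
    defined = {disk.tag for disk in storage.find("disks")} | set(builtin)
    missing = {}
    for policy in storage.find("policies"):
        for volume in policy.find("volumes"):
            for disk in volume.findall("disk"):
                name = (disk.text or "").strip()
                if name not in defined:
                    missing.setdefault(policy.tag, []).append(name)
    return missing


print(undefined_disks(CONFIG))  # {'tiered': ['ssd']}
```

Here the tiered policy is flagged because `ssd` is not declared in this file. If that disk lives in another configuration file on your server, the report can be ignored.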
Note: Performance may vary based on network conditions and available resources.
🗨️ If you have feedback or use Tigris Buckets with ClickHouse and qryn, please consider sharing your test results with our community!
Reference Links
Interested in this subject? Check out the following links for further information:
https://clickhouse.com/docs/en/engines/table-engines/integrations/s3/
https://altinity.com/blog/tips-for-high-performance-clickhouse-clusters-with-s3-object-storage
Written by Alex Maitland