docs-v2
Add best practice for custom partition in tag bucket
This request applies to both Cloud Dedicated and Clustered, but to avoid duplication I describe it here for Cloud Dedicated only.
I'd like to add the following best practice for tag bucket definition:
Content
Best practice for number of tag buckets: While InfluxDB supports up to 1,000 tag buckets, the higher the number of tag buckets you configure, the worse your query performance may become, depending on how you define custom partition templates for a database or table.
For example: if a table's custom partition templates are defined as follows:
influxctl table create \
--template-timeformat '%Y-%m-%d' \
--template-tag-bucket customerID,1000 \
DATABASE_NAME \
example-table
The values of the customerID tag are bucketed into 1,000 distinct "buckets" each day, which generates up to 1,000 partitions per day for this table. One month of data therefore generates about 30,000 partitions (1,000 buckets/day * 30 days/month). If the retention policy of this database is one month or longer, such a high number of partitions will significantly slow down query performance. We recommend keeping the total number of partitions around 1,000 or below for this table by:
- Reducing the number of tag buckets.
- Setting the time part to a longer duration, e.g. '%Y-%m'.
- Shortening the retention policy (i.e. keeping the data for a shorter period).
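To make the arithmetic concrete, here is a rough sketch of how the three options affect the partition count. The `estimate_partitions` helper is hypothetical (it is not part of influxctl), and it assumes the worst case where every tag bucket receives data in every retained time interval:

```shell
# Hypothetical helper, not an influxctl command.
# Worst-case estimate: total partitions = tag buckets * retained time intervals.
estimate_partitions() {
  buckets=$1     # second value passed to --template-tag-bucket
  intervals=$2   # number of time-format intervals kept by the retention policy
  echo $(( buckets * intervals ))
}

estimate_partitions 1000 30   # 1,000 buckets, daily time part, 30 days kept  → 30000
estimate_partitions 1000 1    # 1,000 buckets, monthly time part, 1 month kept → 1000
estimate_partitions 30 30     # 30 buckets, daily time part, 30 days kept      → 900
```

The interval count is approximate: with a monthly time part, a one-month retention window can span two calendar months, so the real number may be somewhat higher than the estimate.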
Relevant URLs
Add the information to this page under "Tag bucket part templates": https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/#tag-bucket-part-templates
Or "Partitioning best practices" page: https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/best-practices/
Whatever you think is the best!