compactor: does not compact 4 consecutive 2-hour blocks
Thanos, Prometheus and Golang version used:
Thanos: 0.32.4 (go1.20.8)
Prometheus: 2.45.0 (go1.20.5)
Object Storage Provider:
Openstack S3 compatible
What happened:
I have a Thanos compactor with the following metrics:
thanos_compact_halted 0
thanos_compact_todo_compactions 0
It is tracking a bucket where almost all blocks have been compacted up to level-4. However, there are some level-1 blocks that are not compacted, and I was expecting them to be compacted into a level-2 block. I have made this animated gif to show it more clearly:
None of those blocks has a no-compaction mark, so they should be eligible for compaction.
These are the meta.json files for each one of them:
01HT1G02DF2W21A1KTHDVPX0BR
{
"ulid": "01HT1G02DF2W21A1KTHDVPX0BR",
"minTime": 1711584000246,
"maxTime": 1711591200000,
"stats": {
"numSamples": 2492646,
"numSeries": 5196,
"numChunks": 20775
},
"compaction": {
"level": 1,
"sources": [
"01HT1G02DF2W21A1KTHDVPX0BR"
]
},
"version": 1,
"thanos": {
"labels": {
"cluster_name": "alpha",
"cluster_node": "prometheus004",
"datasource": "alpha-002"
},
"downsample": {
"resolution": 0
},
"source": "sidecar",
"segment_files": [
"000001"
],
"files": [
{
"rel_path": "chunks/000001",
"size_bytes": 3964613
},
{
"rel_path": "index",
"size_bytes": 646029
},
{
"rel_path": "meta.json"
}
],
"index_stats": {}
}
}
01HT1PVSMCNYF8ZSDW53123NJX
{
"ulid": "01HT1PVSMCNYF8ZSDW53123NJX",
"minTime": 1711591200246,
"maxTime": 1711598400000,
"stats": {
"numSamples": 2492640,
"numSeries": 5193,
"numChunks": 20772
},
"compaction": {
"level": 1,
"sources": [
"01HT1PVSMCNYF8ZSDW53123NJX"
]
},
"version": 1,
"thanos": {
"labels": {
"cluster_name": "alpha",
"cluster_node": "prometheus004",
"datasource": "alpha-002"
},
"downsample": {
"resolution": 0
},
"source": "sidecar",
"segment_files": [
"000001"
],
"files": [
{
"rel_path": "chunks/000001",
"size_bytes": 3957077
},
{
"rel_path": "index",
"size_bytes": 644900
},
{
"rel_path": "meta.json"
}
],
"index_stats": {}
}
}
01HT1XQGXB5CHQB21YT5DNXFC8
{
"ulid": "01HT1XQGXB5CHQB21YT5DNXFC8",
"minTime": 1711598400246,
"maxTime": 1711605600000,
"stats": {
"numSamples": 2492640,
"numSeries": 5193,
"numChunks": 20772
},
"compaction": {
"level": 1,
"sources": [
"01HT1XQGXB5CHQB21YT5DNXFC8"
]
},
"version": 1,
"thanos": {
"labels": {
"cluster_name": "alpha",
"cluster_node": "prometheus004",
"datasource": "alpha-002"
},
"downsample": {
"resolution": 0
},
"source": "sidecar",
"segment_files": [
"000001"
],
"files": [
{
"rel_path": "chunks/000001",
"size_bytes": 3969637
},
{
"rel_path": "index",
"size_bytes": 645540
},
{
"rel_path": "meta.json"
}
],
"index_stats": {}
}
}
01HT24K86QTXJ1HV2NW252DAEV
{
"ulid": "01HT24K86QTXJ1HV2NW252DAEV",
"minTime": 1711605600246,
"maxTime": 1711612800000,
"stats": {
"numSamples": 2492646,
"numSeries": 5196,
"numChunks": 20775
},
"compaction": {
"level": 1,
"sources": [
"01HT24K86QTXJ1HV2NW252DAEV"
]
},
"version": 1,
"thanos": {
"labels": {
"cluster_name": "alpha",
"cluster_node": "prometheus004",
"datasource": "alpha-002"
},
"downsample": {
"resolution": 0
},
"source": "sidecar",
"segment_files": [
"000001"
],
"files": [
{
"rel_path": "chunks/000001",
"size_bytes": 3981026
},
{
"rel_path": "index",
"size_bytes": 645293
},
{
"rel_path": "meta.json"
}
],
"index_stats": {}
}
}
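For reference, decoding the minTime/maxTime values above shows these are four consecutive 2-hour windows on 2024-03-28 UTC. A minimal Go sketch to check this (not part of Thanos; the values are copied from the four meta.json files above):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// minTime/maxTime pairs (milliseconds since epoch) copied from the four
	// level-1 meta.json files above.
	ranges := [][2]int64{
		{1711584000246, 1711591200000},
		{1711591200246, 1711598400000},
		{1711598400246, 1711605600000},
		{1711605600246, 1711612800000},
	}
	for _, r := range ranges {
		minT := time.UnixMilli(r[0]).UTC()
		maxT := time.UnixMilli(r[1]).UTC()
		// Each line prints a ~2h window, and the windows are back to back.
		fmt.Printf("%s -> %s (~%s)\n",
			minT.Format(time.RFC3339), maxT.Format(time.RFC3339),
			maxT.Sub(minT).Round(time.Minute))
	}
}
```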
This is the command line that I'm using:
/bin/thanos compact \
--bucket-web-label=cluster_node \
--data-dir /var/thanos/compact \
--objstore.config-file=/etc/thanos/objstore.yml \
--wait \
--selector.relabel-config-file=/etc/thanos/relabel_config.yml \
--downsampling.disable \
--retention.resolution-5m=1d \
--retention.resolution-1h=1d \
--log.format=json \
--log.level=debug
Contents of /etc/thanos/objstore.yml
type: S3
config:
  bucket: "thanos-alpha"
  endpoint: "redacted"
  access_key: "redacted"
  insecure: false
  signature_version2: false
  secret_key: "redacted"
  list_objects_version: "v1"
  http_config:
    idle_conn_timeout: 60s
Contents of /etc/thanos/relabel_config.yml
- action: keep
  regex: "alpha-002"
  source_labels:
    - datasource
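For context, this keep rule only limits which block streams this compactor instance looks at: blocks whose external label datasource matches alpha-002 (as all four level-1 blocks above do) pass the filter. A rough sketch of the keep semantics, assuming standard Prometheus relabel behavior with fully anchored regexes (keepBlock is a made-up helper for illustration, not Thanos code):

```go
package main

import (
	"fmt"
	"regexp"
)

// keepBlock mimics the single `keep` rule above: a block is considered only
// when its external label `datasource` matches the fully anchored regex.
// Illustration only; Thanos uses Prometheus's relabel package for this.
func keepBlock(externalLabels map[string]string) bool {
	re := regexp.MustCompile(`^(?:alpha-002)$`)
	return re.MatchString(externalLabels["datasource"])
}

func main() {
	fmt.Println(keepBlock(map[string]string{"datasource": "alpha-002"})) // true: kept for compaction
	fmt.Println(keepBlock(map[string]string{"datasource": "alpha-001"})) // false: ignored by this instance
}
```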
What could be the reason for this behavior?
Is thanos_compact_iterations_total more than 0? 🤔
Yes, it is constantly growing.
This is how thanos_compact_todo_compactions compares with thanos_compact_iterations_total:
@vincent-olivert-riera can you show us some information about the level 4 blocks you mentioned? What's their duration?
Sure. This is its meta.json:
{
"ulid": "01HT1RN6JP9AZWYGHTG8XRXHSS",
"minTime": 1710374400246,
"maxTime": 1711584000000,
"stats": {
"numSamples": 418763800,
"numSeries": 5231,
"numChunks": 3489753
},
"compaction": {
"level": 4,
"sources": [
"01HRXEDZ6XVH2MYV7K0W6CQ27Z",
"01HRXN9PEVG5DWN2DA3H1EQZYW",
"01HRXW5DPSMG8GMH0CNQZ08S79",
"01HRY314YTX75WHERFSB1XFMQ2",
"01HRY9WW6SCNHJBT3BWB46V9Y1",
"01HRYGRKETBY322F7ABNHFGMNP",
"01HRYQMAPTCKA67H1XT72KMA4E",
"01HRYYG1YSWVX80H09XBE68MW4",
"01HRZ5BS6TX5RANK4KVFE1A25K",
"01HRZC7GET4BX2PCHXV0AXF0MX",
"01HRZK37PT5KR89H78K70Z3FW2",
"01HRZSYYYSEPRXQ3J1G99EMRN5",
"01HS00TP6TF8HZ11SSBJ5DJTG2",
"01HS07PDEX23P49VE9KN0Z6B6P",
"01HS0EJ4PSQQR9HYQNKV7HP2XN",
"01HS0NDVZX0XGC0K85FKS87HEE",
"01HS0W9K6T60BN2G1HFWNZDYZC",
"01HS135AEVCERKJHJAE7YE82PG",
"01HS1A11PS6BF1W0WM0ZD1X1Y4",
"01HS1GWRYTJE5BQQDVYQ99DFMB",
"01HS1QRG6TZAZFAGBQ5DYTFV0H",
"01HS1YM7ESF2Q93M625Q9SEJYX",
"01HS25FYPTETJ90722E08R9PFS",
"01HS2CBNYT6CM6BJ9AK2M8PGN6",
"01HS2K7D6T8AW1R2VMRCPQ2272",
"01HS2T34ETMF9BH086GDZEEQ2X",
"01HS30YVPTN97BAVJ5BPP3854Y",
"01HS37TJYTHEAMQ4H9JT792RJD",
"01HS3EPA6TERX4QJTX09QD0PJJ",
"01HS3NJ1ETC27EQGAB9E36SF2N",
"01HS3WDRPT2BNNW2R2E3D6BT8X",
"01HS439FYTNNAYWCN7FM561T3Z",
"01HS4A576T2G0HS1HFRR55Z83H",
"01HS4H0YETBSTJCR1VFA2KT4HM",
"01HS4QWNPSQNT3SVHWB65TWHK0",
"01HS4YRCYVQFRZZ88FQQKRX6SV",
"01HS55M46TW584YFK9NYT2Y8J0",
"01HS5CFVEVSRBDV7MK2SY3QJE7",
"01HS5KBJPTDC3DHZ3G5DP4Y1XZ",
"01HS5T79YTS4438E2ZX4FS4T5F",
"01HS61316SK2JXN87693FRJ3D9",
"01HS67YRET9XW7TJNJ5A2QSM41",
"01HS6ETFPT53QCM7VYJZTH8QB8",
"01HS6NP6YT5FMTPNY8D5N7C9BF",
"01HS6WHY6T8PCT1GNAC3TVKEY1",
"01HS73DNET028TXMPQYVVF179Q",
"01HS7A9CPT1HTA26Q4YC1FAGHV",
"01HS7H53YT05D4QETBPC042C7C",
"01HS7R0V6TKWRR46E82709XQGQ",
"01HS7YWJETVH0E1YV7KWVV6BH8",
"01HS85R9PT52VPMPFP3B9D30YQ",
"01HS8CM0YSBNC8E3X5Z1S1QAKY",
"01HS8KFR6TT0C995731BTSSZ5C",
"01HS8TBFET873G0CX47NYV5P07",
"01HS9176PTW3XFMSYGWQKZZC6E",
"01HS982XYT1JV16HZWKEX5N696",
"01HS9EYN6TSV8J00BRNE4CD74H",
"01HS9NTCEV1WCDGHNS5PJSK0NP",
"01HS9WP3PS3B9NFFP98JRYTHJ4",
"01HSA3HTYTQCCX7DH8EPDBN4Q0",
"01HSAADJ6VB2YFJKY4RWY18ZA2",
"01HSAH99EVKJQZMBSH7PF497SG",
"01HSAR50PT0ZH3ZNE8N1VJWTXQ",
"01HSAZ0QYXE5KPBH5NS0WFYHEF",
"01HSB5WF6SPAB3TJP64V7NSME1",
"01HSBCR6EVHNNNCJBN27H8RWF2",
"01HSBKKXPTYZ5D4SH8P4KW74C9",
"01HSBTFMYW431XKWR750PXYYAQ",
"01HSC1BC7088CV86NBKTXXQ494",
"01HSC873ET7YV5PK4EV61GKGD9",
"01HSCF2TPSNKYMSTCF07FBTYHQ",
"01HSCNYHYT6SVCYBF58KTZJQ9J",
"01HSCWT96TDBGKXVZ1VR44X9DV",
"01HSD3P0ETXPZ80M8EEZ61RE8H",
"01HSDAHQPYG3XCFY91N41FR4A7",
"01HSDHDEYT44RNSS14WYNVB9VS",
"01HSDR966V0NK7E5CN8ED8RQJK",
"01HSDZ4XEVH5C45F9FZK47TN59",
"01HSE60MPT0N3CER5QERB3QBH1",
"01HSECWBYS84V009FYSB3N6B39",
"01HSEKR36TJCYV52XBSWFRDFW6",
"01HSETKTETGNGNQBZYS4MSA7EP",
"01HSF1FHPTVY7PGBHS0MHR0V4Z",
"01HSF8B8YT8PMPF2YZ7WYX6DXA",
"01HSFF706TDE9TJ45HVEJE1C5E",
"01HSFP2QETJHV0QEZ70QBVE2Z4",
"01HSFWYEPTVXBBVW872WYRQ18S",
"01HSG3T5YVSTQ8SMZBEDACHG01",
"01HSGANX6TNA9HNM3ZH3ZGRGRS",
"01HSGHHMET3487PYA2BRJP80YC",
"01HSGRDBPTQWZ1ZS64GGH6SZY5",
"01HSGZ92YT3SNZSRC0M6GH56JN",
"01HSH64T6TG6N29P8E8WACF9C3",
"01HSHD0HET2HC2HP9TWRRHFEYH",
"01HSHKW8PSF5SPN131PA2CHCYN",
"01HSHTQZYTDZJ016DDXQZ9ZXQ6",
"01HSJ1KQ6WMAABPCAD4QCZ30BP",
"01HSJ8FEESK80Z3N9Z5D19841W",
"01HSJFB5PT40EA8WMKFBWCWZ3X",
"01HSJP6WYTZE8GE46P726YJVXK",
"01HSJX2M6VXD0SF4YYKA920WY8",
"01HSK3YBESNDWHZM80MBY4E4S0",
"01HSKAT2PYPJXVRZG1NBGX88B0",
"01HSKHNSYVZA9ZB9MZAKS7G5YP",
"01HSKRHH6THAAP44ZDB80NGEFE",
"01HSKZD8ET7W92EFFRP7BDMQR0",
"01HSM68ZPXA3P18Y0DPZQJXH8N",
"01HSMD4PYSWWKE2V6DPWFQ5VWA",
"01HSMM0E6TA2EM8J40F7FR478S",
"01HSMTW5ESJ3E2X9K3F8CDQJR5",
"01HSN1QWPSWRBCKV88HVAH6CXW",
"01HSN8KKYTYBPX1ZQC9BZ4Y4HJ",
"01HSNFFB6THMDJ7X4FGYBZK8BD",
"01HSNPB2EY6CXHMPWJH46T3S43",
"01HSNX6SPV19FPNV99XT4N3BGE",
"01HSP42GYS3GZPDEEJXEVTE5H5",
"01HSPAY86SE3D504YN5357EEK2",
"01HSPHSZEVQ92QFGH66YRM0W9D",
"01HSPRNPPXG4PJJGBQEZJ0TK2E",
"01HSPZHDYVDTWSRMDQPJEHR7VA",
"01HSQ6D56YBN25SBSWJ12H7XCW",
"01HSQD8WET5WDSJE28PHV491NW",
"01HSQM4KPTTEXXA5P0JZJ8MHKS",
"01HSQV0AYTDG339RRNFRR7H7VV",
"01HSR1W26TJE71X3TGF056TP2S",
"01HSR8QSET41K89H6GW418HC6X",
"01HSRFKGPTY39Y12QYYX9RDN86",
"01HSRPF7YSE9VPF3RQTPHAW7TZ",
"01HSRXAZ6VYK26MJPCA1CSYSJS",
"01HSS46PES01ZR3HJS0NQ4XSH8",
"01HSSB2DPSPMRS3RY72KK7CEM3",
"01HSSHY4YVT7JREWXH5NNTN14P",
"01HSSRSW6V89AZKRRF317ZV2RS",
"01HSSZNKET6RPQ9NH02128GSPH",
"01HST6HAPV68TBB7GRPY9WEXGS",
"01HSTDD1YSNXRE53KBARRATVNF",
"01HSTM8S6V698S7JJ3EGK49AFH",
"01HSTV4GET9ZQW2866AX8FEQ8F",
"01HSV207PW7390V9E9J9BBZJYC",
"01HSV8VYYTTAZAAYD5M5V93NQX",
"01HSVFQP6V4HCN4WF95QWGVAN7",
"01HSVPKDETASJTW0BAAJB5VB9M",
"01HSVXF4PWVT6Q68BN4B0KXA4A",
"01HSW4AVYTN03K408NBNQ9B7QZ",
"01HSWB6K6TFQJPSTEDR4Z94KNF",
"01HSWJ2AET9Q65CZ0ZGPEC69YW",
"01HSWRY1PVR3FH3GBJN4ANA8G6",
"01HSWZSRYVX0HNXA527GK123SH",
"01HSX6NG6WB00R3GRJKE5QSRA4",
"01HSXDH7EVC955BNRS0KY1R130",
"01HSXMCYPSNVZ4SMW2MQPSY2Z8",
"01HSXV8NYTKHZ93211PK4WCK5H",
"01HSY24D6TRMPDSHG32KNWKVH8",
"01HSY904ETZ47RVJ3JK0KAFB37",
"01HSYFVVPTN80CZDQ38HT26RQJ",
"01HSYPQJYTP7E8E6SGWBHE0SPP",
"01HSYXKA6TPPMRJW7D1W8WYYNZ",
"01HSZ4F1ET71NACD56BAM6RNAP",
"01HSZBARPT7FFE4X7CA2KKTYS9",
"01HSZJ6FYT8JTW9KSMNG6YEGZ9",
"01HSZS276TA2KNDMBDXA25RCG6",
"01HSZZXYEWEGBJTTBWHHQGYYBA",
"01HT06SNPTV1CKM0NHRS32PAQR",
"01HT0DNCYS66962KGAWWSGZS9V",
"01HT0MH46TP81CRNFCCCHRF19J",
"01HT0VCVETBHMAJM9SVQSX0EM6",
"01HT128JPTHQ3Q2YVF0RTB5ER5",
"01HT1949YT230P1R17F1HSCYER"
],
"parents": [
{
"ulid": "01HS2VVDW0R0EVGNTND8E2BCTM",
"minTime": 1710374400246,
"maxTime": 1710547200000
},
{
"ulid": "01HS80MPGZPTTEY3JBQ3RKV5F3",
"minTime": 1710547200246,
"maxTime": 1710720000000
},
{
"ulid": "01HSD5E4H5NJH60BH30PKA3186",
"minTime": 1710720000246,
"maxTime": 1710892800000
},
{
"ulid": "01HSJA7FHRDBYF0XFDEYXVMY0Z",
"minTime": 1710892800246,
"maxTime": 1711065600000
},
{
"ulid": "01HSQF11TV1XEPV3ZN7WWX66HA",
"minTime": 1711065600246,
"maxTime": 1711238400000
},
{
"ulid": "01HSWKTK77ZK2EM1H2EWGWCYNS",
"minTime": 1711238400246,
"maxTime": 1711411200000
},
{
"ulid": "01HT1RKXJ3KFABZF5C1V8F7JJZ",
"minTime": 1711411200246,
"maxTime": 1711584000000
}
]
},
"version": 1,
"thanos": {
"labels": {
"cluster_name": "alpha",
"cluster_node": "prometheus003-prom-jp2v-dev",
"datasource": "alpha-002"
},
"downsample": {
"resolution": 0
},
"source": "compactor",
"segment_files": [
"000001",
"000002"
],
"files": [
{
"rel_path": "chunks/000001",
"size_bytes": 536870124
},
{
"rel_path": "chunks/000002",
"size_bytes": 125027769
},
{
"rel_path": "index",
"size_bytes": 22741614
},
{
"rel_path": "meta.json"
}
],
"index_stats": {
"series_max_size": 4800,
"chunk_max_size": 1013
}
}
}
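Decoding the minTime/maxTime from this meta.json, the block spans roughly 14 days (336h) and ends at the 2-hour boundary where the first uncompacted level-1 block starts. A small Go sketch with the values above (the 14-day figure matching the largest default compaction range is my reading, so treat it as an assumption):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// minTime/maxTime (milliseconds since epoch) from the level-4 meta.json above.
	minT := time.UnixMilli(1710374400246).UTC()
	maxT := time.UnixMilli(1711584000000).UTC()
	fmt.Println(minT.Format(time.RFC3339), "->", maxT.Format(time.RFC3339))
	fmt.Println("span:", maxT.Sub(minT).Round(time.Hour)) // ~336h, i.e. 14 days
}
```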
@vincent-olivert-riera if you grep your Compactor's log with block IDs of the blocks that didn't get compacted, do you see anything that stands out? If possible, maybe increase the Compactor's log level to generate more logs (then revert it, otherwise logs might be too spammy). 🤔
@douglascamata , I haven't increased the Compactor's log level yet, but this is what the Compactor is doing (in a loop):
Apr 19, 2024 @ 20:29:37.120{"caller":"compact.go:1478","level":"info","msg":"compaction iterations done","ts":"2024-04-19T11:29:29.342094884Z"}
Apr 19, 2024 @ 20:29:37.120{"caller":"compact.go:457","level":"info","msg":"downsampling was explicitly disabled","ts":"2024-04-19T11:29:29.342370667Z"}
Apr 19, 2024 @ 20:27:42.421{"cached":346,"caller":"fetcher.go:487","component":"block.BaseFetcher","duration":"8.78195813s","duration_ms":8781,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":346,"ts":"2024-04-19T11:26:37.988636543Z"}
Apr 19, 2024 @ 20:27:42.421{"caller":"fetcher.go:317","component":"block.BaseFetcher","concurrency":32,"level":"debug","msg":"fetching meta data","ts":"2024-04-19T11:27:29.206720687Z"}
Apr 19, 2024 @ 20:27:42.421{"cached":346,"caller":"fetcher.go:487","component":"block.BaseFetcher","duration":"7.076132358s","duration_ms":7076,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":346,"ts":"2024-04-19T11:27:36.282764217Z"}
Apr 19, 2024 @ 20:26:36.158{"caller":"fetcher.go:317","component":"block.BaseFetcher","concurrency":32,"level":"debug","msg":"fetching meta data","ts":"2024-04-19T11:25:29.206734945Z"}
Apr 19, 2024 @ 20:26:36.158{"cached":346,"caller":"fetcher.go:487","component":"block.BaseFetcher","duration":"7.2963137s","duration_ms":7296,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":346,"ts":"2024-04-19T11:25:36.502939791Z"}
Apr 19, 2024 @ 20:26:36.158{"caller":"fetcher.go:317","component":"block.BaseFetcher","concurrency":32,"level":"debug","msg":"fetching meta data","ts":"2024-04-19T11:26:29.20683454Z"}
Apr 19, 2024 @ 20:25:37.421{"caller":"compact.go:1414","level":"info","msg":"start sync of metas","ts":"2024-04-19T11:24:22.419242154Z"}
Apr 19, 2024 @ 20:25:37.421{"caller":"fetcher.go:317","component":"block.BaseFetcher","concurrency":32,"level":"debug","msg":"fetching meta data","ts":"2024-04-19T11:24:22.419842845Z"}
Apr 19, 2024 @ 20:25:37.421{"cached":346,"caller":"fetcher.go:487","component":"block.BaseFetcher","duration":"5.786435988s","duration_ms":5786,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":174,"ts":"2024-04-19T11:24:28.206118667Z"}
Apr 19, 2024 @ 20:25:37.421{"caller":"compact.go:1419","level":"info","msg":"start of GC","ts":"2024-04-19T11:24:28.20786563Z"}
Apr 19, 2024 @ 20:25:37.421{"caller":"compact.go:1442","level":"info","msg":"start of compactions","ts":"2024-04-19T11:24:28.208735693Z"}
I have searched for all the block IDs, but Kibana does not return anything at all.
~~I will try to increase the log level and see what happens.~~ The log level is already debug.