pools needlessly go to disabled when out of space
Hey.
I had hoped that 18601a941d9471893936001ebbebb4874a6dece1 would help with #5352, but apparently it doesn't.
I just had the situation again where, due to some unfortunate I/O patterns, many of my btrfs pools ran out of metadata space, causing:
```
2025-06-27T18:17:35.798468+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:17:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:17:35.798790+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:17:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:18:35.809535+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:18:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:18:35.810208+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:18:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:19:35.818813+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:19:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:19:35.819249+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:19:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:20:35.881910+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:20:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:20:35.882193+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:20:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:21:35.890279+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:21:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:21:35.890417+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:21:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:22:35.899857+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:22:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:22:35.900731+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:22:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
2025-06-27T18:23:35.909305+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:23:35+02:00 (lcg-lrz-dc35_8) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Pool disabled: I/O test failed
2025-06-27T18:23:35.909624+02:00 lcg-lrz-dc35 dcache@pool_lcg-lrz-dc35_8[4903]: 2025-06-27 18:23:35+02:00 (lcg-lrz-dc35_8) [] Pool: lcg-lrz-dc35_8, fault occurred in repository: I/O test failed. Pool disabled:
```
However, the pools should rather just go into read-only mode...
Unfortunately, I forgot to check what btrfs actually returns in that situation (i.e. whether it's ENOSPC or something else); I'll check when it shows up again.
But in any case... maybe that I/O test could be improved. What exactly does it do? I'd blindly guess that it tries to write and, if that fails, considers the pool dead?
Wouldn't it be better if, in that case, it also tried to read something it knows is there (e.g. the first few bytes of some file, or the .repo-ok file) and, if that succeeds, only put the pool into read-only mode?
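Roughly, what I have in mind would be something like this (a minimal Java sketch with hypothetical names, not dCache's actual code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

enum ProbeResult { HEALTHY, READ_ONLY, DISABLED }

final class IoProbe {

    // A failed write alone should only mean "read-only"; the pool
    // would be disabled only if reading fails as well.
    static ProbeResult probe(Path poolDir) {
        try {
            // What I guess the test does today: create a file, delete it.
            Path tmp = Files.createTempFile(poolDir, ".io-test", null);
            Files.delete(tmp);
            return ProbeResult.HEALTHY;
        } catch (IOException writeFailure) {
            try {
                // Proposed fallback: read a few bytes of something known
                // to exist (the file name here is hypothetical).
                Path known = poolDir.resolve(".repo-ok");
                try (InputStream in = Files.newInputStream(known)) {
                    in.read();
                }
                return ProbeResult.READ_ONLY;  // writes fail, reads still work
            } catch (IOException readFailure) {
                return ProbeResult.DISABLED;   // neither works
            }
        }
    }
}
```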
Thanks, Chris.
Hi @calestyo ,
The I/O test simply creates an empty file and then deletes it. You can add a custom check:
https://github.com/dCache/dcache/blob/master/docs/TheBook/src/main/markdown/cookbook-pool.md#pool-health-check
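For example, in the pool configuration one could set (script name and path purely illustrative):

```
pool.check-health-command=/usr/local/bin/pool-health.sh
```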
> The I/O test simply creates an empty file and then deletes it. You can add a custom check:
Well, if metadata is full, even an empty file cannot be created. That's why I said: in that case dCache could check whether reading still works, and only set the pool to read-only, so that data could at least still be served.
That would also help in cases where the filesystem goes read-only for some reason.
Oh, and manually specifying a command doesn't really work here, at least not unless dCache were to examine its exit status and use e.g. 1 for disable-the-pool and 2 for set-it-read-only.
Ah... stupid me... it seems to do just that (though with different numbers).
Having thought more carefully about it, I think the current functionality of pool.check-health-command doesn’t fully help.
I mean, what I'd want is an exit status that means "don't write new files to the pool, but continue to read and delete"... but 1, which means read-only, would probably also prevent deletions (and deletions might actually be what makes the (full) pool writable again).
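Just for illustration, a custom check could map filesystem state to those statuses, along these lines (a hypothetical Java sketch: 1 = read-only as above, any other non-zero code disables, and I assume that 0 means healthy and that the pool directory is passed as an argument):

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical check to be run via pool.check-health-command.
// Exit 0: pool stays enabled; exit 1: pool goes read-only;
// any other exit code (or a crash) disables the pool.
public final class PoolHealthCheck {
    public static void main(String[] args) {
        try {
            Path poolDir = Path.of(args[0]);  // assumption: pool dir passed as argument
            FileStore store = Files.getFileStore(poolDir);
            if (store.isReadOnly()) {
                System.exit(1);               // fs went read-only: still serve reads
            }
            // Caveat: getUsableSpace() reflects data space only; it would not
            // catch the btrfs metadata exhaustion above, where data space
            // still looks free.
            if (store.getUsableSpace() < 64L * 1024 * 1024) {
                System.exit(1);               // nearly full: stop taking new files
            }
            System.exit(0);                   // healthy
        } catch (IOException | RuntimeException e) {
            System.exit(2);                   // cannot even inspect: disable
        }
    }
}
```

But again: status 1 would then also block the very deletions that might make a full pool writable again.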
I guess problems like these will become more and more common: all modern filesystems are CoW (even XFS is going that way), not only btrfs.
The current definition, "Any other exit code, including failure to execute the script, will disable the pool.", is also a bit unfortunate, as it doesn't allow extending the interface with further special exit statuses without possibly breaking existing user scripts.
The reason behind the current behavior is to stop accepting new data when you know that the RAID system is doing a rebuild. We are aware that there are other cases, too; we address them step by step.