Botond Dénes
> No. User tables are not enabled for RBNO yet.

Ok, then my analysis in https://github.com/scylladb/scylladb/issues/22941#issuecomment-2677659830 is probably bogus. I guess the core will tell us the truth.
Looking at the core. Fortunately I wrote a script when analyzing https://github.com/scylladb/scylladb/issues/22244, to calculate the repair memory consumption. Unfortunately I lost the script and I have to rewrite it from...
```
(gdb) scylla repairs -m
Repairs for which this node is leader:
(repair_meta*) 0x60503ab7f7b0: id=19197, table=large_collection_test.table_with_large_collection, reason=2, row_buf={len=0, memory=0}, working_row_buf={len=30, memory=48208512}, same_shard=True, tablet=False
host: 496e8b0c-50bf-4ada-b8f9-3d167138e908, shard: 5, state: repair_state::get_combined_row_hash_finished
host:...
```
I made sure to save the script this time: https://github.com/scylladb/scylladb/pull/23075.
```
(gdb) scylla memory
Used memory: 9198370816
Free memory: 56360960
Total memory: 9254731776

LSA:
  allocated: 8457437184
  used: 8457306112
  free: 131072

Cache:
  total: 6379798528
  used: 5424832960
  free: 954965568

Memtables:
  total: 2077638656...
```
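As a quick sanity check on these numbers, here is a minimal Python sketch that cross-checks the totals and prints each component's share. The values are copied from the `scylla memory` output above (which is truncated after the memtables total); the variable names and the `share` helper are illustrative and not part of any Scylla tooling.

```python
# Values copied verbatim from the `scylla memory` output above (bytes).
used_memory = 9198370816
free_memory = 56360960
total_memory = 9254731776
lsa_allocated = 8457437184
cache_total = 6379798528
cache_used = 5424832960
cache_free = 954965568
memtables_total = 2077638656


def share(part, whole):
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


# Internal consistency checks on the dump.
assert used_memory + free_memory == total_memory
assert cache_used + cache_free == cache_total
assert cache_total + memtables_total == lsa_allocated

report = {
    "free % of total": share(free_memory, total_memory),
    "LSA % of total": share(lsa_allocated, total_memory),
    "cache % of LSA": share(cache_total, lsa_allocated),
    "memtables % of LSA": share(memtables_total, lsa_allocated),
}
for name, pct in report.items():
    print(f"{name}: {pct:.1f}")
```

The checks pass: cache plus memtables add up exactly to the LSA allocation, LSA holds roughly 91% of this shard's memory, and free memory is well under 1%.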
Reclaiming in the cache does seem to be enabled:
```
(gdb) scylla databases
0 (replica::database*)0x6050039fe010
1 (replica::database*)0x6010039fe010
2 (replica::database*)0x6020039fe010
3 (replica::database*)0x6030039fe010
4 (replica::database*)0x6040039fe010
5 (replica::database*)0x6050039fe010
6 (replica::database*)0x6060039fe010
7 (replica::database*)0x6070039fe010
8...
```
```
(gdb) p &$dereference_smart_ptr($4->_impl)
$5 = (logalloc::tracker::impl *) 0x6050000ddcc0
(gdb) p $5->_reclaiming_disabled_depth
$6 = 0
```
Reclaiming seems to be enabled at the logalloc tracker level too. I don't understand,...
Ah, I see reclaim has built-in failure: `failed_reclaims_allowed`. After 10 attempts to reclaim a segment, we give up.
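To illustrate why reclaim can fail even though it is enabled, here is a toy Python model of a bounded-retry loop. This is not the logalloc code: only the name `failed_reclaims_allowed` and the limit of 10 come from the comment above. The `try_evict` callable, the return shape, and the assumption that every failed attempt counts toward the limit are hypothetical.

```python
# Toy model only: NOT the logalloc implementation.
FAILED_RECLAIMS_ALLOWED = 10  # limit mentioned in the comment above


def reclaim_segment(try_evict, failed_reclaims_allowed=FAILED_RECLAIMS_ALLOWED):
    """Try to free one segment; give up after too many failed attempts.

    try_evict is a hypothetical callable that returns True if an eviction
    freed enough space for a segment, and False otherwise.
    Returns (success, attempts_made).
    """
    failures = 0
    attempts = 0
    while failures < failed_reclaims_allowed:
        attempts += 1
        if try_evict():
            return True, attempts
        failures += 1  # assumption: every failed attempt counts
    return False, attempts


# An evictor that never makes progress: the loop gives up.
print(reclaim_segment(lambda: False))  # (False, 10)

# An evictor that succeeds on its third call.
outcomes = iter([False, False, True])
print(reclaim_segment(lambda: next(outcomes)))  # (True, 3)
```

In this model, a reclaimer with evictable data can still report failure if each of its bounded attempts fails to free a whole segment.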
This reclaim code is mind-bendingly complex. After reading more, I again think it should have succeeded in evicting from the cache.