                        pageserver: test efficiency of KeySpaces when sharding with smaller stripes
We use "keyspaces" in several places to represent sets of contiguous key ranges. In a single-sharded tenant, these ranges generally correspond to the ranges of used blocks within a relation.
With sharding, it becomes normal to see gaps within a relation, where those gaps correspond to keys that don't belong to this tenant shard. This is not very expensive with a large stripe size (such as the default 256MiB), but becomes rather inefficient with small stripe sizes (e.g. 1MiB). Consider a 1TiB database: with shard_count=4 and stripe_size=1MiB, a keyspace would end up with roughly 250,000 extents.
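The arithmetic behind that figure can be sketched as follows (illustrative only, not pageserver code; assumes stripes are distributed round-robin and evenly across shards):

```rust
// Count the stripe-sized extents one shard owns in a relation of a given
// size. With round-robin striping, consecutive stripes belong to different
// shards, so each owned stripe becomes a separate keyspace range.
fn extents_per_shard(relation_bytes: u64, stripe_bytes: u64, shard_count: u64) -> u64 {
    let total_stripes = relation_bytes / stripe_bytes;
    // Assuming stripes are spread evenly across shards.
    total_stripes / shard_count
}

fn main() {
    let tib = 1u64 << 40;
    let mib = 1u64 << 20;
    // Default config: 256MiB stripes, 4 shards -> few, large extents.
    println!("256MiB stripes: {} extents", extents_per_shard(tib, 256 * mib, 4));
    // Small stripes: 1MiB, 4 shards -> ~262k extents per shard.
    println!("1MiB stripes:   {} extents", extents_per_shard(tib, mib, 4));
}
```

1TiB / 1MiB gives 1,048,576 stripes in total, so each of 4 shards owns 262,144 of them, which matches the ~250,000 figure above.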
Small stripes give a statistically more uniform distribution of work across shards, although we would probably never want to go below ~1MiB, as that would lose spatial locality on streaming reads.
We could fix this by making keyspaces shard-aware, so that a missing key is not considered a gap unless it belongs to the current shard.
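A minimal sketch of that shard-aware gap check, using hypothetical names (the real pageserver types and striping details differ):

```rust
/// Hypothetical shard identity: which shard we are, how many shards exist,
/// and the stripe size in blocks.
#[derive(Clone, Copy, Debug)]
struct ShardIdentity {
    shard_number: u32,
    shard_count: u32,
    stripe_size_blocks: u32,
}

impl ShardIdentity {
    /// Round-robin striping: block -> stripe -> owning shard.
    fn owns_block(&self, block: u32) -> bool {
        let stripe = block / self.stripe_size_blocks;
        stripe % self.shard_count == self.shard_number
    }

    /// A hole [hole_start, hole_end) in the key range only counts as a
    /// real gap for this shard if at least one block inside it is ours.
    /// (A linear scan is fine for a sketch; a real implementation would
    /// compute this from stripe boundaries directly.)
    fn is_gap(&self, hole_start: u32, hole_end: u32) -> bool {
        (hole_start..hole_end).any(|b| self.owns_block(b))
    }
}
```

With this, a hole that covers only other shards' stripes is ignored, so the keyspace stays contiguous from this shard's point of view.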
On second thought, we might not have a problem here: Timeline::collect_keyspace generates keyspace ranges from the relation's start block and size, so the ranges it produces cover all the stripes in a relation and there is no fragmentation.
KeySpaceRandomAccum is more vulnerable to this issue, but it is only used in gc_timeline(), with add_range calls that describe layer files.
So what we really need here is probably just a test that inspects the keyspaces of a finely sharded tenant and checks that they haven't blown up, to protect against possible regressions in this area.
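The contrast such a test would check can be modeled with a toy accumulator (not pageserver code) that sorts and coalesces ranges, as an accumulator like KeySpaceRandomAccum would: per-stripe additions stay fragmented, while a single relation-level range (as collect_keyspace produces) stays whole.

```rust
// Sort half-open ranges and merge adjacent/overlapping ones.
fn coalesce(mut ranges: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    ranges.sort();
    let mut out: Vec<(u32, u32)> = Vec::new();
    for (start, end) in ranges {
        match out.last_mut() {
            // Touching or overlapping the previous range: extend it.
            Some(last) if last.1 >= start => last.1 = last.1.max(end),
            _ => out.push((start, end)),
        }
    }
    out
}

fn main() {
    // Shard 0 of 4, stripe size 16 blocks, relation of 256 blocks:
    // this shard owns stripes 0, 4, 8, 12 -> four disjoint ranges.
    let owned: Vec<(u32, u32)> = (0u32..16)
        .filter(|s| s % 4 == 0)
        .map(|s| (s * 16, (s + 1) * 16))
        .collect();
    // One range per owned stripe: fragmentation grows with stripe count.
    assert_eq!(coalesce(owned).len(), 4);
    // A relation-level range stays a single extent.
    assert_eq!(coalesce(vec![(0, 256)]).len(), 1);
}
```

A regression test along these lines would assert that the range count of a shard's collected keyspace stays near the relation count rather than growing with the number of stripes.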
Folding this into https://github.com/neondatabase/neon/issues/6774