John Spray
> What if the remote storage lists some but not all shards, e.g. because we're in the middle of Tenant::split_prepare ? I need to think about this more. The hardest...
> The hardest case is if we list things to do GC, then a new split happens while we are GC'ing, which references layers we will delete. This is only...
> What if the remote storage lists some but not all shards, e.g. because we're in the middle of Tenant::split_prepare ? Considering the "hard case" of multiple shard splits 0000...
> About AncestorRefs: why is it not a more physical refcount map, like on the absolute_key ? Sure, a bit less memory efficient but much harder to screw up. Memory...
> I would prefer we call it parent shard, so at least there is a minimal distinction from the word ancestor which we already use for Timeline. The use of "ancestor"...
Next step: - try removing materialized page cache & establish the impact -- what else needs to change?
You don't mention how the tenant is sharded. The pageserver isn't meant to handle arbitrarily large shards -- large databases should be broken up into many shards, even if you...
Regarding concurrency: the way I think about compaction throughput is "how fast does it need to be to support peak write rate?". With sharding, as well as each shard being...
I think we usually see either: - That image layer generation goes fast enough that it's done with a layer by the time it would be evicted. - That we're...
> What about we just not evict layers that were downloaded for compaction? This would lead to a large amount of wasted space: once an image layer has been generated that...