Epic: s3 in pageserver stage 2
This is a new epic covering S3 enhancements in pageserver. It replaces #977.
Divided by priority:
Must have:
- [x] ensure observability of S3 integration (#1798)
- [x] create dashboards for S3 metrics (probably one per pageserver and a global one)
- [x] #1579
- [x] #999
- [x] #1887
- [ ] #1559
- [ ] #1607
Nice to have:
- [ ] #1557
- [ ] introduce a separate delete endpoint in pageserver so that we delete files from S3
- [ ] invalidate remote index cache on detach (or other command?) to allow remote JSON file updates without pageserver restart
- [ ] #987 postponed until supported by the SDK. We plan to support page-level checksums as a partial protection from corruption (#1185)
- [ ] In pageserver recovery we may push the whole timeline dir as a checkpoint, not only the necessary data. See https://github.com/zenithdb/zenith/blob/adb0b3dadaf5a4f099e13adeaf8291c2d4b94550/pageserver/src/remote_storage/storage_sync.rs#L725. This data may also contain garbage, since we do not have checksums right now. We should probably avoid checking every file in the timeline dir on startup and check only files that were created after the disk-consistent LSN. Alternatively, we could just remove possibly incorrect files and rebuild the delta from WAL
- [ ] #1558
- [x] ~~check available space in attach call, by calculating total data size based on metadata json~~ not really needed because of the planned on-demand implementation
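
The recovery item above suggests checking only files created after the disk-consistent LSN instead of every file in the timeline dir. A minimal sketch of that split follows. It is not the pageserver's code: `Lsn`, `LocalFile`, its `end_lsn` field, and `split_by_disk_consistent_lsn` are hypothetical names, and it assumes each file's highest covered LSN is already known.

```rust
/// Hypothetical stand-in for an LSN; the real pageserver has its own type.
type Lsn = u64;

/// Hypothetical description of a local file in the timeline dir.
#[derive(Debug, Clone, PartialEq)]
struct LocalFile {
    name: String,
    /// Highest LSN covered by this file (assumed to be known up front).
    end_lsn: Lsn,
}

/// Split files into two groups:
/// - files at or below `disk_consistent_lsn`, assumed safe to push as-is;
/// - files above it, which are candidates for verification or for
///   removal followed by a rebuild from WAL.
fn split_by_disk_consistent_lsn(
    files: Vec<LocalFile>,
    disk_consistent_lsn: Lsn,
) -> (Vec<LocalFile>, Vec<LocalFile>) {
    files
        .into_iter()
        .partition(|f| f.end_lsn <= disk_consistent_lsn)
}
```

Only the second group would need to be checked or dropped on startup; whether to verify those files or delete them and replay WAL is the open question in the item above.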
@LizardWizzard can you please re-order the subtasks into groups (todo, postponed, etc.)?
@LizardWizzard, can we close the epic (because there are no open major items) and create a new epic for the leftovers?
@vadim2404 See #2784