neon
neon copied to clipboard
walingest: prepare for exhaustive match and handling of no-ops
Use rust type system to force the "special treatment" block to declare the expected outcome of special treatment.
The TODO comments will be addressed in future commits, for now let's just encode our expectations.
This commit doesn't compile yet because a bunch of else { ... }
blocks are missing. I'll ask Postgres-savvy colleagues to fill
in here.
refs https://github.com/neondatabase/neon/issues/5962
full report
Failed to create a summary for the test run:
TypeError: parentSuite.children is not iterable
at parseReportJson (/__w/neon/neon/scripts/comment-test-report.js:60:41)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async module.exports (/__w/neon/neon/scripts/comment-test-report.js:242:28)
at async eval (eval at callAsyncFunction (/__w/_actions/actions/github-script/v6/dist/index.js:15143:16), <anonymous>:14:1)
at async main (/__w/_actions/actions/github-script/v6/dist/index.js:15236:20)
To reproduce and debug the error locally run:
scripts/comment-test-report.js https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6002/7412733237/data/suites.json
Test coverage report is not available
4a7ea3f533ca0739817d526cbf6d157d15e44bfa at 2024-01-04T23:59:41.858Z :recycle:
I cannot say I like this even if it's already approved, the direction I was nudging towards with the triplet matching felt much better. Feels like there must be a more reliable way to express this, otherwise one could express that an arm does changed, but returns some Noop, what would we do then?
I filled in the blanks of records that we know are no-ops. I didn't try to compile it, but I believe the finishing this now won't require any particular Postgres knowledge.
but I believe the finishing this now won't require any particular Postgres knowledge.
Thanks, that's exactly what I wanted to get from you here :)
@hlinnaka sorry for the long turnaround.
Your changes didn't compile.
I pushed a bunch of fixups, but, in some places I think postgres-specific knowledge or at least very specific knowledge of our ingest code is required.
Please push more fixups into this branch to make it compile. No force pushes, no rebases, no merges from main
please.
Please push more fixups into this branch to make it compile. No force pushes, no rebases, no merges from main please.
Ok, done
This still needs more work.
While reviewing this PR, I took a closer look at the ingest_heapam_record
.
Product of that is https://github.com/neondatabase/neon/pull/6271 , would like to get it merged and then resume work on this PR on top of it.
Pushed two [PRE-MERGE]
commits that show where this PR would be going on top of #6271
I can't say I like it :(
I think we have to separate the two goals:
- exhaustive matching of wal records
- I don't really feel qualified for this, since I don't know enough Postgres.
- I still believe we should do this: if we don't know a record rmgr or id, don't ingest it.
- This is an orthogonal feat to the next point, though
- handling of no-op records
- the problem with no-op records is described in https://github.com/neondatabase/neon/issues/5962
- My goal when I started this PR was to add a counter for no-op records
- Since starting the PR, we added filtered WAL ingest for sharding (#6024)
- ... which uses
modification.len() > prev_len
at the end ofingest_record
to check whether the call made any modifications- I'm not sure this is 100% correct, e.g., we could be overwriting a key in the modification. But, it's a good enough detector, esp for the case where we ingest one record at a time
- That PR also added metrics to count total processed vs skipped records during ingest.
- => I don't need this PR anymore to determine how big of a problem #5962 is which I want to know to safely ship #5479
So, I'm inclined to strip this PR down to just the pieces that move us towards exhaustive matching of wal records.
Thoughts?
exhaustive matching of wal records
- I don't really feel qualified for this, since I don't know enough Postgres.
- I still believe we should do this: if we don't know a record rmgr or id, don't ingest
Pushed the stripped down version as a separate PR to https://github.com/neondatabase/neon/pull/6276
Moved out of storage team tasks board (no problem to keep a draft refactor PR around, but we don't need to track it there).
Abandoned in favor of https://github.com/neondatabase/neon/pull/6276