neon icon indicating copy to clipboard operation
neon copied to clipboard

walingest: prepare for exhaustive match and handling of no-ops

Open problame opened this issue 1 year ago • 9 comments

Use rust type system to force the "special treatment" block to declare the expected outcome of special treatment.

The TODO comments will be addressed in future commits, for now let's just encode our expectations.

This commit doesn't compile yet because a bunch of else { ... } blocks are missing. I'll ask Postgres-savvy colleagues to fill in here.

refs https://github.com/neondatabase/neon/issues/5962

problame avatar Nov 30 '23 16:11 problame

full report


Failed to create a summary for the test run:

TypeError: parentSuite.children is not iterable
    at parseReportJson (/__w/neon/neon/scripts/comment-test-report.js:60:41)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async module.exports (/__w/neon/neon/scripts/comment-test-report.js:242:28)
    at async eval (eval at callAsyncFunction (/__w/_actions/actions/github-script/v6/dist/index.js:15143:16), <anonymous>:14:1)
    at async main (/__w/_actions/actions/github-script/v6/dist/index.js:15236:20)

To reproduce and debug the error locally run:

scripts/comment-test-report.js https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6002/7412733237/data/suites.json

Test coverage report is not available

The comment gets automatically updated with the latest test results
4a7ea3f533ca0739817d526cbf6d157d15e44bfa at 2024-01-04T23:59:41.858Z :recycle:

github-actions[bot] avatar Nov 30 '23 17:11 github-actions[bot]

I cannot say I like this even if it's already approved, the direction I was nudging towards with the triplet matching felt much better. Feels like there must be a more reliable way to express this, otherwise one could express that an arm does changed, but returns some Noop, what would we do then?

koivunej avatar Nov 30 '23 17:11 koivunej

I filled in the blanks of records that we know are no-ops. I didn't try to compile it, but I believe the finishing this now won't require any particular Postgres knowledge.

hlinnaka avatar Nov 30 '23 18:11 hlinnaka

but I believe the finishing this now won't require any particular Postgres knowledge.

Thanks, that's exactly what I wanted to get from you here :)

problame avatar Dec 01 '23 12:12 problame

@hlinnaka sorry for the long turnaround.

Your changes didn't compile.

I pushed a bunch of fixups, but, in some places I think postgres-specific knowledge or at least very specific knowledge of our ingest code is required.

Please push more fixups into this branch to make it compile. No force pushes, no rebases, no merges from main please.

problame avatar Jan 03 '24 18:01 problame

Please push more fixups into this branch to make it compile. No force pushes, no rebases, no merges from main please.

Ok, done

hlinnaka avatar Jan 03 '24 22:01 hlinnaka

This still needs more work.

While reviewing this PR, I took a closer look at the ingest_heapam_record.

Product of that is https://github.com/neondatabase/neon/pull/6271 , would like to get it merged and then resume work on this PR on top of it.

problame avatar Jan 04 '24 15:01 problame

Pushed two [PRE-MERGE] commits that show where this PR would be going on top of #6271

I can't say I like it :(

I think we have to separate the two goals:

  • exhaustive matching of wal records
    • I don't really feel qualified for this, since I don't know enough Postgres.
    • I still believe we should do this: if we don't know a record rmgr or id, don't ingest it.
    • This is an orthogonal feat to the next point, though
  • handling of no-op records
    • the problem with no-op records is described in https://github.com/neondatabase/neon/issues/5962
    • My goal when I started this PR was to add a counter for no-op records
    • Since starting the PR, we added filtered WAL ingest for sharding (#6024)
    • ... which uses modification.len() > prev_len at the end of ingest_record to check whether the call made any modifications
      • I'm not sure this is 100% correct, e.g., we could be overwriting a key in the modification. But, it's a good enough detector, esp for the case where we ingest one record at a time
    • That PR also added metrics to count total processed vs skipped records during ingest.
    • => I don't need this PR anymore to determine how big of a problem #5962 is which I want to know to safely ship #5479

So, I'm inclined to strip this PR down to just the pieces that move us towards exhaustive matching of wal records.

Thoughts?

problame avatar Jan 04 '24 17:01 problame

  • exhaustive matching of wal records

    • I don't really feel qualified for this, since I don't know enough Postgres.
    • I still believe we should do this: if we don't know a record rmgr or id, don't ingest

Pushed the stripped down version as a separate PR to https://github.com/neondatabase/neon/pull/6276

problame avatar Jan 05 '24 10:01 problame

Moved out of storage team tasks board (no problem to keep a draft refactor PR around, but we don't need to track it there).

jcsp avatar Mar 11 '24 13:03 jcsp

Abandoned in favor of https://github.com/neondatabase/neon/pull/6276

problame avatar Jun 19 '24 14:06 problame