infra icon indicating copy to clipboard operation
infra copied to clipboard

Refactor dirty page tracking

Open ValentaTomas opened this issue 1 month ago • 5 comments

[!NOTE] Replaces dirty page tracking with a thread-safe Tracker and updates UFFD, sandbox, and build components to use it, adding unregister support and minor API changes.

  • Block / Tracking:
    • Introduce block.Tracker (thread-safe bitset) with Has, Add, Reset, Offsets, BitSet, Clone; unit tests added.
    • Extend Slicer with BlockSize().
  • UFFD:
    • Change constructor to uffd.New(memfile, socketPath); remove TrackedSliceDevice usage.
    • MemoryBackend API: Disable(ctx) now returns *block.Tracker.
    • Implement unregister via UFFDIO_UNREGISTER; Uffd.Disable unregisters then returns Tracker from Dirty().
    • Track faults via block.Tracker with bounded concurrency; expose offsets iterator in cross-process tests.
    • Buffer Ready() channel.
  • Sandbox Integration:
    • On pause, call s.memory.Disable(ctx) and pass dirty.BitSet() to MemoryDiffCreator.
    • serveMemory updated for new UFFD constructor.
  • Build / Diff:
    • Add BlockSize() to NoDiff, localDiff, and StorageDiff.
  • Utils:
    • Add utils.TransformTo iterator helper.

Written by Cursor Bugbot for commit b6a0c3a4ba221432bc2ad32490194c45fd7c9fad. This will update automatically on new commits. Configure here.

ValentaTomas avatar Nov 03 '25 22:11 ValentaTomas

I want to do a few more end to end tests, but it should be ready to merge right after https://github.com/e2b-dev/infra/tree/uffd-refactor

ValentaTomas avatar Nov 04 '25 04:11 ValentaTomas

@codex review

ValentaTomas avatar Nov 05 '25 23:11 ValentaTomas

💡 Codex Review

https://github.com/e2b-dev/infra/blob/116062b7729f46356f334abb7c158dea50876bfd/packages/orchestrator/internal/sandbox/uffd/userfaultfd/serve.go#L109-L113

P1 Badge Handle UFFD remove events after unregister The new Disable(ctx) issues UFFDIO_UNREGISTER, which causes the kernel to emit UFFD_EVENT_REMOVE notifications on the userfaultfd. The serve loop still treats every non‑UFFD_EVENT_PAGEFAULT as an error and immediately returns ErrUnexpectedEventType. After disabling memory this will report a failure, tear down the goroutine and mark the backend as errored even though the event is expected. The loop should explicitly ignore or handle UFFD_EVENT_REMOVE (and possibly other control events) once unregistering is supported.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Isn't UFFD_EVENT_REMOVE triggered by MADV_DONTNEED/MADV_REMOVE, which are not actually part of the UFFD unregister signalization—they are part of the UFFD memory management which we are not using. Not 100% sure though.

ValentaTomas avatar Nov 05 '25 23:11 ValentaTomas

Summary

Testing

  • ⚠️ go test ./packages/orchestrator/internal/sandbox/uffd/... (fails: go: command not found)

View task →

I'll do some last end-to-end testing and then we should be able to merge.

ValentaTomas avatar Nov 10 '25 22:11 ValentaTomas