workerd icon indicating copy to clipboard operation
workerd copied to clipboard

Implement state-machine utility

Open jasnell opened this issue 3 weeks ago • 7 comments

Scratching a bit of a long term itch here. The bulk of the implementation here was done by claude under heavy supervision and iteration.

Throughout the code base we make use of kj::OneOf for state machines. The typical pattern is something like:

struct Open { /* ... */ };
struct Ended { /* ... */ };
kj::OneOf<Open, Ended, kj::Exception> state;

This pattern is simple and effective but suffers from a lack of guard rails. For instance, state transitions can be made out of order, or can be made while there are still dangling references. Over the years we've had numerous cases of UAF's and other issues due to invalidated references after incorrectly timed state changes, etc. The direct use of kj::OneOf is effective but can be error prone.

This PR introduces a new state-machine.h utility that thinly wraps around kj::OneOf while putting some guard rails in place to protect against state transitions occuring at the wrong time, reducing the likelihood of UAFs and other memory issues, and generally just improving the quality of the code. An example use in compression.c++ is included as a proof point.

  ComposableStateMachine<
      TerminalStates<Ended>,
      ErrorState<kj::Exception>,
      ActiveState<Open>,
      Open,
      Ended,
      kj::Exception> state;

The intention would be to first transition the code in api/streams to use this, then apply usage more generally across the code base.

The specific implementation of the tool was an exercise in using claude. It required a decent amount of back and forth and refinement. Claude definitely did not get everything right in the first, second, or even third pass so the session was highly interactive, requiring quite a bit of back and forth with the tool to generate the right behavior. Test coverage should be complete and the tests detail how the utility should be used. There are also extensive doc comments throughout. See the changes in compression.c++ for an illustration of how existing code would be migrated.

The code comments go into significant detail on how the mechanism works, and even includes a migration guide. A significant portion of the LOC changed in the PR are just comments. The test provides full coverage and plenty of examples on how the API would be used. For review, I recommend starting with the code comments and tests in order to best understand what the code is doing. Then, move on to the actual implementation detail.

Overall: this is part of the streams cleanup effort. One of the issues we've had with the streams implementation are unclear/incorrect streams state transitions leading to things like UAFs. This is part of the effort to clean up that implementation to make it safer overall.

~~It's a large PR, yes, but there's exceedingly little risk here. The only functional changes are to compression.c++, which already has good test coverage. The changes there are minimal and shouldn't introduce any risk.~~

jasnell avatar Dec 10 '25 18:12 jasnell