varnish-cache icon indicating copy to clipboard operation
varnish-cache copied to clipboard

high watermarks for workspaces iteration 2

Open nigoroll opened this issue 3 years ago • 1 comments

This is the continuation of #3285 based on the bugwash feedback.

It is submitted as a draft PR to gather feedback first before spending more time on the missing bits, which are:

  • finish a proper user reporting interface, I would suggest a new SLT_Resource for resource consumption tracking.
  • add reporting for the backend side
  • complete the tracking of used space from reservations, such that at_least does not show up for varnish-cache any more.

I am particularly interested in feedback on the changes to the ws interface: Is this the way we want to go?

Primary commit message:


We add tracking of a "high water" pointer to workspaces.

It is tracked automatically for normal allocations and is preserved across resets.

For reservations, the used space needs to be tracked explicitly. If explicit tracking is missing, the "at least" flag is raised if the reserved space exceeded the high water mark.

This could be done more intelligently, "at least" could be cleared if a following allocation exceeds a previous reservation, but the way to go is to have proper reporting.

For user reporting, a prototypical VSL is implemented as SLT_Debug, for example:

**** v1    vsl|       1001 Debug           c ws highwater client = 77684
**** v1    vsl|       1001 Debug           c ws highwater session = 120
**** v1    vsl|       1001 Debug           c ws highwater thread = 2360

With the "at least" flag present, = is replaced by >= as in this example:

*** v1    vsl|       1027 Debug           c ws highwater client >= 77761

To avoid struct ws to grow by another pointer size, the magic value is reduced to a uint16_t to make room for the flags.

nigoroll avatar Sep 01 '22 16:09 nigoroll

Some thoughts after a quick glance through the diff.

I understand this was only a draft, but even there I would have preferred a break down:

  • magic change
  • highwater api
  • panic reporting
  • api usage in tasks

I noticed a change in HSH_Purge() that looks very intentional, but is not documented. A breakdown of the change would have helped highlighting it. Remembering my review of Martin's pull request that led to this if (n_max < 2) I think we shouldn't change it. There may be a cache locality improvement, but hopefully it should be a rare case too.

For the same struct size, you could turn h into an unsigned and have room for up to 32 flags. We could get rid of the id first character uppercase overflow marker and consume 3 of those flags already. I have a 4th potential "sealed" flag for another use case I'm not actively researching (sitting somewhat deep in my backlog). That would also mean leaving the magic number unchanged.

dridi avatar Sep 02 '22 07:09 dridi