varnish-cache icon indicating copy to clipboard operation
varnish-cache copied to clipboard

Drop ban_mtx during STV_BanExport()

Open nigoroll opened this issue 2 years ago • 5 comments

This change is to drop the ban mutex while stevedores are munging on the serialized ban list.

While I would value his feedback, it is my understanding that @mbgrydeland does no longer use this code.

Description from the commit message:

Stevedores now need to be prepared to potentially receive multiple export calls in parallel.

Why is this change safe?

ban_export() needs to hold the lock while serializing the ban list into the VSB, but once that is done, it can drop the lock.

Other call sites are:

  • BAN_Commit() -> ban_info_new()

    Because our new ban b could be concurrently removed by another thread while the lock is dropped, we move the dup check to before the ban_info_new() call.

    This should also have the added benefit of potentially (not) exporting more bans as completed.

  • ban_cleantail() -> ban_info_drop()

    Each iteration of the loop starts with the last ban, and the now removed ban b has been moved to the freelist, so there is no concurrent access possible any more while the lock is dropped.

  • BAN_Compile(), BAN_Shutdown()

    Called during init / fini, no concurrency here.

Other than that, we should really batch ban_export calls, especially for ban_cleantail()...

nigoroll avatar Sep 20 '22 14:09 nigoroll

The reason that the ban mutex is being held across the STV_Ban* calls aren't to protect the ban data structures from concurrent access (which as I read it the commit text gives arguments for why it would be safe), but rather to preserve the strict FIFO ordering of bans in and out and to include those STV_Ban* calls in that ordering (which again is a key foundation for our implementation at least). It looks to me like these proposed changes will undo that guarantee.

mbgrydeland avatar Sep 21 '22 11:09 mbgrydeland

@mbgrydeland can you please explain why you think the ordering of the STV_Ban* calls would change? (FWIW, I would guess you actually care about the storage_baninfo_f ordering, do you?) I would have thought that the only effect was that the next STV_BanInfo* and, potentially, STV_BanExport() calls could come in earlier (while the ban export(s) is/are still running, potentially). True, there is also an edge case that one thread releasing the ban lock could get overtaken by another, but couldn't that easily be prevented by checking the time of the latest ban in the exported list? Am I missing something else?

nigoroll avatar Sep 21 '22 13:09 nigoroll

The important part of the contract between stevedore and varnish here is that STV_BanInfoDrop() is called in the exact same FIFO order with the same ban information as the STV_BanInfoAdd() was called previously, with STV_BanExport() being a kind of special case of reset and bulk STV_BanInfoAdd() in one.

With the PR as is the mutex is released in ban_export() while STV_BanExport() is called. If STV_BanInfoAdd() is then called (e.g. the result of any client thread calling VRT_ban_string()), the stevedore can not know any longer if that ban belongs before or after the bulk list from STV_BanExport(). Releasing the mutexes across the STV_BanInfo*() calls as well opens the same race on all the incoming and outgoing calls.

So my argument is that the reason these calls are done while under the mutex is to provide the necessary serialization of these events for the stevedores, and I advise against the patch as is.

mbgrydeland avatar Sep 21 '22 14:09 mbgrydeland

@mbgrydeland I understand that the mutex in current code enforces strict serialization.

I would think, however, that the order of bans, at least as persistence is concerned, is very well defined through the ban_time(). Thus, the ordering problem with respect to an already exported ban list which you mentioned, is still not obvious to me.

That said, I can live without this patch, I just would have thought that it could contribute to overall ban performance because, for example, handling the ban export in some other async fashion, would necessitate a memcpy().

nigoroll avatar Sep 21 '22 15:09 nigoroll

Summary of bugwash: Because there is not much documentation for it, the stevedore code defines its API. by this interpretation, this PR constitutes an API change. @mbgrydeland has no particular own interest in this change, but points out that the deprecated persistent stevedore presumable relies on this behavior. I have asked him to just speak his verdict to either merge or ditch this PR. In the former case, I would look after the the deprecated persistent stevedore also, as long as there is no replacement for it to exercise tests of the API.

nigoroll avatar Oct 10 '22 15:10 nigoroll