vespa icon indicating copy to clipboard operation
vespa copied to clipboard

Different behaviours of visitor in GET and UPDATE operations

Open danitico opened this issue 1 year ago • 2 comments

Describe the bug We have identified different behaviours in the visitor process using the same selection criteria. We are using an imported field to define the selection criteria, in this case, music.isTop == true.

To Reproduce Visit this repo to reproduce the bug https://github.com/danitico/bug-referenced-fields

Expected behavior See the repo for the expected behaviour Visit this repo to reproduce the bug https://github.com/danitico/bug-referenced-fields

Screenshots None

Environment (please complete the following information):

  • OS: macOs
  • Infrastructure: self-hosted
  • Versions 14

Vespa version 8.270.8

Additional context None

danitico avatar Dec 12 '23 12:12 danitico

Will be looked at early January 2024.

geirst avatar Dec 13 '23 12:12 geirst

Thank you for the comprehensive bug report! I was able to quickly reproduce your findings.

It is not intentional from our side that this use case does not work as expected, though the reason for why it doesn't work is a bit involved.

Root cause (short version)

Conditional updates do not support imported fields as part of their test-and-set expression. This transitively also applies to updates sent as part of batch operations.

Root cause (longer, exciting version)

From a high-level perspective, batch updates are currently implemented as two interleaved, concurrent phases:

  1. The Document V1 request handler starts a visiting session using the provided selection (in this case containing an imported field). Candidate document IDs to update are streamed back to the request handler.
  2. Concurrently, for each received document ID, send an asynchronous Update operation containing the field update operations specified by the client. To avoid race conditions with unrelated client writes to the same document, each update operation includes the provided selection verbatim as its test-and-set condition.

Including the request selection as a test-and-set predicate prevents overwriting a more recent document version if it does not match the original selection.

However, the crux of the issue observed here is that test-and-set conditions cannot refer to imported fields. Attempting to reference an imported field will always fail the update upon condition evaluation. I'll happily go into more details on why this restriction exists in the (empirically speaking, somewhat unlikely...! 👀) event that anyone's interested.

This means that although phase 1 will return the expected set of documents to update, the actual updates will all end up failing during phase 2. We failed to catch this particular interaction with imported fields during the design of the batch update functionality.

Side note: that all the updates end up failing silently is not expected—the error messages in this case would explicitly state that imported fields are not supported. This would at least have immediately surfaced the underlying problem. We will look into the error reporting semantics for batch updates.

Workarounds

I can't think of any good ways to get imported fields in selections to play nicely with the current batch update semantics.

The most pragmatic workaround is likely going to be to "manually" perform the two phases above by combining visiting with sending document updates, but to either not include a test-and-set condition in the updates sent, or to include one that does not refer to imported fields but still prevents any races from concurrent writes (using an explicit sequence number field, or similar).

This can be done using the Document V1 API, but will be less performant than using batch updates due to the increased overhead.

Follow-up on the Vespa side

I think we do want to support the combination of batch update and imported fields (despite imported fields not being supported for standalone document updates), as a pseudo-"inner join" is a powerful primitive.

I've been pondering a few different approaches, but all of them require a non-trivial amount of work. I can therefore not offer any estimates of when the feature will be available. I hope that you can find a usable workaround until then.

Please let me know if you have any questions.

vekterli avatar Jan 04 '24 15:01 vekterli