tusd icon indicating copy to clipboard operation
tusd copied to clipboard

Make use of locker mandatory

Open Acconut opened this issue 1 year ago • 8 comments

Currently, a composer can be created without specifying a lock provider. Such setups should only be used for testing purposes but not for production as explained in https://tus.github.io/tusd/advanced-topics/locks/.

We should make the use of a locker mandatory to make this clear. This is also motivated by questions such as in https://github.com/tus/tusd/issues/1110, where users where unsure whether they need a locker. This change would also allow us to remove some code checking for the existence of a lock provider.

However, this must be done in a major release since this would be a breaking change.

Acconut avatar Apr 23 '24 09:04 Acconut

Why not simply add a default in case it's missing? That's what we do in tus Node.js and it wouldn't be a breaking change.

Murderlon avatar Apr 23 '24 09:04 Murderlon

Maybe I have to differentiate a bit more: For users of the tusd CLI we already add lockers by default, so for them nothing would change. For users of the tusd package, we don't set a default storage and also no default locker since they proper implementation (in-memory, file-based, distributed) depends on their use case. We could default to in-memory locks, but might work well in development and then fail in production without users understanding what is going on. I would prefer if we make our users more aware of this decision they have to make, which would be the case if we error out on a missing lock implementation. Of course, all of this must be accompanied by proper documentation.

Acconut avatar Apr 23 '24 10:04 Acconut

We could default to in-memory locks, but might work well in development and then fail in production without users understanding what is going on

I would argue the exact opposite, accidentally not using a lock is what causes failures in production without users understanding. With Go I'm much more reluctant for major versions than in JS, where you have better version tooling. So with that in mind I'd say preventing a major version at all cost and defaulting to memory locker is better.

proper implementation (in-memory, file-based, distributed) depends on their use case

If you run a complex distributed setup you are already aware concepts such as lockers need to change. With good docs indicating this, I'd say it's much less likely for an accidental memory locker to cause problems than it is for the majority of people with simpler setups to forget to add a lock if they don't upgrade.

Murderlon avatar Apr 23 '24 13:04 Murderlon

I think Go made it pretty easy to work with major versions. I would not shy away from major version bumps.

mitar avatar Apr 23 '24 20:04 mitar

There's a difference between shying away from major versions for the sake of it and considering alternatives which may prevent it without loosing out on quality. Not matter how you put it, major versions are not ideal. And the main argument was this:

If you run a complex distributed setup you are already aware concepts such as lockers need to change. With good docs indicating this, I'd say it's much less likely for an accidental memory locker to cause problems than it is for the majority of people with simpler setups to forget to add a lock if they don't upgrade.

Murderlon avatar Apr 24 '24 08:04 Murderlon

If locking is made mandatory I think a "null" lock implementation that does nothing should be provided.

Reason being that some stores might be able to implement consistency guarantees internally. E.g. Azure Blob Storage offers blob leases and since lease id needs to be passed to every blob operation after acquiring it implementing it outside of azurestore might not be the best choice.

quality-leftovers avatar May 06 '25 11:05 quality-leftovers

Reason being that some stores might be able to implement consistency guarantees internally. E.g. Azure Blob Storage offers blob leases and since lease id needs to be passed to every blob operation after acquiring it implementing it outside of azurestore might not be the best choice.

I don't think this would be sufficient. While this approach might make individual operations of the data storage (e.g. WriteChunk) safe for concurrent use, request handling often uses multiple operations that must not interfere with other request handling. For example, a PATCH request handler usually calls GetInfo followed by WriteChunk. If the upload gets deleted between these calls by a DELETE request, the PATCH handler will run into problems. A locker overcomes this problem by ensuring exclusive access to upload resources.

In addition, the locker facilities lock releases if previous requests are hanging and the client resumes the upload. Something that's hard to implement in the data store.

Acconut avatar May 06 '25 11:05 Acconut

For azure storage I believe you could put a lease on the data blob during WriteChunk just before "putting" the chunk and validate the offset does not differ from the previously known offset. If offset is unchanged then you can go ahead, otherwise the client needs to re-try.

However I just noticed that the in-memory lock wouldn't hurt in that case either. So the need for a nop-lock really ain't there.

quality-leftovers avatar May 07 '25 12:05 quality-leftovers