
Duplicate detection entities redundant?

Open Davidsv opened this issue 6 years ago • 9 comments

Hi. In your image of the schema you state that "Duplicate event detection is done by automatically creating additional entity for every event, with RowKey value set to a unique identifier of a source event".

I'd like to understand why this is necessary. Isn't it enough to have the event entities (SS-SE-###) and SS-HEAD? You already get a conflict "by always including stream header entity with every write, making it impossible to append to a stream without first having a latest Etag". Additionally, trying to insert the same SS-SE-### entity twice would also conflict. Could you help me understand why SS-UID is also needed? :)

Davidsv • Aug 21 '18 13:08

UID is not the same as SeqID. You can use UID to implement idempotency; think of preventing duplicate form submissions. Imagine you're processing a command off a queue. If something fails between the moment you process the command and the moment you delete the message from the queue, the command will be retried and processed again. If your command has a unique ID (GUID), you can use it as the event UID.
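
To make the retry scenario concrete, here is a minimal in-memory sketch of a single stream partition. It is not Streamstone's API: `Partition`, `append`, `handle`, and the exact row key formats are hypothetical names chosen for illustration, and the ETag is modeled as a simple counter. The point it tries to show is that a retried handler re-reads the header first, so neither the SS-HEAD ETag check nor the SS-SE row key catches the duplicate; only the SS-UID row does.

```python
class Conflict(Exception):
    """Raised when the header ETag is stale or a row key already exists."""


class Partition:
    """Toy model of one stream partition (hypothetical, not Streamstone)."""

    def __init__(self):
        self.rows = {}     # row key -> stored value
        self.version = 0   # kept in the SS-HEAD entity
        self.etag = 0      # changes on every successful write

    def read_head(self):
        return self.version, self.etag

    def append(self, expected_etag, payload, event_id=None):
        # Optimistic concurrency: the writer must hold the latest header ETag.
        if expected_etag != self.etag:
            raise Conflict("stale ETag on SS-HEAD")
        next_version = self.version + 1
        batch = {f"SS-SE-{next_version}": payload}
        if event_id is not None:  # None plays the role of EventId.None here
            batch[f"SS-UID-{event_id}"] = next_version
        # All-or-nothing insert: any existing row key rejects the whole batch.
        for key in batch:
            if key in self.rows:
                raise Conflict(f"duplicate row key {key}")
        self.rows.update(batch)
        self.version = next_version
        self.etag += 1
        return self.version


def handle(partition, command_id, payload, use_uid):
    # A retried consumer starts from scratch, so it reads the latest header.
    _, etag = partition.read_head()
    return partition.append(etag, payload, command_id if use_uid else None)


# Without a UID the retry gets a fresh ETag and a new SS-SE key, so it succeeds.
without_uid = Partition()
handle(without_uid, "cmd-42", "OrderPlaced", use_uid=False)
handle(without_uid, "cmd-42", "OrderPlaced", use_uid=False)

# With the command ID as UID, the retry collides on the SS-UID row.
with_uid = Partition()
handle(with_uid, "cmd-42", "OrderPlaced", use_uid=True)
try:
    handle(with_uid, "cmd-42", "OrderPlaced", use_uid=True)
    retry_rejected = False
except Conflict:
    retry_rejected = True
```

In this model the first partition ends up with the same event stored twice, while the second rejects the retry and stays at version 1.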

yevhen • Aug 22 '18 05:08

Thanks. That is a great point. Could it be optional in case idempotency is not needed and you'd want to save some space (it adds up)?

Davidsv • Aug 22 '18 11:08

Sure. Just pass EventId.None.

yevhen • Aug 22 '18 12:08

What about deleting old UID entities? If I understand correctly, there is no value in storing them.

mhertis • Aug 22 '18 13:08

Well, yes. But it may not be worth the hassle.

yevhen • Aug 22 '18 14:08

Sure. I just tried using the command ID as the event ID. It turns out we have cases where one command raises multiple events, and this does not work for them. :)

mhertis • Aug 22 '18 14:08

@mhertis Use the EventInclude feature instead. That will work at the batch level.

yevhen • Aug 22 '18 19:08

If we use the EventInclude feature, the included entities are not treated as events by Streamstone: the stream version is not incremented for them, and reading from the stream does not return them.

What we have found, and what could work, is to set the ID on the first event only and leave the remaining events in the batch without an ID.
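
A rough sketch of that workaround, again as an in-memory model rather than Streamstone code: `append_batch`, the `rows` and `head` dictionaries, and the row key formats are all hypothetical. It assumes the whole batch is written atomically, so a clash on the single SS-UID row of the first event rejects every event in a retried batch.

```python
class Conflict(Exception):
    """Raised when any row key in the batch already exists."""


def append_batch(rows, head, events):
    """Atomically append (event_id, payload) pairs; event_id may be None."""
    staged = {}
    version = head["version"]
    for event_id, payload in events:
        version += 1
        staged[f"SS-SE-{version}"] = payload
        if event_id is not None:
            # Only events that carry an ID get a duplicate-detection row.
            staged[f"SS-UID-{event_id}"] = version
    clashes = [key for key in staged if key in rows]
    if clashes:
        # Nothing is written: the batch succeeds or fails as a whole.
        raise Conflict(f"duplicate row key {clashes[0]}")
    rows.update(staged)
    head["version"] = version


rows, head = {}, {"version": 0}

# One command raises three events; only the first carries the command ID.
batch = [("cmd-7", "ItemAdded"), (None, "StockReserved"), (None, "EmailQueued")]
append_batch(rows, head, batch)

try:
    append_batch(rows, head, batch)  # retry of the same command
    retry_rejected = False
except Conflict:
    retry_rejected = True
```

Here all three events count toward the stream version, yet only one extra SS-UID row is stored per command, and the retry leaves the partition unchanged.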

mhertis • Feb 27 '20 08:02

@mhertis For sure, they are just table entities. It's just the way Streamstone exposes EGT (entity group transactions). Nothing more.

yevhen • Feb 27 '20 12:02