NServiceBus.Persistence.Sql
Add support for attachments
Attachments are pieces of data a saga can access on demand (they are not loaded when the main saga data is loaded).
Attachments can be anything: a piece of data, an ID, a picture, a video.
The saga API should allow creating, modifying, and querying attachments through a stream API. Stream consumption and production are handled via callbacks so that the stream does not leak outside the API calls:
```csharp
// Read an attachment, passing in a func that processes the stream.
// If no attachment is found by this ID, returns null.
Attachment<MyValue> v = await context.ReadAttachment<MyValue>(
    "SomeID", dataStream => Deserialize<MyValue>(dataStream));
MyValue o = v.Data; // access the deserialized data
int version = v.Version; // access the optimistic concurrency version

// Serialize returns Task<Stream>, so each lambda below is a Func<Task<Stream>>
await context.AddAttachment("SomeOtherID", () => Serialize(o)); // Fails if the attachment already exists
await context.UpdateAttachment("SomeOtherID", version, () => Serialize(o)); // Fails if the attachment does not exist or the stored version does not match the parameter value

var list = await context.ListAttachmentIds();
```
Usage scenarios:
- Scatter/Gather. Each work item's state is stored as a separate attachment to avoid congestion on the saga state information. The work is completed when `ListAttachmentIds()` returns the expected number of attachments.
- Storing large data (e.g. photos, videos) that is used only in a small part of the saga logic.
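To make the proposed semantics concrete, here is a minimal in-memory sketch. It is not an implementation of the feature: `InMemoryAttachmentStore` and `Attachment<T>` are hypothetical types that only mirror the method names in the proposal, and the rule that a new attachment starts at version 1 and each update increments it is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical types: nothing here exists in NServiceBus.Persistence.Sql.
public sealed class Attachment<T>
{
    public Attachment(T data, int version) { Data = data; Version = version; }
    public T Data { get; }
    public int Version { get; }
}

public sealed class InMemoryAttachmentStore
{
    // Attachment ID -> (payload bytes, optimistic concurrency version).
    readonly Dictionary<string, (byte[] Bytes, int Version)> items =
        new Dictionary<string, (byte[] Bytes, int Version)>();
    readonly object gate = new object();

    // Returns null when no attachment exists for the ID.
    public Task<Attachment<T>> ReadAttachment<T>(string id, Func<Stream, T> read)
    {
        (byte[] Bytes, int Version) entry;
        lock (gate)
        {
            if (!items.TryGetValue(id, out entry))
                return Task.FromResult<Attachment<T>>(null);
        }
        // The stream is disposed as soon as the callback returns.
        using (var stream = new MemoryStream(entry.Bytes, writable: false))
        {
            return Task.FromResult(new Attachment<T>(read(stream), entry.Version));
        }
    }

    // Fails if the attachment already exists. Assumption: versions start at 1.
    public async Task AddAttachment(string id, Func<Task<Stream>> write)
    {
        var bytes = await Drain(write).ConfigureAwait(false);
        lock (gate)
        {
            if (items.ContainsKey(id))
                throw new InvalidOperationException($"Attachment '{id}' already exists.");
            items[id] = (bytes, 1);
        }
    }

    // Fails if the attachment is missing or the stored version differs.
    public async Task UpdateAttachment(string id, int expectedVersion, Func<Task<Stream>> write)
    {
        var bytes = await Drain(write).ConfigureAwait(false);
        lock (gate)
        {
            if (!items.TryGetValue(id, out var current) || current.Version != expectedVersion)
                throw new InvalidOperationException(
                    $"Attachment '{id}' is missing or its version is not {expectedVersion}.");
            items[id] = (bytes, expectedVersion + 1);
        }
    }

    public Task<IReadOnlyList<string>> ListAttachmentIds()
    {
        lock (gate)
        {
            return Task.FromResult<IReadOnlyList<string>>(items.Keys.ToList());
        }
    }

    // Runs the producer callback and copies its stream into a byte array.
    static async Task<byte[]> Drain(Func<Task<Stream>> write)
    {
        using (var source = await write().ConfigureAwait(false))
        using (var buffer = new MemoryStream())
        {
            await source.CopyToAsync(buffer).ConfigureAwait(false);
            return buffer.ToArray();
        }
    }
}
```

With a store like this, the scatter/gather check from the first scenario would reduce to comparing `(await store.ListAttachmentIds()).Count` with the expected number of work items.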
@Particular/sqlpersistence-maintainers please comment on the API proposal?
i dont think the writes would be async? wouldnt they queue up in memory and write at the end of the UOW?
in terms of the column type. string, binary or both?
minor, but `this.` won't work since it will be a context extension, not a saga extension.
@SimonCropp Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.
@SimonCropp not sure about the UoW semantics. What are pros/cons?
> not sure about the UoW semantics. What are pros/cons
- we can do a `Task.WhenAll` for all operations.
- no expectation of "after a write is done it can be read later in the code"
- just seemed more sensible to do all the writes in one location
happy to be dissuaded.
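The "queue up in memory and write at the end of the UoW" idea could look roughly like the sketch below. `DeferredAttachmentWrites` is a hypothetical helper, not part of NServiceBus; it only shows writes being registered synchronously and then started together and awaited with `Task.WhenAll` when the unit of work completes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: buffers attachment writes and runs them
// together when the unit of work completes.
public sealed class DeferredAttachmentWrites
{
    readonly List<Func<Task>> pending = new List<Func<Task>>();

    // Registering a write is synchronous; nothing touches storage yet.
    public void Enqueue(Func<Task> write) => pending.Add(write);

    public int PendingCount => pending.Count;

    // Intended to be called once, at the end of the unit of work.
    public async Task Flush()
    {
        // Start every queued write, then wait for all of them.
        var writes = pending.Select(start => start()).ToArray();
        pending.Clear();
        await Task.WhenAll(writes).ConfigureAwait(false);
    }
}
```

The trade-off is the one listed above: a handler cannot read back an attachment it queued earlier in the same unit of work, because nothing is written until `Flush` runs.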
> Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.
i was contemplating having a column for each. so u can pick. eg if u r serializing an instance put it in the text column, if u r adding an image put it in the binary column
Do we actually have usage scenarios for this? Any customers requested it? I've always thought that sagas are coordinators, rather than for moving data around. Won't we be encouraging people to create themselves more challenges down the road? or are we talking here more about transport-native databus, not necessarily related to saga?
> Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.
> i was contemplating having a column for each. so u can pick. eg if u r serializing an instance put it in the text column, if u r adding an image put it in the binary column
Could you use a few different columns of various types for one saga? Wouldn't that again encourage some undesired behaviours (e.g. putting too much logic into sagas)?
@weralabaj I've listed two real-world usage scenarios in the description. I don't have actual customer names though.
I agree with you that storing an image or a video in saga data might be a sign of the saga doing too much. What do you think, @SimonCropp?
As for the scatter/gather, I still believe this is a valid user scenario for a saga. But given you challenge the "store large binary blob" scenario, if we are left only with scatter/gather then attachments might not be the best way to address it.
I don't have enough real-life experience to judge if that's a good or bad idea, just pointing out we'll be encouraging it :) Do we need to consult with somebody? Udi?
@weralabaj any scenario where a saga is collecting multiple binary assets in different steps, for example adding multiple images as part of some kind of submission. Just because you want to update some other saga data properties, we want to avoid having to stream all the large assets from the db and then stream them back on save.
An example of a customer that may be able to find this useful - Australian Prime Minister & Cabinet, but really any agency that is driven by documents.
I am currently working on a system that converts Word/Excel/PDF documents to HTML and PDF (if it isn't already) for web presentation and printing using Aspose. When that's done I have to inspect a list of people that are interested in the change and send email notifications to them.
So we use sagas and handlers for managing the task of downloading the word document from SharePoint, doing the conversions, uploading the documents to the web host and a file system for printing and publishing, also placing copies of documents and their metadata in an EDRMS, then issuing the notifications.
We use SQL Transport and Persistence. Thanks for getting us off NHibernate, by the way; the native SQL stuff is so much better.
@petersgiles We have similar requirements to your project and are also using NSB with SQL transport. Is there a need to store the document in the saga? Or just access the document from the authoritative source when required to process it? Are you passing documents around via messages using the NSB DataBus?
@RayMartinshair If we are processing the same/similar artefact more than once. Further, it could save us issues around filesystem infrastructure.
So this is kind of how we would do it now.
But if attachments were available we might create a saga and manage the entire life cycle of the document. We could store multiple versions enabling roll back on conversions. We could do CRC checks to ensure changes have actually occurred. As well as any number of business rules in the processing of changes to the document.
Thanks @petersgiles :)
@petersgiles Why not a saga started by a Document Changed event? The file is downloaded and stored (you are doing this now), then a FileDownloadedEvent is emitted, with the converters handling this event. Is there a need to store converted versions? SharePoint stores document versions, so if you stored the version number of the document you are converting, wouldn't that allow rolling back by reconverting?
@petersgiles @RayMartinshair i think we should discuss this over beer ;)
@RayMartinshair you're assuming SharePoint is the best place to manage data authority during the document lifecycle. It's only required during editing/collaboration. Also, a 'document change' may be to document metadata, security access, etc., in which case the document would need to be stored again in the EDRMS, saving us the effort of download and conversion.
@SimonCropp yep good idea