NServiceBus.Persistence.Sql

Add support for attachments

Open SzymonPobiega opened this issue 8 years ago • 20 comments

Attachments are pieces of data that a saga can access on demand (they are not loaded when the main saga data is loaded).

Attachments can be anything: a piece of data, an ID, a picture, a video.

The saga API should allow creating, modifying, and querying attachments using a stream API. Stream consumption and production are handled via callbacks so that the stream does not leak outside of the API calls:

//Read attachment passing in a func for processing the stream. 
//If no attachment is found by this ID, return null
Attachment<MyValue> v = await context.ReadAttachment<MyValue>(
   "SomeID", dataStream => Deserialize<MyValue>(dataStream));

MyValue o = v.Data; //access the deserialized data
int version = v.Version; //access the optimistic concurrency version

//() => Serialize(o) is a Func<Task<Stream>>
await context.AddAttachment("SomeOtherID", () => Serialize(o)); //Fails if the attachment already exists

await context.UpdateAttachment("SomeOtherID", version, () => Serialize(o)); //Fails if the attachment does not exist or the stored version does not match the parameter value

var list = await context.ListAttachmentIds();

Usage scenarios:

  • Scatter/Gather. Each work item's state is stored as a separate attachment to avoid congestion on the saga state information. The work is completed when ListAttachmentIds() returns the expected number of attachments.
  • Storing large data (e.g. photos, videos) which is used only in a small part of the saga logic.
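
The semantics proposed above can be sketched with an in-memory stand-in. This is only an illustration under assumptions: InMemoryAttachmentStore and the shape of Attachment<T> are hypothetical, are not part of NServiceBus or the SQL persister, and the sketch is not thread-safe. The method names follow the proposal, using the ListAttachmentIds spelling from the usage scenarios.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical types for illustration; not an NServiceBus API.
public sealed class Attachment<T>
{
    public Attachment(T data, int version) { Data = data; Version = version; }
    public T Data { get; }
    public int Version { get; } // optimistic concurrency version
}

public sealed class InMemoryAttachmentStore
{
    // id -> (bytes, version). A real persister would more likely use a table
    // keyed by saga id and attachment id (assumption).
    readonly Dictionary<string, (byte[] Bytes, int Version)> items =
        new Dictionary<string, (byte[] Bytes, int Version)>();

    // Fails if the attachment already exists.
    public async Task AddAttachment(string id, Func<Task<Stream>> produce)
    {
        if (items.ContainsKey(id))
            throw new InvalidOperationException($"Attachment '{id}' already exists.");
        items[id] = (await ReadAll(produce), 1);
    }

    // Fails if the attachment is missing or the stored version differs.
    public async Task UpdateAttachment(string id, int expectedVersion, Func<Task<Stream>> produce)
    {
        if (!items.TryGetValue(id, out var current))
            throw new InvalidOperationException($"Attachment '{id}' does not exist.");
        if (current.Version != expectedVersion)
            throw new InvalidOperationException($"Version mismatch for '{id}'.");
        items[id] = (await ReadAll(produce), expectedVersion + 1);
    }

    // Returns null if no attachment is found; the stream never leaves the callback.
    public Task<Attachment<T>> ReadAttachment<T>(string id, Func<Stream, T> consume)
    {
        if (!items.TryGetValue(id, out var current))
            return Task.FromResult<Attachment<T>>(null);
        using (var stream = new MemoryStream(current.Bytes, writable: false))
        {
            return Task.FromResult(new Attachment<T>(consume(stream), current.Version));
        }
    }

    public Task<IReadOnlyList<string>> ListAttachmentIds() =>
        Task.FromResult<IReadOnlyList<string>>(items.Keys.ToList());

    // Drains the produced stream into a byte array and disposes it.
    static async Task<byte[]> ReadAll(Func<Task<Stream>> produce)
    {
        using (var source = await produce())
        using (var buffer = new MemoryStream())
        {
            await source.CopyToAsync(buffer);
            return buffer.ToArray();
        }
    }
}
```

For scatter/gather, a saga handler could add one attachment per completed work item and treat the work as done once (await store.ListAttachmentIds()).Count reaches the expected number of items.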

SzymonPobiega avatar Feb 16 '17 13:02 SzymonPobiega

@Particular/sqlpersistence-maintainers please comment on the API proposal?

SzymonPobiega avatar Feb 16 '17 13:02 SzymonPobiega

i don't think the writes would be async? wouldn't they queue up in memory and be written at the end of the UoW?

SimonCropp avatar Feb 17 '17 10:02 SimonCropp

in terms of the column type. string, binary or both?

SimonCropp avatar Feb 17 '17 10:02 SimonCropp

minor, but "this." won't work since it will be a context extension, not a saga extension.

SimonCropp avatar Feb 17 '17 10:02 SimonCropp

@SimonCropp Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.

SzymonPobiega avatar Feb 17 '17 11:02 SzymonPobiega

@SimonCropp not sure about the UoW semantics. What are pros/cons?

SzymonPobiega avatar Feb 17 '17 11:02 SzymonPobiega

not sure about the UoW semantics. What are pros/cons

  • we can do a Task.WhenAll for all operations.
  • no expectation of "after a write is done it can be read later in the code"
  • just seemed more sensible to do all the writes in one location

happy to be dissuaded.
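
A rough sketch of the queue-then-flush idea, under assumptions: PendingWrites is a hypothetical helper, not an NServiceBus type, and whether the flushed writes can really run concurrently depends on the underlying connection and transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public sealed class PendingWrites
{
    readonly List<Func<Task>> queued = new List<Func<Task>>();

    // Synchronous: the write is only recorded, storage is not touched yet.
    public void Enqueue(Func<Task> write) => queued.Add(write);

    public int Count => queued.Count;

    // Called once at the end of the unit of work: starts every queued
    // write and awaits them together with Task.WhenAll.
    public async Task Flush()
    {
        var writes = queued.ToArray();
        queued.Clear();
        await Task.WhenAll(writes.Select(write => write()));
    }
}
```

With this shape the add/update calls would not need to be async, but a read issued later in the same handler would not see a queued write until Flush has run.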

SimonCropp avatar Feb 17 '17 11:02 SimonCropp

Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.

i was contemplating having a column for each. so u can pick. eg if u r serializing an instance put it in the text column, if u r adding an image put it in the binary column

SimonCropp avatar Feb 17 '17 11:02 SimonCropp

Do we actually have usage scenarios for this? Any customers requested it? I've always thought that sagas are coordinators, rather than for moving data around. Won't we be encouraging people to create themselves more challenges down the road? or are we talking here more about transport-native databus, not necessarily related to saga?

weralabaj avatar Feb 20 '17 06:02 weralabaj

Depends on the expected usage. Having a choice complicates things. I would lean towards binary and sacrifice ability to browse the data in SSMS.

i was contemplating having a column for each. so u can pick. eg if u r serializing an instance put it in the text column, if u r adding an image put it in the binary column

Could you use a few different columns of various types for one saga? Wouldn't that again encourage some undesired behaviours (e.g. putting too much logic into sagas)?

weralabaj avatar Feb 20 '17 06:02 weralabaj

@weralabaj I've listed two real-world usage scenarios in the description. I don't have actual customer names though.

I agree with you that storing an image or a video in saga data might be a sign of the saga doing too much. What do you think @SimonCropp ?

As for the scatter/gather, I still believe this is a valid user scenario for a saga. But given you challenge the "store large binary blob" scenario, if we are left only with scatter/gather then attachments might not be the best way to address it.

SzymonPobiega avatar Feb 20 '17 08:02 SzymonPobiega

I don't have enough real-life experience to judge if that's a good or bad idea, just pointing out we'll be encouraging it :) Do we need to consult with somebody? Udi?

weralabaj avatar Feb 20 '17 09:02 weralabaj

@weralabaj any scenario where a saga is collecting multiple binary assets in different steps. for example adding multiple images as part of some kind of submission. just because you want to update some other saga data properties, we want to avoid having to stream all the large assets from the db and then stream them back on save

SimonCropp avatar Mar 02 '17 12:03 SimonCropp

An example of a customer that may be able to find this useful - Australian Prime Minister & Cabinet, but really any agency that is driven by documents.

I am currently working on a system that converts Word/Excel/PDF documents to HTML and PDF (if it isn't already) for web presentation and printing using Aspose. When that's done I have to inspect a list of people that are interested in the change and send email notifications to them.

So we use sagas and handlers for managing the task of downloading the word document from SharePoint, doing the conversions, uploading the documents to the web host and a file system for printing and publishing, also placing copies of documents and their metadata in an EDRMS, then issuing the notifications.

We use SQL Transport and Persistence. Thanks for getting us off NHibernate by the way, the native SQL stuff is so much better.

petersgiles avatar Mar 15 '17 04:03 petersgiles

@petersgiles We have similar requirements to your project and are also using NSB with SQL transport. Is there a need to store the document in the saga? Or just access the document from the authoritative source when required to process it? Are you passing documents around via messages using the NSB DataBus?

RayMartinshair avatar Mar 15 '17 09:03 RayMartinshair

@RayMartinshair If we are processing the same/similar artefact more than once. Further, it could save us issues around filesystem infrastructure.

So this is kind of how we would do it now.

[Image: diagram1]

But if attachments were available we might create a saga and manage the entire life cycle of the document. We could store multiple versions enabling roll back on conversions. We could do CRC checks to ensure changes have actually occurred. As well as any number of business rules in the processing of changes to the document.

petersgiles avatar Mar 15 '17 09:03 petersgiles

Thanks @petersgiles :)

weralabaj avatar Mar 15 '17 10:03 weralabaj

@petersgiles Why not a saga started by a Document Changed Event? The file is downloaded and stored (you are doing this now), then a FileDownloadedEvent is emitted, with the converters handling this event. Is there a need to store converted versions? SharePoint stores document versions, so if you stored the version number of the document you are converting, wouldn't that allow rollback by reconverting?

RayMartinshair avatar Mar 15 '17 10:03 RayMartinshair

@petersgiles @RayMartinshair i think we should discuss this over beer ;)

SimonCropp avatar Mar 15 '17 11:03 SimonCropp

@RayMartinshair you're assuming SharePoint is the best place to manage data authority during the document lifecycle. It's only required during editing/collaboration. Also a 'document change' may be to document metadata, or security access etc. In which case the document would need to be stored again in the EDRMS, and having it attached would save us the effort of download and conversion.

@SimonCropp yep good idea

petersgiles avatar Mar 15 '17 17:03 petersgiles