bionode-watermill

Integrate nodestream

Open thejmazz opened this issue 7 years ago • 2 comments

nodestream - Storage-agnostic streaming library for binary data transfers

This could be beneficial because it might take the "tee to a file" work off our hands, but more importantly, it provides storage-agnostic transfer to various cloud services. For now, we can just use it for the local filesystem.
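For reference, the "tee to a file" work mentioned above could look roughly like the sketch below, using only Node's built-in `stream` and `fs` modules. `teeToFile` is a hypothetical helper name, not something from nodestream or bionode-watermill, and the sketch ignores error handling.

```javascript
const fs = require('fs');
const { PassThrough } = require('stream');

// Hypothetical helper (not part of nodestream or bionode-watermill):
// forwards `source` unchanged on `out` while writing a copy to `filePath`.
function teeToFile(source, filePath) {
  const out = new PassThrough();
  const file = fs.createWriteStream(filePath);

  // A readable can be piped to several destinations; each chunk goes to both,
  // and the slower destination sets the pace through backpressure.
  source.pipe(file);
  source.pipe(out);

  return { out, file };
}
```

The caller keeps consuming `out` as the task's outgoing stream and can wait for the `finish` event on `file` to know the copy is on disk.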

Also need to consider how tasks produce output. If it is a program that takes an outputFile as a parameter, for example, do we need to create a read stream on that file as it is created to produce an outgoing stream of it?
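One simple option, sketched below as an assumption rather than a decision, is to wait for the program to exit and only then open a read stream on its output file. `runThenStream` is a hypothetical helper name; the sketch does not stream the file while it is still being written, which is the harder case the question is really about.

```javascript
const fs = require('fs');
const { spawn } = require('child_process');
const { once } = require('events');

// Hypothetical helper: run a program that writes `outputFile` itself,
// wait for it to exit, then expose the finished file as a readable stream.
async function runThenStream(command, args, outputFile) {
  const child = spawn(command, args, { stdio: 'ignore' });

  // events.once resolves with the 'exit' arguments [code, signal]
  // and rejects if the child emits 'error' (e.g. command not found).
  const [code] = await once(child, 'exit');
  if (code !== 0) {
    throw new Error(`${command} exited with code ${code}`);
  }

  return fs.createReadStream(outputFile);
}
```

The returned stream can then be piped onward like any other task output.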

Not sure if this is an enhancement or a feature, it is a bit of both. It is not entirely necessary for MVP, but very useful to have.

thejmazz avatar Apr 09 '17 18:04 thejmazz

Thanks @thejmazz, seems like an interesting project worth watching! Yes, we could use it to add more features (unified cloud storage) or enhance existing code (local file storage, transforms). So if we move forward I think we'll need more specific issues, but for now I think this is just a discussion. Maybe some of that discussion can happen on the Gitter channel, but at first glance, these are the questions I have:

  • How easy would it be to combine nodestream with other modules that are just regular Streams?
  • The examples shown seem to use promises, would that be a problem (e.g., make our code less Streamable)?

bmpvieira avatar Apr 10 '17 10:04 bmpvieira

Ah, did not realize that. If this is used just for uploading/downloading files before/after tasks, having a Promise API is not a problem (it can be like a stream that emits one chunk and finishes). But the underlying stream is there (with transforms), which would be nice to have access to, and we can handle the finish ourselves (though it also looks like there are some transformer-based results).
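To illustrate the "stream that emits one chunk and finishes" idea, here is a minimal sketch using Node's built-in `Readable.from`. `promiseToStream` is a hypothetical helper name, and the promise passed in stands for whatever a Promise-based upload or download call would return; nothing here is nodestream's actual API.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: wrap a promise as an object-mode stream that
// emits the resolved value as its single chunk and then ends.
// A rejection surfaces as an 'error' event on the stream.
// Note: the promise must not resolve to null, since Readable.from
// does not accept null values as chunks.
function promiseToStream(promise) {
  return Readable.from((async function* () {
    yield await promise;
  })());
}
```

Downstream code can then treat the result like any other stream, for example by listening for `data` and `end`, without caring that a promise sits underneath.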

thejmazz avatar Apr 10 '17 14:04 thejmazz