
added base changes to support s3

Open bhanuc opened this issue 3 years ago • 4 comments

Aim: add support for S3 and other S3-compatible storage layers (such as DigitalOcean Spaces). This was already discussed in #33.

This PR is aimed at getting initial feedback on the design and at gathering any requirements this project might have. Some places and methods still need to be updated.

Also, I have not tested it fully; I plan to write tests once I get some feedback on this.

bhanuc avatar Apr 26 '21 15:04 bhanuc

Well yes, the storage system was initially meant to support different providers like that.

Adding all these storage drivers into WildDuck, though, would pull in a large number of unneeded dependencies (e.g. someone using standard GridFS would still have to install the AWS SDK), so a better approach would be to treat these mechanisms as plugins.

So what would need to be different:

  • GridFSDriver would be baked in as the default
  • The storage handler would require either the built-in GridFS driver or an external dependency, based on configuration (e.g. const SelectedDriver = require('wildduck-s3-driver'))

The plugin's package.json would list WildDuck as a peer dependency:

{
  "name": "wildduck-s3-driver",
  "peerDependencies": {
    "wildduck": "1.33.2"
  }
}

So basically everything would work the same way you already did it, except that the S3 driver would be a separate plugin.
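
The configuration-based driver selection described above could look roughly like the sketch below. It is only an illustration: the `loadStorageDriver` helper, the `config.driver` setting and the stand-in `GridFSDriver` class are assumptions made for this example, not existing WildDuck code.

```javascript
'use strict';

// Hypothetical stand-in for the built-in GridFS driver (assumption for this sketch).
class GridFSDriver {
    constructor(options) {
        this.options = options || {};
        this.name = 'gridfs';
    }
}

/**
 * Resolve the storage driver class from configuration.
 * `config.driver` is an assumed setting name. `requireFn` is injectable
 * so the lookup can be exercised without installing a real plugin.
 */
function loadStorageDriver(config, requireFn = require) {
    const driverName = (config && config.driver) || 'gridfs';

    // The default driver ships with the core, so no extra dependency is needed.
    if (driverName === 'gridfs') {
        return GridFSDriver;
    }

    // Anything else is treated as an external plugin package name,
    // e.g. 'wildduck-s3-driver', and is loaded only when configured.
    try {
        return requireFn(driverName);
    } catch (err) {
        if (err.code === 'MODULE_NOT_FOUND') {
            throw new Error(`Storage driver "${driverName}" is not installed. Run: npm install ${driverName}`);
        }
        throw err;
    }
}

// Usage: GridFS users never touch the AWS SDK.
const SelectedDriver = loadStorageDriver({ driver: 'gridfs' });
const storage = new SelectedDriver({ bucket: 'attachments' });
```

With this layout the AWS SDK would be a dependency of the plugin only, so it gets installed just for deployments that set the driver to the S3 package.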

andris9 avatar Apr 30 '21 12:04 andris9

(Using a fixed peer dependency version number is not a good idea, though. Something like "1.x" would probably be better. Here's some more information)

andris9 avatar Apr 30 '21 12:04 andris9

Btw, this PR only covers draft composing storage, not actual attachments. The main differences between attachment storage and draft storage are:

  • Attachments are deduplicated, draft storage is not. This is not a hard requirement, though, as deduplication is handled by the driver. GridFS does this by using the SHA-256 hash of the attachment file as the ID. You could just as well generate a new ID for each attachment and ignore deduplication entirely.
  • Attachments are stored as decoded binary files. Again, this is not a hard requirement, but it keeps storage size down (a base64-encoded attachment takes at least 33% more storage than the decoded binary file).
  • Attachment content can be accessed partially. This is needed by IMAP, where clients tend to do partial message downloads. Clients sometimes request messages in small chunks, so you do not want to fetch a 20MB attachment just to return the requested 64kB part of it. The requested range refers to the original (encoded) form of the attachment, so if you have decoded the attachment into a binary file, you have to calculate the actual binary bounds that cover the needed part of the base64 content. This is tricky, but the current GridFS attachment storage already handles it and can be used as a reference.
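
As a rough illustration of the range calculation in the last point, the sketch below maps a byte range of the base64 form onto the binary range that has to be read from storage. It is not the existing GridFS implementation. It assumes the encoded form is base64 wrapped at 76 characters with CRLF after every line (including the last), and the names `encodeWrapped`, `binaryRangeFor`, `readEncodedRange` and the synchronous `readBinary(from, to)` callback are all made up for this example.

```javascript
'use strict';

// Assumptions for this sketch: 76-character base64 lines, each followed by CRLF.
const LINE_LEN = 76;
const BYTES_PER_LINE = (LINE_LEN / 4) * 3; // 57 decoded bytes per full line
const ENCODED_LINE = LINE_LEN + 2; // 78 encoded bytes per full line, CRLF included

// Reference encoder: the "original form" that IMAP ranges refer to.
function encodeWrapped(buf) {
    const b64 = buf.toString('base64');
    let out = '';
    for (let i = 0; i < b64.length; i += LINE_LEN) {
        out += b64.slice(i, i + LINE_LEN) + '\r\n';
    }
    return out;
}

// Map the encoded range [start, end) to the binary range [from, to) that
// covers it, aligned to whole lines. `skip` is the number of encoded bytes
// to drop from the front of the re-encoded chunk.
function binaryRangeFor(start, end, binarySize) {
    const firstLine = Math.floor(start / ENCODED_LINE);
    const lastLine = Math.floor((end - 1) / ENCODED_LINE);
    return {
        from: Math.min(firstLine * BYTES_PER_LINE, binarySize),
        to: Math.min((lastLine + 1) * BYTES_PER_LINE, binarySize),
        skip: start - firstLine * ENCODED_LINE
    };
}

// Read only the needed binary bytes, re-encode them and cut the exact slice.
// `readBinary(from, to)` stands for the driver's ranged read (hypothetical).
function readEncodedRange(readBinary, binarySize, start, end) {
    if (end <= start) {
        return '';
    }
    const { from, to, skip } = binaryRangeFor(start, end, binarySize);
    const chunk = readBinary(from, to);
    return encodeWrapped(chunk).slice(skip, skip + (end - start));
}

// Usage with an in-memory buffer standing in for the stored attachment.
const stored = Buffer.from(Array.from({ length: 200 }, (_, i) => i));
const readBinary = (from, to) => stored.subarray(from, to);
const part = readEncodedRange(readBinary, stored.length, 100, 164);
```

Aligning to whole lines keeps the arithmetic simple because 57 is a multiple of 3, so re-encoding a line-aligned chunk produces exactly the same characters as the corresponding part of the full encoding. A real driver would do the ranged read asynchronously, for example with a ranged download stream in GridFS or an HTTP `Range` request against S3-compatible storage.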

andris9 avatar Apr 30 '21 12:04 andris9

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 20 days.

github-actions[bot] avatar Feb 18 '24 01:02 github-actions[bot]

This PR was closed because it has been stalled for 20 days with no activity.

github-actions[bot] avatar Mar 13 '24 01:03 github-actions[bot]