unstorage icon indicating copy to clipboard operation
unstorage copied to clipboard

feat(drivers): webdav driver

Open shareable-vision opened this issue 2 years ago • 13 comments

🔗 Linked issue

Issue #275

❓ Type of change

  • [x] 📖 Documentation (updates to the documentation, readme, or JSdoc annotations)
  • [ ] 🐞 Bug fix (a non-breaking change that fixes an issue)
  • [ ] 👌 Enhancement (improving an existing functionality like performance)
  • [x] ✨ New feature (a non-breaking change that adds functionality)
  • [ ] 🧹 Chore (updates to the build process or auxiliary tools and libraries)
  • [ ] ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

📚 Description

Webdav driver for unstorage powered by the webdav NPM package.

📝 Checklist

  • [x] I have linked an issue or discussion.
  • [x] I have updated the documentation accordingly.

Considerations

The driver currently only implements read-only functions.

To-do

  • [ ] Implement setItem and removeItem, leveraging putFileContents and deleteFile

shareable-vision avatar Jul 23 '23 22:07 shareable-vision

Live Preview ready!

Name Edit Preview Latest Commit
unstorage Edit on Studio ↗︎ View Live Preview a01ef02ab91e19f5d89b471fea1df143c207a6f1

nuxt-studio[bot] avatar Jul 23 '23 22:07 nuxt-studio[bot]

Driver (read-only) is tested and working with Nuxt Content.

shareable-vision avatar Jul 24 '23 10:07 shareable-vision

Hi dear @shareable-vision i love the idea of supporting webdav as a driver!

Checking the webdav npm package used, it seems little bit overcomplex. I think we probably can directly implement a webdav client only using native fetch API.

An (old) example to inspire: https://github.com/ai-zelenin/webdav-client-fetch/blob/master/ts/index.ts

Do you think you can try to reimplement it with fetch only and no dependencies? 🙏🏼

pi0 avatar Jul 24 '23 14:07 pi0

Do you think you can try to reimplement it with fetch only and no dependencies? 🙏🏼

That's a great idea. I will work on it.

shareable-vision avatar Jul 24 '23 20:07 shareable-vision

WebDav server sends XML data. Although testing with NextCloud, I'm cross-referencing with other authorities and avoiding any namespaced properties, keeping it minimal and hopefully generally compatible.

Native fetch is great . .. but there is not a Node equivalent of browser-native DOMParser.

If we can use xml2js, then I'm almost done re-implementing read-only driver. Installed in an empty repo, xml2js clocks in at about 250kb. Or xpath.

Any idea for reliably parsing XML with no dependencies? —folks highly discourage parsing XML with regex. What do you think?

—I'm really glad you put me onto this!

shareable-vision avatar Jul 25 '23 02:07 shareable-vision

After taking time to consider, I don't think we should use a manual regex approach for parsing the XML for the sake of having zero dependencies, while I really appreciate that you got me to thinking about the efficiency of the package that we're leveraging.

WebDav is a standardization of XML —the parsing of the XML responses is key to the function of the driver, and it is reasonable to require that a user install an additional peer dependency.

fast-xml-parser looks like it will be a good fit.

shareable-vision avatar Jul 25 '23 20:07 shareable-vision

Hi dear @shareable-vision. Sorry for answering late and thanks for willing to continue working on this.

Nice point about XML parser lacking. Looking at webdav dependencies, they use fast-xml-parser i would recommend to try it.

Usng bundlephobia.com, fast-xml-parser is 25KB vs JSDom 2.4 MB.

We might find something even smaller but i think that's a good start.

pi0 avatar Jul 25 '23 20:07 pi0

Just pushed re-implementation of read-only version of the Webdav driver using 'fast-xml-parser'.

I will work on implementing setItem() and removeItem() next.

Will have to study Webdav resource locking.

shareable-vision avatar Jul 27 '23 04:07 shareable-vision

I want to work on this a little longer and make some improvements. One very major improvement that I see now will be to use the Depth header —currently implementation is recursively re-fetching subdirectories and awaiting a Promise.all. I tried the header with a faulty value ('Infinity') —now I see that a value of 'infinity' will eliminate the re-fetching and speed up getKeys() —specs say that server should support "infinity requests" —could implement re-fetching as a fall-back mechanism, but it's probably not necessary.

I'm also mulling over the advantages of polling. Will study the other drivers more about this . . I think where I'm landing is that it would be good to provide optional caching and polling, which are each disabled with a default value of 0.

WebDAV uses 'etag' to version files. When re-syncing does need to happen, the etag can be referenced to quickly validate the cache.

shareable-vision avatar Jul 27 '23 04:07 shareable-vision

Thanks for updates and nice work on this @shareable-vision

I have marked PR as draft in meantime please ping when felt ready

I will also check locally in coming days and try to help if fine for you 👍🏼

pi0 avatar Jul 27 '23 15:07 pi0

Perfect :+1:

Did implement Depth: "infinity" header as default with config option to fallback if unsupported —much better.

Will do write functions as best as I can without getting into the weeds about locked resources.

Will touch base soon.

shareable-vision avatar Jul 27 '23 22:07 shareable-vision

Almost ready for review. Open to feedback.

setItem() and removeItem() are implemented, honoring all WebDAV specs requirements.

setItem() will ensure that the target parent directory exists by making MKCOL requests on each non-existent ancestor directory. If 'type' key on opts is set to "directory", setItem() will make a directory on WebDAV storage instead of uploading a file.

WebDAV defines etag as a versioning mechanism that can be used for cache validation. When getKeys() updates meta for an item, if the etag changes, then the item.body is deleted, thus is no longer "cached".

Concerns:

Can optimize PUT request?: I'm not sure if the PUT requests may be optimized by using a ReadableStream. Probably because they are already utf8-encoded, this won't make any difference . .

What should be the desired behavior of the ttl key and caching? In my mind, the most optimal solution for WebDAV is for the consumer of the driver to receive a Webhook from the WebDAV service and then intentionally re-fetch keys/meta by calling getKeys(). Unstorage itself is the "cache". getItem() and getMeta() should immediately return available data, fetching only as needed. There should be no reason to repeatedly call getKeys() unless the consumer is expecting the values to have changed, or else are unsure and are polling the source.

My thinking previously was that an undefined ttl might signify ("Do not cache"), however another reasonable interpretation would be (TTL of zero or undefined => Always re-fetch), which would be the opposite of my intention.

Polling? I haven't implemented any polling (which might be configurable via an 'interval' key). I'm not sure if it's a good idea. It seems like the driver consumer can do that if they wish simply by re-calling getKeys() as needed. If polling were implemented, the consumer would have to provide a callback function to be called when source changes. Actually configuring an interval, or manually polling may be more efficient than using TTL —when polling, the fetching is happening in the background, not in the middle of a user's request.

shareable-vision avatar Jul 30 '23 01:07 shareable-vision

Driver is ready for review.

shareable-vision avatar Jul 30 '23 16:07 shareable-vision