WopiStorage: after downloading a file, check if it is a zip
For relevant file extensions.
Change-Id: If8b5a1c7057118adc9154b156bb5a82057a70fd4
- Resolves: #
- Target version: master
Summary
To help detect wopi server-side errors.
Checklist
- [x] I have run
make prettier-writeand formatted the code. - [x] All commits have Change-Id
- [x] I have run tests with
make check - [x] I have issued
make runand manually verified that everything looks okay - [ ] Documentation (manuals or wiki) has been updated or is not required
what does this solves?
what does this solves?
It does not solve anything. It warns of bad WOPI server implementation. This could be turned into a hard error, surfaced to the user, but I'd like to do it later.
You should file a follow-up issue to resurface the error. Having it in the log right now should help debugging though.
I was wondering if we couldn't check this before saving the response data onto disk. To avoid reopening to check the first few bytes. We should have enough info...
Thanks, @meven. Since the commit doesn't say why we need to do this, perhaps I'm missing the goal, but...
I thought it was pretty self-explanatory. If a Wopi implementation misbehaves/delivers unexpected content, warn the system admininstrator.
Doing some input check doesn't cost much runtime-wise, but can help detecting issues efficiently. That's just input validation.
We do support formats that aren't zipped (e.g. PDF), and there is no guarantee/requirement that all future documents will be zipped.
My patch applies only on type extensions that are explicitly zipped file formats.
Also, there is no expectation or requirement that the files will have an extension at all, or even the proper extension. We still support loading/editing them, regardless.
This is at least a strong heuristic, any user would be confused if the file extension does not match the file content and would trigger some negative on the EFSS should this happen.
It's also unclear that this error will help anyone. Troubleshooting the storage/host is best done with an HTTP client, like curl or wget. Otherwise, this logic will log an error every so often when there is nothing wrong (e.g. when the user loads a text file with the wrong extension, which we do support), it will also slow loading all documents with those extensions, and (worst of all?) it can throw exceptions and break things badly.
Perhaps I missed the point of the patch.
Except a WOPI host could have some random bugs, due to load for instance, that would prevent normal business. If we can first detect it, and then perhaps later have an explicit error, that would shorten the support phase. Now we can easily pinpoint the issue origin.
sha hash checking might be useful too.
this should be rebased.