Use random filenames or hashes for blobstorage
Recently there was an accident with a chatbot that replaced its avatar set from the command line with an unrelated avatar of a contact. Both the selfavatar setting and the contact avatar i param pointed to $BLOBDIR/avatar.png at the time it was detected. How this happened is unclear, but it is possible that avatar.png was removed, unmounted or otherwise not detected by the core, and the core stored avatar received from the contact as avatar.png, while selfavatar config still pointed to $BLOBDIR/avatar.png.
Such bugs are unavoidable even if the core itself has no bugs as we cannot rely on blobdir not reside on the faulty network filesystem, being incorrectly backed up and restored etc., so we should assume that files may be randomly removed. Then there may be dangling $BLOBDIR/... references in the database which may accidentally point to unrelated files, could even be an avatar.png file sent to the bot in private.
To prevent such bugs, we should choose a random filename for the blobdir objects or as a hash of the contents.
Related issue: #3817
There is an attempt to fix at #4555, but it does not consider the discussion at #3817, the issue closed in favor of this one.
As a first step I suggest adding Param::Filename in addition to Param::File, to store the original filename. This will be backwards-compatible because if Param::Filename is missing, we can still extract the filename from Param::File which stores the filesystem path to the blob. This Param::Filename should be used as an attachment filename in sent emails so random suffix is not sent over the network.
As a second step we need new APIs to avoid leaking the blob path to the UI, I have drafted them at https://github.com/deltachat/deltachat-core-rust/issues/3817#issuecomment-1471744460
Once blob paths are not visible in the non-deprecated API anymore and are not sent over the network, we can merge #4555 or even use completely random filenames.
...This
Param::Filenameshould be used as an attachment filename in sent emails so random suffix is not sent over the network.
Will the attachment filename be also encrypted to not leak the file name?
On Wed, Jul 19, 2023 at 14:37 -0700, dumblob wrote:
Will the attachment filename be also encrypted to not leak the file name?
yes, typically the mime attachment is fully contained in the encrypted part, including the filename.
This will also allow us to get rid of the sanitize-filename dependency.
There's also a old issue (which I can't find now) that's about preserving the original filename in an attachment, when a name is suggested to save it. If working on this it makes sense to ensure that also works.
There's also a old issue (which I can't find now) that's about preserving the original filename in an attachment, when a name is suggested to save it. If working on this it makes sense to ensure that also works.
There is a recently merged PR doing this: https://github.com/deltachat/deltachat-core-rust/pull/4555. But UI should start using dc_msg_get_filename() for the "Save as" dialog then. And also we need API for opening files in the blobstorage that returns a path to a file with original name, probably in some tmp dir.
The next steps are:
- Merge #4583, it adds
dc_msg_save_file()+ random filename suffixes in the blobstorage. - Change dc_msg_get_file(), see https://github.com/deltachat/deltachat-core-rust/issues/3817#issuecomment-1471744460:
dc_msg_get_fileAPI should create a temporary directory inside the blobdir, save a copy with unmangled filename, and return its path. In the housekeeping we regularly clear these directories. This API is kept for compatibility, but deprecated. - Add some new API like
dc_msg_get_filedata_path()that avoids unnecessary copying if an app needs just to open a file. - Save files in the blobstorage with completely random filenames.
- Switch the apps to these new APIs.
We already have dc_msg_get_filename() so UIs can use it for the "Save as" dialog.
EDIT: Currently everything is done in the mentioned PR except "Switch the apps to these new APIs.".