deltachat-core-rust

don't lowercase blob extensions

Open dotlambda opened this issue 3 years ago • 2 comments

Some programs require specific casing, e.g. .Rmd.

fixes https://github.com/deltachat/deltachat-desktop/issues/2685

cc @dumblob

dotlambda avatar Apr 16 '22 20:04 dotlambda

haven't looked at the code involved for a while, but conceptually i'd expect that we keep and re-use the "display" filename when receiving attachments and use those original display names when sending them out. Normalizing the names for use as database keys, or for determining on-disk filenames, is an orthogonal matter. For the purposes of @dotlambda's PR at hand here, i agree with @r10s that making ".R" and ".Rmd" exceptions when lower-casing might be an option to remedy the immediate issue.
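A minimal sketch of the separation described above, assuming a hypothetical `Attachment` record (not a type from deltachat-core-rust): the name as received is kept untouched for display and sending, while a lower-cased copy serves only as a lookup key.

```rust
/// Hypothetical record for illustration; not part of deltachat-core-rust.
#[derive(Debug, Clone, PartialEq)]
struct Attachment {
    /// Name exactly as received; shown to the user and reused when sending.
    display_name: String,
    /// Normalized form, used only as a lookup key or internal file name.
    storage_key: String,
}

impl Attachment {
    fn new(original: &str) -> Self {
        Attachment {
            display_name: original.to_string(),
            // Lower-casing affects the key only, never the display name.
            storage_key: original.to_lowercase(),
        }
    }
}
```

With this split, `Attachment::new("analysis.Rmd")` keeps `analysis.Rmd` for the recipient even though the key is lower-cased.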

hpk42 avatar Apr 17 '22 10:04 hpk42

> i agree with @r10s that making ".R" and ".Rmd" into exceptions when lower-casing might be an option to remedy the immediate issue at hand.

Let me note that now that I (and some of my friends) know about this issue in DC, we actually do not need any urgent fix. So I'd prefer if this got solved properly without workarounds despite the fix taking (much) more time to implement.

dumblob avatar Apr 17 '22 12:04 dumblob

Friendly ping if there is anything new going on here :wink:.

dumblob avatar Nov 07 '22 15:11 dumblob

@r10s how are attachments even exposed on the C API? I guess this is really where this matters; internally it matters much less. I had a quick look but couldn't find this in dc_msg_t.

flub avatar Nov 09 '22 20:11 flub

Personally I'd be in favour of preserving original file names entirely. Most MUAs behave this way, and OSes have always had to deal with whatever crazy filenames come out of this. Most people will have files named .JPG and .JPEG somewhere on their computer or device. I'm not sure deltachat needs to have an opinion on what is valid. It can just pass along some bytes.

That of course is slightly more work, but there is already the context of potentially removing the blobdir in favour of storing blobs in some other storage mechanism.

flub avatar Nov 09 '22 20:11 flub

> how are attachments even exposed on the C API?

dc_msg_get_file(), dc_chat_get_profile_image(), dc_get_config(selfavatar) are the ones for reading blob files.

EDIT: leaving the name completely as is, however, would only be doable if we store things not directly in the file system; something that may come along with iroh. however, this is a bit further away and needs some more things to consider. so, maybe for now, choose the most pragmatic approach that does not make things more difficult in the future.

so, yes, maybe not lowercasing as suggested in this pr is okay. however, more places then need adapting and require case-insensitive comparison. some of these places are mentioned above; at least these should be targeted before merging, and other potential places should be checked.
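As a rough illustration of such a comparison, the following sketch checks an extension without touching the stored name. The helper `has_extension_ignore_case` is hypothetical and not taken from the core; it only uses standard-library calls.

```rust
use std::path::Path;

/// Hypothetical helper: returns true if `path` has the extension `wanted`,
/// ignoring ASCII case. The file name itself is never modified.
fn has_extension_ignore_case(path: &Path, wanted: &str) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case(wanted))
}
```

A caller that previously relied on lower-cased names could then ask `has_extension_ignore_case(Path::new("photo.JPG"), "jpg")` instead of comparing strings directly.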

for using blobs via ffi or jsonrpc: i think that this is not a huge issue, i agree that most instances processing the files further will deal with the weird name variants.

r10s avatar Nov 09 '22 21:11 r10s

This is probably not the best place to ask - but why let this functionality depend on IPFS/iroh (now or in the future)? Or have I misunderstood?

I mean, the IPFS protocol is known to be bandwidth-hungry (a significant issue in the mobile world), rather high-latency, and actually has other downsides (NAT traversal is still not mature as we speak, etc.) compared to e.g. the faster and generally better Hypercore (which is also said to have a slightly nicer API).

dumblob avatar Nov 10 '22 09:11 dumblob

> This is probably not the best place to ask - but why let this functionality depend on IPFS/iroh (now or in the future)? Or have I misunderstood?

This issue is not about IPFS, so it's indeed better discussed elsewhere. At this point, there are only experiments like https://github.com/deltachat/deltachat-core-rust/pull/3489 to introduce parts of libp2p and IPFS for multi-device setup. Iroh is a Rust implementation that performs a lot better. As far as i know there is no fully working Hypercore Rust implementation, so it's not an option (feel free to investigate). However, it's really better to discuss this outside of this PR, on community channels. Hope that clarifies a bit.

hpk42 avatar Dec 04 '22 20:12 hpk42

> EDIT: leaving the name completely as is, however, would only be doable if we store things not directly in the file system;

Managing blob metadata in the database would help keep original filenames. The filenames of blob data could stay exactly as they are now, as internal sanitized names. By itself, it's currently probably not a high-prio refactoring. However, keeping blob metadata would also help with deduplication. On my chat account setup, deduplication would single-handedly save 11% of storage, just by having multiple rows in a hypothetical blobmeta table point to the same (however named) blob-content file.
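A small in-memory sketch of that idea, under stated assumptions: `BlobStore`, `BlobMeta` and the `blob-N.bin` naming are made up for illustration, and `DefaultHasher` stands in for the cryptographic content hash a real store would need. Each metadata row keeps the original filename, while identical content maps to a single internal blob file.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical metadata row: original name plus a pointer to the blob file.
struct BlobMeta {
    original_name: String,
    blob_file: String,
}

/// Hypothetical store; stands in for a `blobmeta` table plus the blob directory.
#[derive(Default)]
struct BlobStore {
    /// Content hash -> internal, sanitized blob file name.
    by_content: HashMap<u64, String>,
    /// One row per attachment, possibly several per blob file.
    meta: Vec<BlobMeta>,
}

impl BlobStore {
    /// Records an attachment and returns the internal blob file it points to.
    fn add(&mut self, original_name: &str, content: &[u8]) -> String {
        // Assumption: a non-cryptographic hash is good enough for a sketch.
        // A real store would use a cryptographic hash to rule out collisions.
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let digest = hasher.finish();

        // Reuse the existing blob file if this content was seen before.
        let next_id = self.by_content.len();
        let blob_file = self
            .by_content
            .entry(digest)
            .or_insert_with(|| format!("blob-{}.bin", next_id))
            .clone();

        self.meta.push(BlobMeta {
            original_name: original_name.to_string(),
            blob_file: blob_file.clone(),
        });
        blob_file
    }

    /// Number of distinct blob files actually stored.
    fn stored_files(&self) -> usize {
        self.by_content.len()
    }
}
```

Adding the same bytes under two different names yields two metadata rows but only one stored file, which is where the storage saving would come from.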

hpk42 avatar Dec 04 '22 20:12 hpk42

ok, so i created this issue https://github.com/deltachat/deltachat-core-rust/issues/3817 and am closing this PR.

hpk42 avatar Dec 06 '22 22:12 hpk42

@dotlambda see also my comment if you'd like to address this in the shorter term. I indeed agree this doesn't need to wait on a storage rewrite.

flub avatar Dec 08 '22 13:12 flub