offlineimap
changing nametrans on a local folder invalidates all uids.
Local Maildir folders use the visiblename for computing their md5 hash. See here. In case there is a local nametrans, this is the folder name translated to remote, but with local separator.
This md5 hash is embedded in the maildir files. Additionally, when copying messages to a remote, it is checked whether the message md5 coincides with the actual folder md5. If not, the local uid is made negative, which means the message will get reuploaded, and assigned a new uid on the remote. See this.
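The check described above can be sketched roughly as follows. This is only an illustration, not offlineimap's actual code: it assumes maildir filenames carry `,U=<uid>,FMD5=<hash>` fields, and the names `folder_md5` and `effective_uid` are made up for this example.

```python
import hashlib
import re

# Assumed filename layout: "...,U=<uid>,FMD5=<32 hex chars>:2,<flags>"
FILENAME_RE = re.compile(r",U=(?P<uid>\d+),FMD5=(?P<fmd5>[0-9a-f]{32})")


def folder_md5(visible_name: str) -> str:
    """Hash of the folder's visible name (hypothetical helper)."""
    return hashlib.md5(visible_name.encode("utf-8")).hexdigest()


def effective_uid(filename: str, visible_name: str, next_negative_uid: int) -> int:
    """Return the stored UID, or a negative placeholder when the FMD5 does not match.

    A negative UID stands for "not known on the remote yet", so the message
    would be uploaded and given a fresh UID.
    """
    match = FILENAME_RE.search(filename)
    if match is None or match.group("fmd5") != folder_md5(visible_name):
        return next_negative_uid
    return int(match.group("uid"))


name = "1450000000_0.1234.host,U=42,FMD5=" + folder_md5("INBOX") + ":2,S"
uid_same_folder = effective_uid(name, "INBOX", -1)     # hash matches, UID kept
uid_other_folder = effective_uid(name, "Archive", -1)  # hash differs, UID dropped
```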
Since this has to do with uid validity, which is a remote thing, it probably makes sense to use the md5 hash of the remote folder name.
However, it has unexpected consequences. Changing the local nametrans will invalidate all uids and reupload all messages. This might happen when a user with a partial nametrans only on the remote adds a local one to have back&forth consistency. The fact that this invalidates all uids is quite unexpected.
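To make the effect concrete, here is a small sketch of how adding a local nametrans changes the hashed name. It assumes the hash is taken over the translated folder name, as described above; the helper names and the single-entry mapping are hypothetical. In this simplified model, every folder whose translated name changes gets a new hash, so all of its messages would be treated as new.

```python
import hashlib


def fmd5(name: str) -> str:
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def no_nametrans(folder: str) -> str:
    # Before: no local nametrans, the name is used as-is.
    return folder


def added_local_nametrans(folder: str) -> str:
    # After: the user adds a local nametrans (hypothetical mapping).
    return {"Archive": "[Gmail]/All Mail"}.get(folder, folder)


def folder_hashes(folders, nametrans):
    return {folder: fmd5(nametrans(folder)) for folder in folders}


folders = ["INBOX", "Archive"]
before = folder_hashes(folders, no_nametrans)
after = folder_hashes(folders, added_local_nametrans)

# Folders whose stored FMD5 no longer matches after the config change.
invalidated = [f for f in folders if before[f] != after[f]]
```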
I'm not sure what the options are here. Since apparently there has already been an md5 hash migration, we could probably do another one and transition the hashes to the local folder name.
Oh, that migration thing was precisely to move md5 sums to a nametrans-ed folder name, in order to avoid this exact problem. See c84d23b.
I fully agree with you. The MD5 migration option was introduced by @iliastsi because the FMD5 was unexpectedly changed by introducing another feature. I bet this MD5 will hurt again in the future.
At first, we might think that relying on anything other than the raw (fully qualified) remote folder name to compute the hash is wrong. I agree with this, too.
However, we could go further. AFAICT, the FMD5 was introduced to allow copying/moving emails locally without having to update the filename. Notice this is a safeguard that makes sense for Offlineimap only.
If we look at the bigger picture, I'd say that the FMD5 is always wrong. IMAP provides the UIDVALIDITY for the folders and this is what Offlineimap should use. I'm not convinced we should try to be smarter than what IMAP provides in this area.
> If we look at the bigger picture, I'd say that the FMD5 is always wrong. IMAP provides the UIDVALIDITY for the folders and this is what Offlineimap should use. I'm not convinced we should try to be smarter than what IMAP provides in this area.
I'm not sure the uidvalidity would be enough. See https://tools.ietf.org/html/rfc3501 section 2.3.1.1
- The combination of mailbox name, UIDVALIDITY, and UID must refer to a single immutable message on that server forever.
The uidvalidity, uid pair is guaranteed to be unique forever in a given folder. It says nothing about other folders in that server.
> The uidvalidity, uid pair is guaranteed to be unique forever in a given folder. It says nothing about other folders in that server.
Right.
If we change the FMD5 meaning (which would make sense), I think we should start versioning the filename.
> If we change the FMD5 meaning (which would make sense), I think we should start versioning the filename.
Which new meaning do you refer to? I'm not really sure we should change the FMD5 again.
If we don't, people who change the local nametrans will get bitten, and we need to address that. If we do, people currently using a local nametrans will get bitten, and we will need to address that. I guess we should do whatever will cause less pain overall.
What do you mean by versioning? Add a version number for the "filename format"? If we end up changing the filename format again, maybe we should do that, yes... but let's make sure we need to change it first, since it is quite an invasive thing to do. I'm sure there are people out there relying on details of the file format, etc.
> What do you mean by versioning? Add a version number for the "filename format"? If we end up changing the filename format again, maybe we should do that, yes... but let's make sure we need to change it first, since it is quite an invasive thing to do. I'm sure there are people out there relying on details of the file format, etc.
Yes, I meant using something like V001,FMD5... in the filename.
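A versioned filename could be handled along these lines. The exact layout was not settled in this thread, so the `,V<nnn>,U=<uid>,FMD5=<hash>` form used here is purely an assumption for illustration, as is the `parse_filename` helper. Filenames without a version field are reported as version 0.

```python
import re

HEX32 = r"[0-9a-f]{32}"
# Hypothetical versioned layout: ",V001,U=<uid>,FMD5=<hash>"
VERSIONED_RE = re.compile(
    r",V(?P<version>\d{3}),U=(?P<uid>\d+),FMD5=(?P<fmd5>" + HEX32 + ")"
)
# Layout without a version field: ",U=<uid>,FMD5=<hash>"
LEGACY_RE = re.compile(r",U=(?P<uid>\d+),FMD5=(?P<fmd5>" + HEX32 + ")")


def parse_filename(filename: str):
    """Return (version, uid, fmd5), or None if no UID/FMD5 fields are found."""
    # Try the versioned form first: the legacy pattern would also match
    # the tail of a versioned filename.
    match = VERSIONED_RE.search(filename)
    if match:
        return int(match.group("version")), int(match.group("uid")), match.group("fmd5")
    match = LEGACY_RE.search(filename)
    if match:
        return 0, int(match.group("uid")), match.group("fmd5")
    return None


digest = "0" * 32  # placeholder hash for the example
legacy = parse_filename("1450000000_0.1.host,U=7,FMD5=" + digest + ":2,S")
versioned = parse_filename("1450000000_0.1.host,V001,U=7,FMD5=" + digest + ":2,S")
```

Knowing the version, a migration step could decide how to interpret (or recompute) the stored hash instead of silently treating every message as new.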
I do believe the best FMD5 is the remote fully qualified folder name. If the Maildir nametrans is added/changed, it must match the remote name. Actually, the FMD5 should be nametrans independent.
> I do believe the best FMD5 is the remote fully qualified folder name.
Agreed, in principle. In practice, I'm not convinced it is worth the pain. In a sense, it is the worst possible change, since we would break everybody's maildir, for comparatively little benefit. It concerns only people playing tricks with multiple remotes on a single local maildir, and only in some scenarios. Not the case of migration from one remote to another which is empty, for example.
> If the Maildir nametrans is added/changed, it must match the remote name.
Do you mean enforcing some constraint on the local nametrans? The only thing that occurs to me is the strict back&forth constraint we talked about earlier.
That would tie the local nametrans to the remote and make it stable, but we still have a transition period in which people would change the local nametrans to comply with the constraint.
> Actually, the FMD5 should be nametrans independent.
Agreed in principle. However, I don't think we can do it and use remote folder names at the same time. How would we determine the folder name on the remote then? We only have the nametrans, or the assumption that it is the same in case of no nametrans.
> Agreed, in principle. In practice, I'm not convinced it is worth the pain. In a sense, it is the worst possible change, since we would break everybody's maildir, for comparatively little benefit. It concerns only people playing tricks with multiple remotes on a single local maildir, and only in some scenarios. Not the case of migration from one remote to another which is empty, for example.
I agree. If we go down this road, we have to handle the migration seamlessly.
> > If the Maildir nametrans is added/changed, it must match the remote name.
> Do you mean enforcing some constraint on the local nametrans? The only thing that occurs to me is the strict back&forth constraint we talked about earlier.
> That would tie the local nametrans to the remote and make it stable, but we still have a transition period in which people would change the local nametrans to comply with the constraint.
There are two different constraints:
- Force the users to have reversed nametrans. I'm not in favor of this.
- Force the reversed nametrans to be correct regarding the remote nametrans. That's the best IMHO.
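The second constraint could be checked with a round trip over the known remote folder names. This is a sketch under assumptions: `roundtrip_failures` is a made-up helper, the mappings are example data, and a missing local nametrans is simply accepted.

```python
def roundtrip_failures(remote_folders, remote_nametrans, local_nametrans=None):
    """Return remote names that do not survive remote -> local -> remote.

    No local nametrans means nothing to check, so the result is empty.
    """
    if local_nametrans is None:
        return []
    return [
        name
        for name in remote_folders
        if local_nametrans(remote_nametrans(name)) != name
    ]


# Example data only; these mappings are not taken from any real configuration.
remote_map = {"[Gmail]/All Mail": "Archive", "[Gmail]/Trash": "Trash"}
good_local_map = {"Archive": "[Gmail]/All Mail", "Trash": "[Gmail]/Trash"}
bad_local_map = {"Archive": "[Gmail]/Archive"}  # deliberately inconsistent


def remote_nametrans(folder):
    return remote_map.get(folder, folder)


def good_local(folder):
    return good_local_map.get(folder, folder)


def bad_local(folder):
    return bad_local_map.get(folder, folder)


folders = ["INBOX", "[Gmail]/All Mail", "[Gmail]/Trash"]
consistent = roundtrip_failures(folders, remote_nametrans, good_local)
inconsistent = roundtrip_failures(folders, remote_nametrans, bad_local)
unconstrained = roundtrip_failures(folders, remote_nametrans)
```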
> > Actually, the FMD5 should be nametrans independent.
> Agreed in principle. However, I don't think we can do it and use remote folder names at the same time. How would we determine the folder name on the remote then? We only have the nametrans, or the assumption that it is the same in case of no nametrans.
What about getting rid of the whole FMD5 thing? If I'm right that the FMD5 was introduced to detect copy/move of local emails, assigning any unique number to each local folder would do the trick.
FMD5 was about detecting "foreign" emails: 97f39b5ea8b6da53f4fef53fe4b56544f1713ee4
> Force the reversed nametrans to be correct regarding the remote nametrans. That's the best IMHO.
You mean, constrain it if it exists, but allow for no local nametrans, right? Then I agree this would be best.
Regarding the reason for the FMD5 thing. Hmm... Imagine there are two folders on the server, with independent uids running in parallel. The uidvalidity has nothing to say here, since it only concerns a single folder for eternity. So, there may be different messages X on folder A, and Y on folder B, with the same uid and uidvalidity.
Then I assumed the fmd5 entered the game so that if the user moves X to folder B, the local maildir notices non-matching FMD5 and reports a negative uid, so that the message is uploaded on folder B in the remote, and assigned a new uid. Without this, offlineimap would have seen local X with the same uid as remote Y in folder B, and done nothing.
Is this picture correct?
If this is right, then "folder name on the remote" is the correct choice for the FMD5. We may not be able to do it in practice due to missing local nametrans, so we might think about a more practical, less correct variant, like using an md5 hash of the local folder, which was the state of affairs before inadvertently changing the fmd5 meaning. Or we could use whatever id we can assign to a folder locally, but I see no advantages of an ID over a local folder hash.
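The scenario with messages X and Y can be played through in a toy model. This is not how offlineimap is implemented; `LocalMessage` and `messages_to_upload` are hypothetical, and the sync logic is reduced to "upload anything the remote does not appear to have".

```python
import hashlib
from typing import NamedTuple


class LocalMessage(NamedTuple):
    uid: int
    fmd5: str  # hash stored in the maildir filename
    body: str


def fmd5(name: str) -> str:
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def messages_to_upload(folder_name, local_messages, remote_uids, check_fmd5=True):
    """Return bodies of local messages that would be uploaded to the remote folder."""
    expected = fmd5(folder_name)
    uploads = []
    for msg in local_messages:
        # A message whose hash belongs to another folder is treated as new.
        foreign = check_fmd5 and msg.fmd5 != expected
        if foreign or msg.uid not in remote_uids:
            uploads.append(msg.body)
    return uploads


x = LocalMessage(uid=7, fmd5=fmd5("A"), body="X")  # originally synced in folder A
y = LocalMessage(uid=7, fmd5=fmd5("B"), body="Y")  # originally synced in folder B

local_folder_b = [y, x]  # the user moved X's file into folder B
remote_b_uids = {7}      # the remote folder B only holds Y, under uid 7

with_check = messages_to_upload("B", local_folder_b, remote_b_uids)
without_check = messages_to_upload("B", local_folder_b, remote_b_uids, check_fmd5=False)
```

With the check, X is detected as foreign and uploaded; without it, X hides behind Y's uid and nothing happens.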
> You mean, constrain it if it exists, but allow for no local nametrans, right?
Yes.
> Then I agree this would be best.
> Regarding the reason for the FMD5 thing. Hmm... Imagine there are two folders on the server, with independent uids running in parallel. The uidvalidity has nothing to say here, since it only concerns a single folder for eternity. So, there may be different messages X on folder A, and Y on folder B, with the same uid and uidvalidity.
You were right to say that the UIDVALIDITY is not guaranteed to be distinct across folders. I've discarded this option since then.
> Then I assumed the fmd5 entered the game so that if the user moves X to folder B, the local maildir notices non-matching FMD5 and reports a negative uid, so that the message is uploaded on folder B in the remote, and assigned a new uid. Without this, offlineimap would have seen local X with the same uid as remote Y in folder B, and done nothing.
> Is this picture correct?
Yes.
> If this is right, then "folder name on the remote" is the correct choice for the FMD5. We may not be able to do it in practice due to missing local nametrans, so we might think about a more practical, less correct variant, like using an md5 hash of the local folder, which was the state of affairs before inadvertently changing the fmd5 meaning. Or we could use whatever id we can assign to a folder locally, but I see no advantages of an ID over a local folder hash.
Hashing the remote folder name would not be wrong. I do agree this would be better than the local folder name. Notice this would still prevent easy remapping of the local folder to the remote on remote folder rename.
Most importantly, as soon as the name of whatever is used for the hash changes (sometimes due to the user, sometimes due to code changes), the users are exposed to unexpected FMD5 changes. This is why I think the best is to assign our own unique folder ID.
Actually, this is something I have in mind for imapfw: store some ID on a per local folder basis for the local side. Something like:
$ cat path/to/maildir/foldername/.offlineimap
FOLDER_ID: 1234 # Guaranteed uniq over folders and over maildirs.
Is this worth the change for offlineimap? I don't know.
Forgot to say that the last option I'm talking about would fix the current issue.
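A per-folder ID file along the lines of the snippet above might be handled like this. It is a sketch, not a proposal for the actual format: only the `.offlineimap` file name and the `FOLDER_ID` key come from the snippet, while `get_folder_id` and the use of a random UUID are assumptions (a UUID makes collisions negligible rather than strictly impossible).

```python
import tempfile
import uuid
from pathlib import Path

ID_FILE = ".offlineimap"  # file name taken from the snippet above


def get_folder_id(folder_path: Path) -> str:
    """Return the folder's stored ID, creating one on first use."""
    id_file = folder_path / ID_FILE
    if id_file.exists():
        for line in id_file.read_text().splitlines():
            key, _, value = line.partition(":")
            if key.strip() == "FOLDER_ID":
                # Ignore a trailing "# ..." comment on the line.
                return value.split("#", 1)[0].strip()
    folder_id = uuid.uuid4().hex
    id_file.write_text("FOLDER_ID: " + folder_id + "\n")
    return folder_id


# The ID follows the folder across a local rename, unlike a name-based hash.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "Archive"
    folder.mkdir()
    id_before = get_folder_id(folder)

    renamed = Path(root) / "All Mail"
    folder.rename(renamed)
    id_after = get_folder_id(renamed)
```

Because the ID survives a local rename, a local rename alone would no longer trigger a reupload; whether that is desirable is exactly the question discussed in the following comments.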
> Most importantly, as soon as the name of whatever is used for the hash changes (sometimes due to the user, sometimes due to code changes), the users are exposed to unexpected FMD5 changes. This is why I think the best is to assign our own unique folder ID.
Ah! I see.
It depends on how imap servers handle folder renames. If a folder rename can invalidate uids, then our own folder ID would defeat the purpose of the fmd5.
And I think folder renames on imap can invalidate uids. Let's say we rename a folder to a destination name that coincides with an old deleted folder. Does the uidvalidity eternal promise still stand on this resurrected folder? If it does, then the old uids are invalidated for sure.
> It depends on how imap servers handle folder renames. If a folder rename can invalidate uids, then our own folder ID would defeat the purpose of the fmd5.
> And I think folder renames on imap can invalidate uids. Let's say we rename a folder to a destination name that coincides with an old deleted folder. Does the uidvalidity eternal promise still stand on this resurrected folder? If it does, then the old uids are invalidated for sure.
Remote folder renaming is one of the reasons why having the FMD5 tied to the remote name might not be ideal. Since the purpose of the FMD5 is NOT to allow easy remapping, I'm not considering this use case alone. ;-)
I observe that the FMD5 is prone to changes because of how it is built. This is not the first time the FMD5 hurts when it should not. E.g. the --migrate-fmd5-using-nametrans CLI option written by @iliastsi is there only because we are having a hard time ensuring FMD5 consistency over time. This bug report is a good example of why the current strategy hurts, too.
Hence the idea of using a completely new strategy that does not rely on the folder names (local or remote) to correctly handle local copy/move of emails.
I would say the mission of the FMD5 is to invalidate uids if you do something locally that can cause uid clashes on the remote and lead to a fake synced state (offlineimap thinks it is synced, but some uids refer to different messages on the local and remote).
Doing a folder rename locally is one such thing (I believe). The target folder name may exist on the remote, and uids may collide.
I agree with you when you say that we should be able to handle folder renaming independently of the FMD5. What I'm saying is that I don't think we can, and this is a shortcoming of IMAP, in the end. If IMAP had a notion of folder ID, stable across folder renames, we could use that, and be happy!
I just encountered this issue myself today.
I have the following "local nametrans"
nametrans = lambda folder: {'Archive': '[Gmail]/All Mail',
                            'Drafts': '[Gmail]/Drafts',
                            'Sent': '[Gmail]/Sent Mail',
                            'Spam': '[Gmail]/Spam',
                            'Trash': '[Gmail]/Trash',
                            }.get(folder, folder)
... and the following "remote nametrans"
nametrans = lambda folder: {'[Gmail]/All Mail': 'Archive',
                            '[Gmail]/Drafts': 'Drafts',
                            '[Gmail]/Sent Mail': 'Sent',
                            '[Gmail]/Spam': 'Spam',
                            '[Gmail]/Trash': 'Trash',
                            }.get(folder, folder)
After a successful sync with offlineimap -o, any subsequent run of the same command results in negative UIDs and the same messages being uploaded back to the server.
Are there any known workarounds for this, or is using nametrans simply not possible atm?