imapbackup
changes from forks
Are there changes that have to be applied to fix issues, like the commits with parsing multiple message IDs?
https://github.com/rcarmo/imapbackup/network (you have to scroll to the left to see all changes from all forks)
@huanwin @jrk @mystix @aehrisch @esanna-aob @ilium007 can you verify which changes should land in the current version to fix known bugs / problems?
Currently it seems the version here has bigger changes in the code, so I cannot directly compare the forks with the original repo (parse_bulk_fetch(), for example, is not there anymore).
This is a very good question - looking at some of the forks, some changes would make a lot of sense here in master. I wonder why there aren't corresponding PRs.
@DanielRuf my changes essentially tried to get the code to a place that it's now already in with the Python 3 script + your merged changes. Thanks for your work.
My changes addressed several bugs I ran into in attempting to archive my full mail history. The commit titles are descriptive of the issues addressed, and most were small patches, but I haven't looked at the code in more than 5 years and have no idea if it's still relevant.
If this repo is active and hasn't diverged too badly, it may be worth at least scanning my bug fix commits to see what might still be relevant (they're all pretty simple):
- jrk@60b4f2c44e988548d6661ca2fcf0f5cd35fc0f6f
- jrk@ad59ae7aba6874dab4f6c113fd6f2dce409f42bb
- jrk@c3a3e5db6991338809d031dd2b7efdee1d28077d
This performance optimization is probably less trivial to merge, and only relevant as inspiration if the basic idea of batching requests for efficiency is still relevant to the current codebase: jrk@3a55e188bb7950b83a2cf9a15ee673dc8df1f74f.
> If this repo is active and hasn't diverged too badly, it may be worth at least scanning my bug fix commits to see what might still be relevant (they're all pretty simple)
I guess so. Today one of my PRs was merged, and I think it makes sense to check which patches are still relevant and, if so, what has to be done (cherry-pick the patches or rewrite them for new PRs).
> This performance optimization is probably less trivial to merge, and only relevant as inspiration if the basic idea of batching requests for efficiency is still relevant to the current codebase: jrk/imapbackup@3a55e18.
Personally, that would be helpful for me, since currently every request costs a lot of bandwidth and performs badly on my internet connection. Batching would probably drastically reduce the overhead of the many requests.
I'm happy to merge whatever makes sense and keeps the script working. These days I mostly do backups to my own IMAP server (and snapshot its storage to a cloud provider), but there is still room for people wanting to download specific mailboxes and have discrete archives, so please, feel free to submit more PRs.
I can also set up GitHub Actions to do basic testing and linting (like on @piku) if people find that useful. I just don't have much time to do reviews or full maintenance, so if PRs are concise and documented/tested to some degree it would make things easier.
(adding someone else as maintainer is also feasible, of course, but this is one script, so having "staff" on call might be overkill 😀)
@jrk I submitted #40 which should address https://github.com/jrk/imapbackup/commit/c3a3e5db6991338809d031dd2b7efdee1d28077d in case you have any input on that.
@samsonjs Seems promising
This issue is stale because it has been open for 90 days with no activity.
The PR from @samsonjs seems merged now. What next?
I swept through most of the commits in the network, and it doesn't look like there are many outstanding changes left that should be upstreamed here, so this issue might have run its course. @DanielRuf is there anything specific you're still missing here? At least some of the changes you wanted have landed.
- https://github.com/jrk/imapbackup/commit/3a55e188bb7950b83a2cf9a15ee673dc8df1f74f appears to be essentially the same as #42, which has been merged. It improves the performance of updates by batching the fetching of all headers into a single request. There's still an additional request per new message, so fresh copies aren't sped up as much: 4h33m to back up 41,091 emails before the patch, and 3h23m to back up 41,106 after it. Incremental updates dropped from over 8 minutes to 26 seconds for about 4,400 emails.
- https://github.com/jrk/imapbackup/commit/60b4f2c44e988548d6661ca2fcf0f5cd35fc0f6f: it's not obvious to me what this change does, so I'm not sure whether it's appropriate to upstream here.
- https://github.com/jrk/imapbackup/commit/ad59ae7aba6874dab4f6c113fd6f2dce409f42bb: it looks like this code might already handle the same case, so this change might not be needed either: https://github.com/rcarmo/imapbackup/blob/master/imapbackup38.py#L231-L256
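For anyone skimming the thread later, the batching idea from the first item can be sketched roughly as below. This is not the code from #42 or from jrk's commit, just a minimal illustration; the function name `fetch_all_message_ids` and the choice to fetch only the Message-ID header are assumptions made for the example. It relies on `imaplib`'s `fetch()` returning literal-bearing responses as `(envelope, literal)` tuples.

```python
import email


def fetch_all_message_ids(server, num_messages):
    """Fetch the Message-ID header of messages 1..num_messages in ONE request.

    `server` is assumed to behave like an imaplib.IMAP4 connection with a
    mailbox already selected (hypothetical helper, not part of imapbackup).
    """
    if num_messages == 0:
        return {}

    # A single FETCH over the whole sequence range instead of one per message.
    # BODY.PEEK avoids setting the \Seen flag on the messages.
    typ, data = server.fetch(
        "1:%d" % num_messages,
        "(BODY.PEEK[HEADER.FIELDS (MESSAGE-ID)])",
    )
    if typ != "OK":
        raise RuntimeError("FETCH failed: %r" % (data,))

    ids = {}
    for item in data:
        # Tuples carry (envelope, literal); bare bytes such as b")" only
        # close a response and hold no header data.
        if not isinstance(item, tuple):
            continue
        envelope, literal = item
        seq = int(envelope.split()[0])  # sequence number leads the envelope
        value = email.message_from_bytes(literal).get("Message-ID")
        ids[seq] = value.strip() if value else None
    return ids
```

With a real connection, `server` would be something like an `imaplib.IMAP4_SSL` instance after `select()`, and `num_messages` the message count that `select()` reports. Message bodies for new mail would still need their own requests, which matches the observation that fresh copies gain less than incremental updates.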
Sorry I'm slow to chime in. I obviously haven't touched this stuff in a long time, but I think you're probably right about the first and third changes mostly overlapping with those other later commits.
jrk@60b4f2c was fixing a faulty assumption in the code that messages would only ever have one Message-ID header. I found some servers occasionally returned multiple message-ids for a single message, which caused the code to blow up. This worked around the problem by detecting this case and taking the first one. I'm not sure if this is still a potential issue in the mainline code.
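To make the workaround concrete, here is a minimal sketch of "detect the case and take the first one". It is not the code from the commit; the helper name `first_message_id` is hypothetical, and it assumes the duplication shows up as repeated Message-ID header lines in the fetched header block.

```python
import email


def first_message_id(header_bytes):
    """Return the first non-empty Message-ID in a raw header block, or None.

    Hypothetical helper for illustration; assumes duplicates appear as
    repeated "Message-ID:" header lines.
    """
    msg = email.message_from_bytes(header_bytes)
    # get_all() returns every occurrence of the header, in order,
    # or None when the header is absent.
    for value in msg.get_all("Message-ID") or []:
        value = value.strip()
        if value:
            return value
    return None
```

Returning `None` for a missing header leaves it to the caller to decide how to handle messages without any Message-ID.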
This issue was closed because it has been inactive for 30 days since being marked as stale.