got-your-back

Never mark spam on restore

Open aaronadamsCA opened this issue 3 years ago • 25 comments

The issue tracker is for reporting product deficiencies. "How do I" questions should be posted to the discussion forum at https://groups.google.com/group/got-your-back. When in doubt, start at the discussion forum and return here only when instructed to do so.

Please confirm the following:

  • I have upgraded to the latest GYB release from https://github.com/jay0lee/got-your-back/releases and I still have this issue.
  • I am typing the command as described in the GYB Wiki at https://github.com/jay0lee/got-your-back/wiki

Full steps to reproduce the issue:

  1. Back up from one account
  2. Restore to another account

Expected outcome (what are you trying to do?): All messages restored with similar structure.

Actual outcome (what errors or bad behavior do you see instead?): Thousands of legitimate messages from the first account classified as spam in the second account.

Unfortunately Gmail won't let me bulk mark them all "not spam", either, so this is a whole lot of repetitive clicking to rectify.

I see the Gmail API has a neverMarkSpam option on some endpoints, but I can't tell if it's available on the endpoint you're using because I can't read Python. 🙃

aaronadamsCA avatar Jan 21 '22 00:01 aaronadamsCA

Yes, GYB sets this parameter:

https://github.com/jay0lee/got-your-back/blob/main/gyb.py#L1971

so this shouldn't be happening. Can you provide sample messages or a sample backup that is showing this problem?
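
For readers who, like the reporter, don't read Python: below is a minimal sketch of what setting that parameter on the Gmail API `users.messages.import` call can look like. It is not a copy of gyb.py. The helper name `build_import_kwargs` and the sample message are made up for illustration, and the other parameter values shown are assumptions rather than what GYB necessarily passes.

```python
import base64


def build_import_kwargs(raw_message, label_ids):
    """Build keyword arguments for a users.messages.import request.

    Hypothetical helper: it only assembles the arguments and makes no
    network call.
    """
    return {
        "userId": "me",
        # Ask Gmail not to run its spam classifier on this message.
        "neverMarkSpam": True,
        # Assumed values for illustration; GYB may choose differently.
        "processForCalendar": False,
        "internalDateSource": "dateHeader",
        "body": {
            # The API expects the RFC 2822 message as base64url text.
            "raw": base64.urlsafe_b64encode(raw_message).decode("ascii"),
            "labelIds": label_ids,
        },
    }


kwargs = build_import_kwargs(b"From: a@example.com\r\n\r\nhello", ["INBOX"])
print(kwargs["neverMarkSpam"])
```

With google-api-python-client and an authorized `service` object, these arguments would be passed as `service.users().messages().import_(**kwargs).execute()`.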

jay0lee avatar Jan 21 '22 00:01 jay0lee

I'm up to several thousand messages in spam, but thankfully I found a workaround that lets you mark more than 50 messages as "not spam" in the Gmail UI:

  1. Search label:spam -label:inbox
  2. Click "Select all"
  3. Click "Select all conversations that match this search"
  4. Click "Move to inbox"
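
The same cleanup could in principle be scripted against the Gmail API instead of the UI. The sketch below only builds request bodies for `users.messages.batchModify`, which accepts up to 1000 message IDs per request; the function name and the fake IDs are illustrative, and collecting the real IDs (for example by listing messages that match `label:spam -label:inbox`) is left out.

```python
def spam_to_inbox_batches(message_ids, batch_size=1000):
    """Split message IDs into batchModify bodies that move them from Spam to Inbox."""
    bodies = []
    for start in range(0, len(message_ids), batch_size):
        bodies.append({
            "ids": message_ids[start:start + batch_size],
            "removeLabelIds": ["SPAM"],
            "addLabelIds": ["INBOX"],
        })
    return bodies


# Illustrative IDs only; real ones would come from a messages.list call.
fake_ids = ["msg%d" % n for n in range(2500)]
batches = spam_to_inbox_batches(fake_ids)
print(len(batches), [len(b["ids"]) for b in batches])
```

Each body would then be sent with `service.users().messages().batchModify(userId="me", body=body).execute()`. As with the UI workaround, this moves everything in Spam, so it is only safe when the folder holds no genuine spam.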

aaronadamsCA avatar Jan 21 '22 10:01 aaronadamsCA

Each message in spam shows the same reason for being there:

Why is this message in spam? It is similar to messages that were identified as spam in the past.

So it doesn't seem like it would be a phishing filter thing (my old and new addresses are similar, which had me wondering).

Here is a cleaned-up version of the commands I used:

cd
bash <(curl -s -S -L https://git.io/gyb-install)

mkdir [email protected]
cd [email protected]/
~/bin/gyb/gyb --email [email protected] --action quota
~/bin/gyb/gyb --email [email protected] --action backup

mkdir [email protected]
cd [email protected]/
~/bin/gyb/gyb --email [email protected] --action create-project
~/bin/gyb/gyb --email [email protected] --action restore --local-folder ../[email protected]/[email protected]/ --label-restored "firstlast.ca"

The messages going to spam are decidedly the "spammier" ones, almost exclusively newsletters and notification emails, so it does seem like the spam filter is somehow processing each inbound message despite being asked not to.

aaronadamsCA avatar Jan 21 '22 10:01 aaronadamsCA

> Can you provide sample messages or a sample backup that is showing this problem?

Let me know if any of the information above helps. If not, after my restore finishes running, I can try reproducing the problem with a small backup that I'd be comfortable sharing.

aaronadamsCA avatar Jan 21 '22 10:01 aaronadamsCA

Ha... unaddressed report from 2018 complete with repro:

https://issuetracker.google.com/issues/109956036

I added a comment (didn't mention gyb just in case they filter out issues that mention your GREAT project). I'm willing to bet this is unfixable on your end, since I can clearly see you're doing what you can.

aaronadamsCA avatar Jan 21 '22 11:01 aaronadamsCA

I'm seeing this issue as well, backing up a Workspace account and restoring to a free Gmail account.

I have a total of 53,001 messages in the backup, and on restore there were ~7,200 messages in the Spam folder.

My workaround was to move all of those messages in Spam back to Inbox (by selecting 100 messages at a time and clicking the Not Spam button in the Gmail UI).

If you're going to do this, please ensure that you have 0 spam messages in the target Gmail account, otherwise you could end up moving genuine spam into your Inbox.

bvinnerd avatar Jan 24 '22 06:01 bvinnerd

I have this issue too: 15,000 messages in spam, mainly very old ones.

Also, many seem to have gotten their date set to the restore time instead of the original date they were sent.

Most of the affected messages are from before 2000, but I also found one from 2003.

flipflophhj avatar Jan 26 '22 09:01 flipflophhj

I just released GYB 1.55 which adds a --cleanup option on restore. This tells GYB to confirm the message has a valid From:, Message-ID: and Date: header on it before restoring. This should prevent the message from landing in Spam.

Can a few people do some testing and confirm it works for them? See the 1.55 release details for more info:

https://git.io/gyb-releases
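
To make the idea concrete, here is a rough sketch of the kind of header check described above, using only the Python standard library. It is an illustration under my own assumptions about what "valid" means, not GYB's actual --cleanup code, and the function name is made up.

```python
import email
from email.utils import parseaddr, parsedate_to_datetime


def missing_or_invalid_headers(raw_bytes):
    """Return the names of the From/Message-ID/Date headers that look unusable."""
    msg = email.message_from_bytes(raw_bytes)
    problems = []

    # From: should contain something that parses as an address.
    from_value = msg.get("From")
    if from_value is None or "@" not in parseaddr(str(from_value))[1]:
        problems.append("From")

    # Message-ID: should be present and non-empty.
    message_id = msg.get("Message-ID")
    if message_id is None or not str(message_id).strip():
        problems.append("Message-ID")

    # Date: should be present and parse as an RFC 2822 date.
    date_value = msg.get("Date")
    if date_value is None:
        problems.append("Date")
    else:
        try:
            parsedate_to_datetime(str(date_value))
        except (TypeError, ValueError):
            problems.append("Date")

    return problems


good = (b"From: Alice <alice@example.com>\r\n"
        b"Message-ID: <1@example.com>\r\n"
        b"Date: Fri, 21 Jan 2022 00:01:00 +0000\r\n\r\nhi")
bad = b"Subject: no useful headers\r\n\r\nhi"
print(missing_or_invalid_headers(good), missing_or_invalid_headers(bad))
```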

jay0lee avatar Jan 26 '22 21:01 jay0lee

Hm... I thought that if I emptied the spam folder and then did a restore, it would restore all those messages again, but it doesn't seem so. What should I do? I'm doing an estimate to see if that helps.

flipflophhj avatar Jan 26 '22 22:01 flipflophhj

You need to tell GYB to try restoring all messages again with --noresume.

Jay

jay0lee avatar Jan 26 '22 22:01 jay0lee

Hm, it would be nice to be able to label only the messages that were cleaned up, though. I tried to use --label-restored, but it labels everything now.

flipflophhj avatar Jan 26 '22 22:01 flipflophhj

Traceback (most recent call last):
  File "gyb.py", line 2532, in <module>
  File "gyb.py", line 2007, in main
  File "gyb.py", line 1769, in message_hygiene
  File "gyb.py", line 1713, in cleanup_from
  File "email\utils.py", line 215, in parseaddr
  File "email\_parseaddr.py", line 513, in __init__
  File "email\_parseaddr.py", line 256, in getaddrlist
TypeError: object of type 'Header' has no len()
[29748] Failed to execute script 'gyb' due to unhandled exception!
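
One plausible way to end up here, and it is only a guess about the message that triggered it: when a header contains raw non-ASCII bytes, Python's default email parser hands back an `email.header.Header` object instead of a `str`, and older versions of `email.utils.parseaddr` fail on that exactly as in the traceback above. The sketch below reproduces that situation with a made-up message and shows a defensive wrapper; `safe_parseaddr` is a hypothetical helper, not something from gyb.py.

```python
import email
from email.header import Header
from email.utils import parseaddr


def safe_parseaddr(value):
    """Parse an address header that may be a str, a Header object, or None."""
    if value is None:
        return ("", "")
    # str() turns a Header object into plain text that parseaddr can handle.
    return parseaddr(str(value))


# A From: header with a raw, unencoded 8-bit byte (not valid per the RFCs).
raw = b"From: J\xf6rg <joerg@example.com>\r\n\r\nbody"
msg = email.message_from_bytes(raw)
from_header = msg["From"]

print(type(from_header).__name__)
name, address = safe_parseaddr(from_header)
print(address)
```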

flipflophhj avatar Jan 27 '22 09:01 flipflophhj

I still got about 300 in spam out of the 6,000 restored before the exception.

flipflophhj avatar Jan 27 '22 09:01 flipflophhj

I can no longer reproduce the issue with the sample from the issue tracker and --cleanup. Can you share examples of messages that went to Spam?

jay0lee avatar Jan 27 '22 14:01 jay0lee

I'd need to see the full headers as described at:

https://support.google.com/mail/answer/29436?hl=en

jay0lee avatar Jan 27 '22 14:01 jay0lee

Would it work to send the .eml file?

flipflophhj avatar Jan 27 '22 14:01 flipflophhj

Yes, that's fine. You can post it here or email it to me.

jay0lee avatar Jan 27 '22 14:01 jay0lee

Ok I sent an email.

flipflophhj avatar Jan 27 '22 15:01 flipflophhj

Oh, by the way, I saw that all the mails that ended up with the now() date after restore seem to have a correct date in msg-db.sqlite, so maybe that could be used for --cleanup.
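
As a sketch of that suggestion: if a known-good timestamp is available for a message, a missing or unparseable Date: header could be replaced with it before restoring. Where the timestamp comes from is left open here; reading it from msg-db.sqlite would require its schema, which isn't shown in this thread. `ensure_date_header` and the sample values are made up for illustration.

```python
import email
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def ensure_date_header(msg, fallback):
    """Keep a parseable Date: header; otherwise replace it with the fallback."""
    value = msg.get("Date")
    if value is not None:
        try:
            parsedate_to_datetime(str(value))
            return msg  # existing header is usable, leave it alone
        except (TypeError, ValueError):
            pass
    del msg["Date"]  # no-op if the header is absent
    msg["Date"] = format_datetime(fallback)
    return msg


# Illustrative timestamp standing in for whatever the backup recorded.
stored = datetime(1999, 5, 17, 12, 30, tzinfo=timezone.utc)
fixed = ensure_date_header(email.message_from_string("Subject: old mail\n\nbody"), stored)
print(fixed["Date"])
```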

flipflophhj avatar Jan 27 '22 15:01 flipflophhj

I am in the same boat (185,000 emails to transfer, though). I was watching my Spam as the transfer was happening and saw some go in and then automatically go out of Spam. I was nervous the "older than 30 days will be deleted" thing was happening faster than I was moving them out of Spam.

I am redoing my restore but put this filter in place: [image] https://user-images.githubusercontent.com/887675/151886623-b1de6315-e273-424e-bd08-ba16df4aefeb.png

I have not seen anything go to Spam. When the restore is done I'll turn off that filter.

I don't know enough about how quickly "older than 30 days" gets removed from Spam, and don't know if this is "the right thing to do" but it makes this data hoarder less nervous.

brechmos avatar Jan 31 '22 22:01 brechmos

Has anyone else tested with --cleanup to see if that helps?

jay0lee avatar Jan 31 '22 22:01 jay0lee

Did my EML files work fine for you?

flipflophhj avatar Feb 01 '22 05:02 flipflophhj

FWIW, I am also experiencing emails going into Spam (I have not yet tried --cleanup).

jhult avatar Mar 07 '22 14:03 jhult

> Has anyone else tested with --cleanup to see if that helps?

Yes I did, and I can say: it doesn't work.

My numbers on restoration:

  • without --cleanup 200+ messages in SPAM out of 1500

  • with --cleanup 123 messages in SPAM out of 604

I was restoring different accounts, so the absolute numbers differ, but you can easily calculate the percentages: they are nearly the same, and with --cleanup it is even worse.
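
Spelling out that calculation with the numbers above (taking "200+" as exactly 200, so the first rate is a lower bound):

```python
def spam_rate(in_spam, restored):
    """Percentage of restored messages that landed in Spam, to one decimal."""
    return round(100 * in_spam / restored, 1)


without_cleanup = spam_rate(200, 1500)
with_cleanup = spam_rate(123, 604)
print(without_cleanup, with_cleanup)
```

That works out to roughly 13% without --cleanup and roughly 20% with it, on two different accounts.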

Suncatcher avatar May 20 '22 16:05 Suncatcher

I have been doing an import, moving 69k emails from a Workspace account to a personal Gmail account. I used --cleanup when doing the restore and it was still happening.

I have been running into this same issue.

> The messages going to spam are decidedly the "spammier" ones, it's almost exclusively newsletters and notification emails; so it does seem like the spam filter is somehow processing each inbound message despite being asked not to.

This has been my experience as well. Lots of receipts, newsletters, etc.

I put in the same filter @brechmos did, and that has helped eliminate messages going to spam. The downside is that new emails are going to "All Mail", but I can live with that in exchange for not sending email to spam during the restore.

chrishoage avatar Jun 08 '22 16:06 chrishoage