YahooGroups-Archiver

Rewrite the archiver

Open daniel-j-born opened this issue 5 years ago • 4 comments

  • Upgrade to Python 3.
  • Reuse a single requests.Session() object so that connections are kept alive, which speeds up archiving and makes it less likely that Yahoo flags the traffic as spam.
  • Pause between requests and back off exponentially on errors to avoid being flagged as spam by Yahoo.
  • Change the user-agent to avoid being flagged as spam by Yahoo.
  • Write messages to groupName/year/month/msgid.json instead of groupName/msgid.json.
  • Write to a tmp file and then rename it into place so that an interrupted run cannot leave a partially written message file.
  • Create output directory in current directory rather than source code directory.
  • Change logging to Python logging.
  • move_to_year_month_dirs.py: New script to rename groupName/msgid.json to groupName/year/month/msgid.json.
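
A few of the items above (the year/month layout, the tmp-file-then-rename write, and the pause with exponential backoff) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this change: the helper names `message_path`, `write_atomically` and `fetch_with_backoff`, the zero-padded month, the UTC interpretation of the timestamp, and the default pause and retry values are all hypothetical.

```python
import json
import os
import tempfile
import time
from datetime import datetime, timezone


def message_path(group_name, msg_id, post_timestamp):
    """Build groupName/year/month/msgid.json from a Unix timestamp.

    Assumptions: the timestamp is interpreted as UTC and the month is
    zero-padded; the real archiver may choose differently.
    """
    posted = datetime.fromtimestamp(post_timestamp, tz=timezone.utc)
    return os.path.join(
        group_name, str(posted.year), f"{posted.month:02d}", f"{msg_id}.json"
    )


def write_atomically(path, payload):
    """Write JSON to a temp file in the target directory, then rename it.

    The temp file lives next to the destination so that os.replace() stays
    on one filesystem and swaps the file in a single step.
    """
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            json.dump(payload, tmp_file)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def fetch_with_backoff(session, url, pause=1.0, max_retries=5, sleep=time.sleep):
    """GET url through a shared session, waiting before every request.

    The wait starts at `pause` seconds and doubles after each failure.
    `session` is expected to behave like requests.Session(); a real
    implementation would likely catch requests.RequestException rather
    than the broad Exception used here.
    """
    delay = pause
    for attempt in range(max_retries):
        sleep(delay)
        try:
            response = session.get(url, timeout=30)
            response.raise_for_status()
            return response
        except Exception:
            if attempt == max_retries - 1:
                raise
            delay *= 2
```

With the requests library installed, one `requests.Session()` would be created up front, its `User-Agent` header set via `session.headers`, and the same object passed to every `fetch_with_backoff` call so that connections are reused across messages.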

daniel-j-born, Apr 29 '19 17:04