browsertrix-crawler

save state: export pending list as array of json strings + fix importing save state to support pending

Open ikreymer opened this issue 1 year ago • 1 comments

The save state export accidentally wrote the pending data as objects rather than as a list of JSON strings, which is how it is stored in Redis, while the import expected a list of JSON strings. The getPendingList() function parsed the JSON, but the result was then re-encoded for writeStats(). This was likely a mistake. This PR makes the following changes:

  • support loading pending state as both array of objects and array of json strings for backwards compatibility
  • save state as array of json strings
  • remove json decoding and encoding in getPendingList() and writeStats()
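The backwards-compatible loading described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the `PendingEntry` type and the `normalizePending` helper are hypothetical names, and the fields shown are assumptions about what a pending entry might contain.

```typescript
// Hypothetical shape of a pending entry; the real crawler state may differ.
type PendingEntry = { url: string; [key: string]: unknown };

// Accept a pending list in either form (parsed objects from older save
// states, or JSON strings from newer ones) and always return JSON strings,
// matching how the entries are stored in Redis.
function normalizePending(pending: Array<string | PendingEntry>): string[] {
  return pending.map((entry) =>
    typeof entry === "string" ? entry : JSON.stringify(entry),
  );
}

// Older save state: array of objects.
const oldStyle: PendingEntry[] = [{ url: "https://example.com/", depth: 0 }];
// Newer save state: array of JSON strings.
const newStyle: string[] = ['{"url":"https://example.com/","depth":0}'];

const fromOld = normalizePending(oldStyle);
const fromNew = normalizePending(newStyle);
```

Normalizing once at load time means the rest of the import path only has to handle JSON strings, whichever format the save state file used.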

Fixes #568

ikreymer avatar May 17 '24 00:05 ikreymer

~~I think this is a problem for queued and failed as well?~~

Nevermind, I see those sections actually expect serialized JSON objects?

edsu avatar May 19 '24 17:05 edsu

> ~I think this is a problem for queued and failed as well?~
>
> Nevermind, I see those sections actually expect serialized JSON objects?

I believe both of these should be JSON strings, similar to pending. The failed entries get JSON.parse()d before being loaded back into Redis. In my testing this works as expected.

tw4l avatar May 21 '24 15:05 tw4l