warcio

fix utf-8 encoding

Open tomeksporczyk opened this issue 3 years ago • 2 comments

tomeksporczyk avatar May 31 '22 08:05 tomeksporczyk

Hmm, it seems like this should stay as ASCII: the value is a UTF-8 string that has been %-encoded, so the result should always be ASCII. Have you encountered an issue that this PR fixes?

ikreymer avatar Jun 01 '22 02:06 ikreymer

> Hmm, it seems like this should stay as ASCII: the value is a UTF-8 string that has been %-encoded, so the result should always be ASCII. Have you encountered an issue that this PR fixes?

This actually solves the problem with converting old ARC files to WARC (https://github.com/webrecorder/warcio/issues/140). I tested the resulting files with `warcio index` and `warcio extract` and got the correct files from the payload.

mw0000 avatar Jun 01 '22 10:06 mw0000
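
To illustrate the encoding point discussed in this thread, here is a minimal sketch using only the Python standard library. It does not use warcio's own code; the helper names `percent_encode_utf8` and `encodes_as_ascii` and the sample URL are hypothetical. It shows that a UTF-8 string that has been %-encoded contains only ASCII characters, while the same string without %-encoding cannot be encoded as ASCII. Whether old ARC files hit the second case is an assumption here, not something the thread confirms.

```python
from urllib.parse import quote, unquote


def percent_encode_utf8(value: str) -> str:
    """Percent-encode the UTF-8 bytes of `value` (hypothetical helper).

    Common URL delimiters are left untouched so the result is still a URL.
    """
    return quote(value, safe="/:?=&", encoding="utf-8")


def encodes_as_ascii(value: str) -> bool:
    """Return True if `value` can be encoded with the ASCII codec."""
    try:
        value.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample URL containing non-ASCII characters.
raw_url = "http://example.com/zażółć?q=ł"
encoded_url = percent_encode_utf8(raw_url)

# The %-encoded form is pure ASCII, so an ASCII encode is safe.
print(encodes_as_ascii(encoded_url))

# The raw form is not ASCII; a strict ASCII encode would raise here,
# while a UTF-8 encode succeeds.
print(encodes_as_ascii(raw_url))
raw_bytes = raw_url.encode("utf-8")

# Decoding the %-encoded form recovers the original string.
print(unquote(encoded_url, encoding="utf-8") == raw_url)
```

If a header value arrives already %-encoded, both codecs produce identical bytes, so switching from ASCII to UTF-8 only changes behavior for values that were never %-encoded.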