markovify icon indicating copy to clipboard operation
markovify copied to clipboard

missing utf-8 BOM lead to codec failures during tests on windows

Open aogier opened this issue 1 year ago • 0 comments

FYI, as #174 stimulated my curiosity, one of the two test files lacks BOM bytes:

$ hd  -n4 test/texts/sherlock.txt
00000000  ef bb bf 50                                       |...P|
00000004
$ hd  -n4 test/texts/senate-bills.txt
00000000  32 31 73 74                                       |21st|
00000004

and this seems not to please windows' machines (unices will probably have a better autodetection thing). I don't see this as a problem per se, as leaving correct input mangling is probably a user's task but it lowers x-files factor on previous commits ;)

thank you, regards

aogier avatar Dec 12 '22 20:12 aogier