windows-1257 not fixed
import ftfy
print(ftfy.fix_text('SÄ…raÅai'))
This is true -- windows-1257 isn't currently on the list of encodings that ftfy fixes.
Would you be able to point me to somewhere that I'd find text files that were really encoded in windows-1257, or more examples of windows-1257 mojibake in the wild, so I could make heuristics and test cases out of it?
The TXT files here (i.e. the files from before 2004) are windows-1257 encoded:
http://zagarins.net/kjl/arhivs.html
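For anyone who wants to build test cases from files like these, here is a rough sketch of one way to do it. It assumes the bytes really are windows-1257 (Python's `cp1257` codec) and simulates the usual mojibake by misreading the UTF-8 form of each line as windows-1257. The function name `make_mojibake_cases` and the sample bytes are made up for illustration; this is not how ftfy builds its own tests.

```python
def make_mojibake_cases(raw: bytes):
    """Turn windows-1257 bytes into (mojibake, expected) test pairs."""
    # Assumption: the input really is windows-1257 encoded.
    text = raw.decode("cp1257")
    cases = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.isascii():
            continue  # ASCII-only lines look the same in both encodings
        try:
            # Simulate the bug: UTF-8 bytes misread as windows-1257.
            garbled = line.encode("utf-8").decode("cp1257")
        except UnicodeDecodeError:
            continue  # some byte values are unassigned in windows-1257
        cases.append((garbled, line))
    return cases


# Hypothetical stand-in for the contents of one archive file.
sample = "Šveices baņķieri\nplain ascii line\n".encode("cp1257")
for garbled, expected in make_mojibake_cases(sample):
    print(repr(garbled), "->", repr(expected))
```

Lines whose UTF-8 bytes hit an unassigned windows-1257 byte are skipped here, so a real test suite would probably want to handle those separately.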
Wonderful, this looks like something I can make a heuristic out of for the next version.
I've got a version that can fix text like "Šveices baņķieri gaida konkrētus investīciju projektus", but what was the example you originally gave supposed to become? It's got a Unicode private use character in it.
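To show what fixing this kind of text involves, here is a minimal sketch of the round trip using only Python's built-in `cp1257` codec. The helper `undo_1257_mojibake` is hypothetical and only reverses the plain "UTF-8 read as windows-1257" case; it is not ftfy's heuristic, which also has to decide whether a string is mojibake in the first place.

```python
def undo_1257_mojibake(garbled: str) -> str:
    """Reverse UTF-8 text that was misread as windows-1257."""
    try:
        return garbled.encode("cp1257").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a clean windows-1257/UTF-8 mix-up, so leave it alone.
        return garbled


text = "Šveices baņķieri gaida konkrētus investīciju projektus"

# Produce the mojibake by decoding the UTF-8 bytes with the wrong codec.
mojibake = text.encode("utf-8").decode("cp1257")
print(mojibake)

# Encoding back to windows-1257 recovers the original UTF-8 bytes.
restored = undo_1257_mojibake(mojibake)
assert restored == text
```

A string containing characters that windows-1257 cannot encode, such as a private use character, falls through the `except` branch and comes back unchanged.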
Released in 6.3.