brain-brew icon indicating copy to clipboard operation
brain-brew copied to clipboard

Windows EOL in HTML not supported

Open AFulgens opened this issue 2 years ago β€’ 7 comments

I am currently trying out a similar integration to Ultimate Geography with one of my decks. However, I am facing the issue that I am working on Windows and the HTML exports from CrowdAnki have CRLF EOL. Thus, I had to patch my brainbrew locally, namely had to change the regex https://github.com/ohare93/brain-brew/blob/1426351fd6c9e1ead68c44881db61e926f5a600a/brain_brew/representation/yaml/note_model_template.py#L26 to accept [\r\n]s instead of [\n]. This is actually making it be in sync with utils.filename_from_full_path and utils.folder_name_from_full_path.

I haven't found contribution guidelines to this repo, I could prepare a PR from a fork, if it would be wanted.

AFulgens avatar Oct 15 '23 06:10 AFulgens

Please do πŸ‘ I haven't done much testing on Windows myself, so thanks for fixing this πŸ˜„

ohare93 avatar Oct 15 '23 08:10 ohare93

Possibly related to #43? πŸ˜…

ohare93 avatar Oct 15 '23 08:10 ohare93

I haven't looked at #43 in a while, so I'm not 100% sure, but I think the PR itself is independent of the issue here (it's about replacing the old codecs.open with just open).

The discussion in "Aside on newlines" probably is vaguely relevant (but again not 100% sure) β€” I believe that in order to allow Windows and Linux users to inter-operate peacefully (without a back-and-forth war between \n and \r\n ) we will need to add newline='' to the open call.

Maybe, though, let's just cross this once we encounter this in the wild? (i.e. if I had a vote, I'd vote for just applying anything that @AFulgens needs, for now. :))

aplaice avatar Oct 15 '23 14:10 aplaice

You always have a vote, my friend 😁

ohare93 avatar Oct 16 '23 10:10 ohare93

I'll prepare a PR then.

On a side-note, I have noticed that there is also a problem with encoding, namely if I use Umlauts in fields name (ΓΌ comes to mind), then I will get a mixture of UTF-8 and cp1252 files. Is that also something Windows/Windows + Python related, i.e., on Linux non-ASCII should work fine? It's also something I could look into, but that sounds like a heavy change, because alls the reads/writes have to be fixed to UTF-8 instead of defaulting. Would that also be something interesting or should I just live with the fact that my fields should not contain high-ASCII/non-ASCII characters?

AFulgens avatar Oct 16 '23 10:10 AFulgens

Would that also be something interesting

IMO yes, definitely!

Is there cp1252 outside the YAML files? AFAICT all the other (i.e. except the two in yaml_object.py) relevant calls to open explicitly specify a UTF-8 encoding, so the CSV files, HTML files etc. should all just have UTF-8.

(The calls in setup.py and scripts/yamale_build.py also don't have an encoding specified, but that shouldn't affect the main operation of BrainBrew.)

aplaice avatar Oct 16 '23 21:10 aplaice

So far I have seen cp1252 only in YAML files generated with anki_to_sources recipes and problems with non-ASCII in sources_to_anki as well (i.e., mojibake in the resulting deck.json). I'll investigate and if it's not too many changes I will roll it together into a single PR with the regex fix.

AFulgens avatar Oct 17 '23 09:10 AFulgens