Refactor creation of puz files and add optional PUZv2 support
Most of the downloaders follow a similar pattern of creating a puzzle file with puz.Puzzle() in their parse_xword method and returning this puzzle file to the controller. This refactor changes this to make puzzle files attributes of the BaseDownloader class, which the downloaders then use as needed.
This has several benefits:
-
The
Puzzleinterface is now abstract, from the point of view of the downloaders. None of them importspuzany more. This enables a potential future PR to allow swapping outpuzfor a compatible implementation, e.g. one that would create files in a different format like .ipuz (libipuz). -
As part of this PR, we add a flag
-1--v1that creates PUZ v1.0 files, and add (and make default) support for the 2.0 version. Because PUZv2 supports UTF-8, we don't have to strip out e.g. emojis, which are becoming increasingly common in online crosswords. I've tested the results and Gnome Crosswords can show emoji puzzles out of the box after this PR.
There are a couple of things that need looking at, e.g. I don't really understand what's up with the rebus table encoding or whether that is impacted by this PR.
There are at least two other ways you could do this refactor. If you'd prefer one of these, I'm happy to redo it that way.
- Have the controller make a
Puzzleand send it toparse_xwordrather than having it as a long-lived attribute ofBaseDownloader. - Make our own abstract
Puzzleclass in a separate file, which could import and usepuz.Puzzleor some other library (possibly, in the future). Individual downloaders would import from this file rather thanpuzdirectly, but would continue to use the resultingPuzzleobject as they do now.
I've privately rebased this PR to apply cleanly on top of https://github.com/thisisparker/xword-dl/pull/205 and I think that's the order it makes sense to go in, if you're going to consider both.
Okay, I've picked my changes on top of main and force pushed, so this should be ready to go if you're interested in it @thisisparker. This is huge for me in terms of getting working emoij out of the box in solvers like Gnome Crosswords.
I acknowledge there might be concerns about how other solvers would respond to making PUZv2 the default. For all I know, there are breaking changes in PUZv2 that some solvers haven't adapted to. However, I don't think we need to worry about that in this case, because the puzpy library appears to use no features of PUZv2 other than saving the files as UTF-8 and changing the PUZ version number in the file. The only way this could break a solver is if one actively refuses to open v2 files, or can't handle UTF-8, and both seem pretty unlikely.
Still, it's worth testing with whatever solver you use. If necessary, this can be changed to make v1 the default.
Edit: there might still be issues with rebus table encoding? I haven't looked into this since the comment above.
I'm going to try to resolve the conflicts on this branch and hopefully we can merge this. Note to self: check if a new version of puzpy has been released, because there's now a method to set the puzzle version. https://github.com/alexdej/puzpy/commit/0c85baeb4be4be25668f1df9b60144efbba7641b
Amazing. Let me know if you want me to take a crack at resolving the conflicts (but I think you may be better situated). I just got a notification on another puzpy issue that a new release is forthcoming, which is exciting!