xword-dl icon indicating copy to clipboard operation
xword-dl copied to clipboard

Refactor creation of puz files and add optional PUZv2 support

Open afontenot opened this issue 1 year ago • 5 comments

Most of the downloaders follow a similar pattern of creating a puzzle file with puz.Puzzle() in their parse_xword method and returning this puzzle file to the controller. This refactor changes this to make puzzle files attributes of the BaseDownloader class, which the downloaders then use as needed.

This has several benefits:

  • The Puzzle interface is now abstract, from the point of view of the downloaders. None of them imports puz any more. This enables a potential future PR to allow swapping out puz for a compatible implementation, e.g. one that would create files in a different format like .ipuz (libipuz).

  • As part of this PR, we add a flag -1 --v1 that creates PUZ v1.0 files, and add (and make default) support for the 2.0 version. Because PUZv2 supports UTF-8, we don't have to strip out e.g. emojis, which are becoming increasingly common in online crosswords. I've tested the results and Gnome Crosswords can show emoji puzzles out of the box after this PR.

afontenot avatar Aug 06 '24 18:08 afontenot

There are a couple of things that need looking at, e.g. I don't really understand what's up with the rebus table encoding or whether that is impacted by this PR.

There are at least two other ways you could do this refactor. If you'd prefer one of these, I'm happy to redo it that way.

  • Have the controller make a Puzzle and send it to parse_xword rather than having it as a long-lived attribute of BaseDownloader.
  • Make our own abstract Puzzle class in a separate file, which could import and use puz.Puzzle or some other library (possibly, in the future). Individual downloaders would import from this file rather than puz directly, but would continue to use the resulting Puzzle object as they do now.

afontenot avatar Aug 06 '24 18:08 afontenot

I've privately rebased this PR to apply cleanly on top of https://github.com/thisisparker/xword-dl/pull/205 and I think that's the order it makes sense to go in, if you're going to consider both.

afontenot avatar Apr 03 '25 23:04 afontenot

Okay, I've picked my changes on top of main and force pushed, so this should be ready to go if you're interested in it @thisisparker. This is huge for me in terms of getting working emoij out of the box in solvers like Gnome Crosswords.

I acknowledge there might be concerns about how other solvers would respond to making PUZv2 the default. For all I know, there are breaking changes in PUZv2 that some solvers haven't adapted to. However, I don't think we need to worry about that in this case, because the puzpy library appears to use no features of PUZv2 other than saving the files as UTF-8 and changing the PUZ version number in the file. The only way this could break a solver is if one actively refuses to open v2 files, or can't handle UTF-8, and both seem pretty unlikely.

Still, it's worth testing with whatever solver you use. If necessary, this can be changed to make v1 the default.

Edit: there might still be issues with rebus table encoding? I haven't looked into this since the comment above.

afontenot avatar May 31 '25 04:05 afontenot

I'm going to try to resolve the conflicts on this branch and hopefully we can merge this. Note to self: check if a new version of puzpy has been released, because there's now a method to set the puzzle version. https://github.com/alexdej/puzpy/commit/0c85baeb4be4be25668f1df9b60144efbba7641b

afontenot avatar Sep 18 '25 19:09 afontenot

Amazing. Let me know if you want me to take a crack at resolving the conflicts (but I think you may be better situated). I just got a notification on another puzpy issue that a new release is forthcoming, which is exciting!

thisisparker avatar Sep 22 '25 17:09 thisisparker