
[draft] Use card text from the dominion strategy wiki

Open mbbush opened this issue 6 months ago • 1 comments

This is not ready to merge yet, but I wanted to show what I'm working on and get some feedback.

There's a lot going on here.

  • I originally hacked together a script that would read the card text from the dominion strategy wiki list of cards, and convert it into the format expected by this project (e.g. changing {{Cost|3}} to 3 Coins).
  • After evaluating its results when rendering all the cards, I opted to instead keep the wiki formatting, and write new code for handling the inline images. The main reasons for this were:
    • The wiki formatting doesn't cause extra replacements to happen unintentionally. For example, the card text of Alchemist says "At the start of Clean-up this turn, if you have a Potion in play, you may put this onto your deck." When rendering that text, this project currently substitutes the Potion symbol, even though the text refers to the card named Potion, not the Potion resource.
    • The wiki formatting has more options for indicating size.
    • The wiki formatting allows better control over line breaks, although actually making that work with reportlab is a challenge.
  • I wanted to keep the existing text for now, so I could render both side by side (with --front rules --back wiki-text) to compare them.
  • I added an option to render the actual card images (which I scraped from the wiki), so I could compare the way this code rendered the text to the way the actual cards do. It was very close; this is what led me to adjust the horizontal margin and font size to get the wrapping on certain cards like Mine to match what the card does when the orientation is vertical. I left that code out of this PR for now (except for adding it to text_choices), but I could include it if @sumpfork or @nickv2002 would find it helpful.
  • While I was re-implementing inline images, I learned more about various options in reportlab, and added more parameters relating to vertical scaling to get images to line up with the surrounding text. I also added the leading and autoLeading parameters to make some vertical room around the images as needed.
  • I found the mixture of percentages and absolute points in the current size code confusing. I settled on consistently using a factor to multiply by the font size.
  • I used a more object-oriented paradigm with type hints in inline_images.py because I found it easier to understand/use.
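
As a rough illustration of the approach described above (keep the wiki templates, and size inline images by a factor of the font size), here is a minimal sketch. It is not the code in inline_images.py: the class name, the default factor of 1.2, and the image file naming are all hypothetical. It assumes reportlab's paragraph markup accepts an img tag with width, height and valign attributes.

```python
import re
from dataclasses import dataclass

# Matches simple wiki templates of the form {{Name|arg}}, e.g. {{Cost|3}}.
TEMPLATE_RE = re.compile(r"\{\{(\w+)\|([^{}|]*)\}\}")


@dataclass(frozen=True)
class InlineImage:
    """One inline image parsed from a wiki template (hypothetical model)."""

    template: str  # e.g. "Cost"
    value: str  # e.g. "3"
    scale: float = 1.2  # hypothetical factor, multiplied by the font size

    def to_markup(self, font_size: float) -> str:
        # Size is always factor * font size, never a mix of percent and points.
        size = round(self.scale * font_size, 2)
        # The file naming scheme below is an assumption for illustration only.
        src = f"images/{self.template.lower()}_{self.value}.png"
        return f'<img src="{src}" width="{size}" height="{size}" valign="middle"/>'


def render_wiki_text(text: str, font_size: float) -> str:
    """Replace each {{Name|arg}} template with paragraph image markup."""

    def _substitute(match: re.Match) -> str:
        return InlineImage(match.group(1), match.group(2)).to_markup(font_size)

    return TEMPLATE_RE.sub(_substitute, text)
```

Because only explicit templates are matched, plain words pass through untouched: render_wiki_text("if you have a Potion in play", 10) returns its input unchanged, which is the Alchemist case described above.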

A few things that still aren't done:

  • Handling cards that are grouped, like the Knights, or the Augurs, or Plunder/Encampment. The wiki has a page for each individual card, and I think it makes sense to have an entry in the card db for each individual card, along with one for the group, as we currently have in most (but not all) situations (we're missing Knights, Castles and Prophecies, among others). The question that remains is what text to put on the group divider. I can think of three options:
    • Generate it dynamically from the text of each card in the group. This could work well as a follow-up to the refactor work I did in https://github.com/sumpfork/dominiontabs/pull/567, but it seems like it would probably result in a lot of duplication, especially for the rules text.
    • Retain hand-written and manually-maintained text for the group cards.
    • Write some sort of script that merges the individual card text into the group card's text.
  • Scraping card text in additional languages. This will require loading each individual card's page and parsing the wiki text. Downloading all the pages is straightforward; I've already got a script that does that with a reasonable rate limit. I'm not sure how challenging it will be to scrape the wiki text; it will depend on how consistently the pages are formatted.
  • I don't think this is the form that I want to merge this in. I'd rather update the description in the card db than rely on a random external file; the external file was just easier for managing git conflicts for now.
  • I'd love to also pull down the FAQ section from the wiki for the "Rules", but that's going to be another whole big project. Definitely not in the same PR.
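
To make the third group-divider option above more concrete, here is one possible shape for a merge script. This is only a sketch under assumptions: the function name and the (name, text) input shape are hypothetical and do not reflect the card db schema. It lists each distinct text once, so cards that share wording are not duplicated.

```python
from typing import Iterable


def merge_group_text(cards: Iterable[tuple[str, str]]) -> str:
    """Build group divider text from (card name, card text) pairs.

    Cards with identical text are listed together on one line, in the
    order their text first appears.
    """
    names_by_text: dict[str, list[str]] = {}
    for name, text in cards:
        names_by_text.setdefault(text.strip(), []).append(name)
    # One line per distinct text, prefixed by the cards that carry it.
    return "\n".join(
        f"{', '.join(names)}: {text}" for text, names in names_by_text.items()
    )
```

The output could either be generated at render time (the first option) or written back to the group entry by a maintenance script (the third option); the merging logic would be the same in both cases.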

Note regarding the <nobr> tag: the reportlab docs say this will prevent line breaks, but it doesn't actually work. I patched the library code to fix it and plan on submitting a patch to them. In the meantime, the tag is simply parsed and harmlessly ignored using the published versions of the reportlab library. Take a look at how Sailor is rendered with vertical orientation if you want an example.
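
For illustration, marking up a phrase that should stay on one line could look like the sketch below. The helper name is hypothetical and is not part of this PR or of reportlab; it only wraps text in the tag, and whether the tag has any effect depends on the reportlab version, as noted above.

```python
import re


def protect_phrases(text: str, phrases: list[str]) -> str:
    """Wrap each listed phrase in <nobr> tags.

    A renderer that honors the tag keeps the phrase on one line; one that
    ignores it renders the text as if the tags were absent.
    """
    for phrase in phrases:
        # re.escape keeps characters such as "+" from being read as regex syntax.
        text = re.sub(
            re.escape(phrase),
            lambda match: f"<nobr>{match.group(0)}</nobr>",
            text,
        )
    return text
```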

mbbush, May 21 '25 21:05

I haven't looked in detail, but I'm very happy to replace the current language used to encode things like inline images - it grew organically out of a single initial substitution and has never been revisited. Been bugging me for a while.

sumpfork, May 21 '25 21:05