cryptics
Data model: split clues?
Hello,
There's a type of clue that the data model doesn't appear to handle. Where a clue is split over two (or more) entries in the grid, there are multiple rows. So there isn't a one-to-one relationship between clue_number and clue (the clue as a solver reads it, as opposed to the clue column).
I'm not quite sure how best to handle this, or whether it should be handled here or downstream (though I don't have a downstream yet). It might be good for the parser to merge the relevant entries: take the clue_number and a cleaned-up clue from the first row, and merge its answer with the answer from the second.
In the case I found, the clue for the second row is essentially a pointer to the first clue; the answer in the first clue is only part of the actual answer.
Here's an example from the clues table: the 2 rows where rowid in ["607962", "607964"], sorted by puzzle_date descending.
JSON representation of a split clue (with the answers!)
{
  "607962": {
    "rowid": 607962,
    "clue": "& 2 Toxic chemical agent that’s a threat to the world (7,6)",
    "answer": "CLIMATE",
    "definition": "a threat to the world",
    "clue_number": "27a",
    "puzzle_date": "2021-12-06",
    "puzzle_name": "Cyclops 716 Caricature Premiership",
    "source_url": "https://www.fifteensquared.net/2021/12/06/cyclops-716-caricature-premiership/",
    "source": "fifteensquared"
  },
  "607964": {
    "rowid": 607964,
    "clue": "See 27ac. (6)",
    "answer": "CHANGE",
    "definition": "see 27A",
    "clue_number": "2d",
    "puzzle_date": "2021-12-06",
    "puzzle_name": "Cyclops 716 Caricature Premiership",
    "source_url": "https://www.fifteensquared.net/2021/12/06/cyclops-716-caricature-premiership/",
    "source": "fifteensquared"
  }
}
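To make the merge idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration: `merge_split_clues` and both regexes are hypothetical (nothing here comes from the existing parser), the patterns are inferred from this single example, and joining the answer parts with a space in row order is my own assumption.

```python
import re

# Hypothetical patterns, inferred only from the one example above.
# Matches pointer clues such as "See 27ac. (6)" or "See 2d".
POINTER_RE = re.compile(r"^\s*see\s+(\d+)\s*(a|d)", re.IGNORECASE)
# Matches a leading cross-reference such as "& 2 " at the start of a clue.
LEADING_REF_RE = re.compile(r"^\s*(?:[&,]\s*\d+\s*)+")


def merge_split_clues(rows):
    """Fold 'See 27ac.'-style pointer rows into the row they point to."""
    # Index copies of the rows by (puzzle, clue number) so inputs stay untouched.
    by_key = {(r["source_url"], r["clue_number"].lower()): dict(r) for r in rows}
    absorbed = set()
    for row in rows:
        match = POINTER_RE.match(row["clue"])
        if not match:
            continue
        target_key = (row["source_url"], match.group(1) + match.group(2).lower())
        target = by_key.get(target_key)
        if target is None:
            continue  # pointer to a clue we don't have; leave the row alone
        # Assumption: answer parts are joined with a space, in row order.
        target["answer"] = f'{target["answer"]} {row["answer"]}'
        target["clue"] = LEADING_REF_RE.sub("", target["clue"])
        absorbed.add(row["rowid"])
    return [
        by_key[(r["source_url"], r["clue_number"].lower())]
        for r in rows
        if r["rowid"] not in absorbed
    ]


# The two rows from the example, trimmed to the columns the sketch uses.
url = "https://www.fifteensquared.net/2021/12/06/cyclops-716-caricature-premiership/"
rows = [
    {"rowid": 607962, "clue_number": "27a", "answer": "CLIMATE", "source_url": url,
     "clue": "& 2 Toxic chemical agent that’s a threat to the world (7,6)"},
    {"rowid": 607964, "clue_number": "2d", "answer": "CHANGE", "source_url": url,
     "clue": "See 27ac. (6)"},
]
merged = merge_split_clues(rows)
```

With these two rows, `merged` holds a single 27a row whose answer is "CLIMATE CHANGE" and whose clue no longer starts with "& 2". Answers split across three or more entries, or pointers without an explicit across/down marker, would need more thought.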
Besides doing nothing, there are a few options, including the parser change above. I'd like the clue to correspond to the answer rather than reflect what the crossword lists as a clue, but I'm sure there are trade-offs I'm not aware of.
In any case, this is a great resource you've curated!