org-chef
Washington Post
The Washington Post often posts recipes in two different formats. It appears that org-chef chokes on one, but almost-but-not-quite manages the other.
First, the food section: URLs from this section seem to fail with the error message "org-capture: Capture template ‘c’: Template is not a valid Org entry or tree" (examples: Pecan Tassies, Misir Wot).
(Side note: Is it possible to get a better error message that makes it clearer that the page couldn't be parsed? I've recently been mucking about in my .emacs and thought I'd somehow messed up the template or something there....)
Second, the recipes section: URLs from this section (the same examples: Pecan Tassies, Misir Wot) seem to work... mostly. However, there are two small problems with the parsing:
- It cuts every separate sentence into a step, dropping the period along the way.
- It doubles the step number for each step.
(Just to confirm this isn't an issue across org-chef generally, I tried a recipe from AllRecipes and it was as expected, with multiple-sentence steps and only one numeral per step.)
For the first two examples (the "food" section URLs), the problem is that the pages contain incomplete recipe JSON-LD. The error message could be better, but the parsing itself can't really be fixed without a custom parser for the Washington Post recipe site.
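As a rough illustration of what a clearer error could look like, here is a minimal sketch in Python (not org-chef's actual Emacs Lisp code) that checks a JSON-LD blob before building a recipe from it. The function name `check_recipe_jsonld`, the exception `IncompleteRecipeError`, and the choice of required keys are all assumptions made for this example; only the property names `recipeIngredient` and `recipeInstructions` come from the schema.org Recipe vocabulary. Which fields the Washington Post pages actually lack is not stated here, so the list may need adjusting.

```python
import json

# Hypothetical minimum set of schema.org Recipe properties needed to
# build a usable entry; adjust to whatever the real parser requires.
REQUIRED_KEYS = ("name", "recipeIngredient", "recipeInstructions")


class IncompleteRecipeError(ValueError):
    """Raised when a page's JSON-LD lacks the fields needed for a recipe."""


def check_recipe_jsonld(raw):
    """Parse a JSON-LD string and return the Recipe object, or raise a
    descriptive error if it is absent or incomplete."""
    data = json.loads(raw)
    # A JSON-LD script tag may hold a single object or a list of objects.
    candidates = data if isinstance(data, list) else [data]
    # Simplifying assumption: "@type" is the plain string "Recipe".
    recipes = [c for c in candidates
               if isinstance(c, dict) and c.get("@type") == "Recipe"]
    if not recipes:
        raise IncompleteRecipeError("no Recipe object found in the page's JSON-LD")
    recipe = recipes[0]
    missing = [key for key in REQUIRED_KEYS if not recipe.get(key)]
    if missing:
        raise IncompleteRecipeError(
            "recipe JSON-LD is missing: " + ", ".join(missing))
    return recipe
```

A message such as "recipe JSON-LD is missing: recipeIngredient, recipeInstructions" points at the page rather than at the capture template, which is the confusion described in the side note above.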
For the second problem, the cause is the source material: the Washington Post site formats its JSON-LD that way, with each sentence as its own numbered step.
The leading number could be removed pretty easily, I suppose, by stripping it from the start of each step, but there's not much to be done about the recipe being broken into too many steps.
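The number-stripping idea could be sketched roughly as follows, again in Python for illustration rather than as org-chef's implementation. The helper name `strip_step_number` and the exact pattern are assumptions: it treats a run of digits at the start of a step, optionally followed by "." or ")", as a step number. The real format of the Washington Post steps is not shown in this thread, so the pattern may need tuning.

```python
import re

# Assumed shape of a leading step number: optional whitespace, digits,
# an optional "." or ")", then at least one whitespace character.
LEADING_NUMBER = re.compile(r"^\s*\d+[.)]?\s+")


def strip_step_number(step):
    """Return the step text with one leading step number removed, if any."""
    return LEADING_NUMBER.sub("", step, count=1)


steps = ["1. Preheat the oven", "2) Mix the filling", "3 Bake until golden"]
cleaned = [strip_step_number(s) for s in steps]
```

One caveat with this approach: a step that legitimately begins with a quantity (for example "10 minutes before serving, ...") would lose its number too, so requiring the trailing "." or ")" may be the safer choice if the source always includes one.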