build-loi tests
I had a few minutes today, so I started looking at this. A few questions:
1. Running build-loi on the corpus yields a different loi in over half the cases. I read the PR comments, but there was never any resolution to these differences, most of which are not block-level issues. What are we doing about them? Again, I'm not talking about differences where there is block-level text; I'm talking about differences in the text itself.
2. For @apasel422: The second test has chapter files in the golden files that aren't in the input files. That's obviously not correct, but I don't know how it got in that state or what the correct input files are, etc. What should that test look like?
3. In short, what is supposed to be the difference between test-1 and test-2?
It occurred to me right after I sent this that I hadn't updated my copy of the corpus, so I did so, and the differences dropped from 45 to 39. But the question above still stands for the remaining differences.
2. For @apasel422: The second test has chapter files in the golden files that aren't in the input files. That's obviously not correct, but I don't know how it got in that state or what the correct input files are, etc. What should that test look like?
Sorry about that; without the test infra accounting for extraneous files, it was easy for these to get added by mistake. The input chapter files should remain unchanged in the output, and there should simply be a new loi.xhtml file present in the expected output.
3. In short, what is supposed to be the difference between test-1 and test-2?
One test is supposed to cover generation of a completely new loi.xhtml file; the other is supposed to update an existing one in place.
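For what it's worth, the expectation described above (chapter files unchanged, plus a loi.xhtml in the expected output) could be checked mechanically so extraneous golden files don't slip in again. This is only a rough sketch: `check_golden` and the directory arguments are hypothetical names, not part of the existing test infra, and it assumes each test has one input directory and one golden directory with the same relative layout.

```python
import filecmp
from pathlib import Path


def check_golden(input_dir: Path, golden_dir: Path, generated: str = "loi.xhtml") -> list[str]:
    """Return a list of problems; an empty list means the golden directory looks consistent."""
    problems = []
    input_files = {p.relative_to(input_dir) for p in input_dir.rglob("*") if p.is_file()}
    golden_files = {p.relative_to(golden_dir) for p in golden_dir.rglob("*") if p.is_file()}

    # Golden files with no counterpart in the input, other than the generated loi file.
    for extra in sorted(golden_files - input_files):
        if extra.name != generated:
            problems.append(f"extraneous golden file: {extra.as_posix()}")

    # Input files that never made it into the golden directory.
    for missing in sorted(input_files - golden_files):
        problems.append(f"missing from golden: {missing.as_posix()}")

    # Every shared file except the loi itself must be byte-identical.
    for common in sorted(input_files & golden_files):
        if common.name == generated:
            continue  # the update-in-place test expects this file to change
        if not filecmp.cmp(input_dir / common, golden_dir / common, shallow=False):
            problems.append(f"changed in golden: {common.as_posix()}")

    # Both tests expect a loi file in the expected output.
    if not any(p.name == generated for p in golden_files):
        problems.append(f"golden output has no {generated}")

    return problems
```

The same check would cover both tests: in the new-file test the loi only exists on the golden side, and in the update-in-place test it exists on both sides and is simply skipped in the byte-for-byte comparison.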
No worries, that's what I needed. Thanks!
@acabal, in addition to question 1 above, I found a dozen or so books that have one or more figures with no id. Based on 7.8.1, I assume that is incorrect (they're all figures, not inline images). Do you want PRs to fix them?
Yes please, all figures should have IDs. This can be an easy lint check as well.
Now that I think about it further, I'm going to revise that rule a little to only require that <figure> has an id. <img> may appear in inline text and typically those are not interesting enough to be addressable via URL. But all <figure>s should be. I'm working on a lint test now.
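To illustrate the revised rule (every <figure> needs an id, inline <img> does not), here is a minimal sketch of what such a check might look like. It is not the actual lint implementation; `figures_without_id` is a hypothetical helper, and it assumes the file is well-formed XHTML in the standard XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def figures_without_id(xhtml: str) -> int:
    """Count <figure> elements with a missing or empty id attribute."""
    root = ET.fromstring(xhtml)
    # Only <figure> is checked; <img> elements in inline text are deliberately ignored.
    return sum(1 for fig in root.iter(f"{{{XHTML_NS}}}figure") if not fig.get("id"))
```

Any nonzero count for a file would be reported as a lint error under that rule.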
Is this still in progress or should we close it?
Sorry, you can close it. I updated everything at the time, I believe; I just checked again, and didn't find any figures without ids.