Alex Cabal
No rush, take your time!
OK, I think we can do this; but we will of course have to update the corpus. That might be tricky... we can assume any string of 4 digits is...
I think it's consistent with the project goals so it's worthwhile. Updating the corpus will be work. I also wonder if we can add a lint check for this that...
Where are we at with this proposal?
Untested pseudo-xpath: `/html/body//p/text()[re:test(., '\b[0-9]{3,4}\s$') and following-sibling::*[1][name() = 'abbr' and re:test(., '^BCE?$')]]` (XPath positions are 1-based, so the first following sibling is `*[1]`.) This should match text nodes in paragraphs that contain BCE years, so invert that or something for the lint check so that it...
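For anyone who wants to try the idea without an XPath engine that supports `re:test`, here is a rough sketch of the same check using only Python's standard library. It is not the toolset's implementation: `find_era_years` and `SAMPLE` are hypothetical names, and it assumes the input is well-formed XHTML in the usual XHTML namespace. It walks each `<p>` and looks at the text immediately before every `<abbr>` whose content is `BC` or `BCE`.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: elements live in the standard XHTML namespace.
XHTML = "{http://www.w3.org/1999/xhtml}"

# Same two patterns as in the pseudo-xpath above.
YEAR_AT_END = re.compile(r"\b[0-9]{3,4}\s$")
ERA_ABBR = re.compile(r"^BCE?$")


def find_era_years(xhtml):
    """Return each text run that ends in a 3-4 digit year and is
    immediately followed by <abbr>BC</abbr> or <abbr>BCE</abbr>."""
    root = ET.fromstring(xhtml)
    body = root.find(f"{XHTML}body")
    hits = []
    if body is None:
        return hits
    for p in body.iter(f"{XHTML}p"):
        # Text before the first child element of the paragraph.
        preceding = p.text or ""
        for child in p:
            abbr_text = "".join(child.itertext())
            is_era = child.tag == f"{XHTML}abbr" and ERA_ABBR.search(abbr_text)
            if is_era and YEAR_AT_END.search(preceding):
                hits.append(preceding)
            # Text between this child and the next one.
            preceding = child.tail or ""
    return hits


# Hypothetical sample input, for illustration only.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>Rome was founded in 753 <abbr>BC</abbr>, they say.</p>"
    "<p>He was born in 1850 and died in 1910.</p>"
    "</body></html>"
)

if __name__ == "__main__":
    for text in find_era_years(SAMPLE):
        print(repr(text))
```

A lint check would presumably want the inverse (four-digit strings that are not followed by an era abbreviation), which could be built on the same walk by testing every text run rather than only those before an `<abbr>`.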
Also, we should use xpath for the lint check and not a regex regardless, because currently the regex will emit a lint error if there is for example `` in...
Can you try getting an xpath working? That will let us output a specific line that is the problem. Also, there is a draft PR that will add line numbers...
Great, thanks! Now the big question is, how do we update the corpus? Can you take care of that?
Thinking about this further, we have another big update in the next version of the toolset where we remove `url:` from the SE identifier. Instead of rebuilding the corpus twice...
OK perfect, thanks. I think I've updated everything on my local copy. Don't push any more changes just now. I'm waiting on a merge conflict resolution for a different PR...