specref
Spelling
This PR corrects misspellings identified by the check-spelling action.
The misspellings have been reported at https://github.com/jsoref/specref/commit/011ce84b8fa4b7a0847fc9d9b5c2183e3c9032d7#commitcomment-50050420
The action reports that the changes in this PR would make it happy: https://github.com/jsoref/specref/commit/57d13d6ddabda58a0321ca388186ac8972ef070b
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Obviously, the easiest fixes are the ones in your code (as opposed to the dataset).
I have no idea how your pipeline works, and I'd like to treat it as a black box. If someone can push fixes upstream so that they don't come back, that's best.
I'm still trying to release the version of my tool that I'm using (it's close, but I have two more things I want to fix, one hit in production overnight, and one which is just polish).
Hi there, any chance of breaking this down into a few different pull requests, where we tackle what's not external references first?
Sure. (Sorry, I've been trying to release 0.0.18, which I half did, and then ran into bugs in Dependabot, which is so much fun.)
For reference, the above is just rebasing spelling onto spelling-code -- so, temporarily there may be more commits, but once spelling-code merges, there will be fewer.
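This Perl script split the authors:
#!/usr/bin/env perl -pi
# Rewrites files in place (-pi), one line at a time.
# $a is 1 while the current line is inside an "authors" array.
if ($a == 1) {
    if (/^\s*\]/) {
        # Closing bracket: the array is done.
        $a = 0; next;
    }
    # Protect the ", Ed." suffix so its comma is not treated as a separator.
    s/, Ed\./__ED/;
    if (/^(\s+").*,.*"/) {
        # Several names share one string: close it at each comma and
        # reopen it on a new line with the same indentation.
        $lead = $1;
        s/,\s*(.)/",\n$lead$1/g;
    }
    # Restore the protected suffix.
    s/__ED/, Ed./;
} elsif (!$a && /"authors": \[/) {
    $a = 1;
}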
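For readers who don't read Perl, here is a rough Python sketch of the same transformation. It is a port for illustration only, not something from the PR: the function name `split_authors` and the sample names are made up, and it assumes the input is pretty-printed JSON with one string per line inside each "authors" array.

```python
import re


def split_authors(lines):
    """Yield lines, splitting comma-joined names inside "authors" arrays.

    Hypothetical port of the Perl filter; `lines` is any iterable of
    text lines (with or without trailing newlines).
    """
    in_authors = False
    for line in lines:
        if in_authors:
            if re.match(r'\s*\]', line):
                # Closing bracket: leave the array untouched from here on.
                in_authors = False
            else:
                # Protect the first ", Ed." so its comma is not a separator.
                line = line.replace(', Ed.', '__ED', 1)
                m = re.match(r'(\s+").*,.*"', line)
                if m:
                    lead = m.group(1)
                    # Close the string at each comma and reopen it on a new
                    # line with the same indentation. A trailing comma at the
                    # end of the line is left alone because "." cannot match
                    # the newline (or the end of the string).
                    line = re.sub(
                        r',\s*(.)',
                        lambda s: '",\n' + lead + s.group(1),
                        line,
                    )
                line = line.replace('__ED', ', Ed.', 1)
        elif '"authors": [' in line:
            in_authors = True
        yield line


# Made-up sample entry, not taken from the real dataset.
sample = [
    '{\n',
    '    "authors": [\n',
    '        "Alice Smith, Ed., Bob Jones"\n',
    '    ],\n',
    '    "title": "A, B"\n',
    '}\n',
]
print(''.join(split_authors(sample)), end='')
```

On the sample, the single author string becomes two lines, "Alice Smith, Ed." and "Bob Jones", while the comma in the title is left alone because it sits outside the "authors" array.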
n.b. GitHub hates this commit.
@tobie / @marcoscaceres: would you want the split authors change standalone? (It comes with Eric Shepherd, because otherwise Shepherd would be split from Eric...)
Or the Internet Archive (identityproject.lse.ac.uk/mary.pdf) commit...?
Those are the only two that are easily splittable from this set. I mean, everything else could be split, but I don't think there's any particularly useful split beyond by file, and I doubt that helps much.
I'm OK with the size of this change. The changes are pretty straightforward.
@tobie?
> I'm OK with the size of this change. The changes are pretty straightforward.
Most of these changes will get overwritten in the next 60 minutes by the auto update.
We have to be more intentional if this is to be useful, which is why I suggested starting with just spelling mistakes outside of the data itself.
> Most of these changes will get overwritten in the next 60 minutes by the auto update.
Hehe, that was my next question... "Wait! Isn't all this automated?"
> We have to be more intentional if this is to be useful, which is why I suggested starting with just spelling mistakes outside of the data itself.
Agree.
Yeah, I was worried that most of the content was not primary source, which is why I wasn't particularly eager to do the author split.
Ok, so I guess we can salvage the changes to biblio.json and to legacy.json.
@jsoref so I'm still interested in merging changes to refs/biblio.json, refs/legacy.json, the README, the docs, and other non-automated parts of the dataset.
Is this something you would be willing to look into?
Sure. What do you want me to do?
Thanks for extracting the PRs, @jsoref!
It's trivial for me to do so, I just need direction :-)
Is there a strategy for the other things? Are there upstreams we can poke? Are there tools we can improve?
> Is there a strategy for the other things?
I guess if you can see whether those documents are on GitHub, then filing issues on the misspelled specs will get them fixed there.
> Are there upstreams we can poke?
As above... some are going to be difficult :( like the IETF ones.
> Are there tools we can improve?
Yes... this could run exclusively when a PR is made to refs/biblio.json, for instance. At least, we could check those.
> Yes... this could run exclusively when a PR is made to refs/biblio.json, for instance. At least, we could check those.
I'd be quite happy to offer the action. It's trivial to exclude files (e.g. the three files remaining in this PR) using excludes.txt, or to check only specific files using only.txt.
I think (though please verify!) you can also exclude via the action syntax itself: https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#excluding-paths
That doesn't do what you imagine. It controls when an action runs, not what it does. And for more fun, it doesn't provide that content to the workflow, so you can't actually use it to do work, you have to re-engineer it :-o
Fun! Thanks, @jsoref. I'll take a look at the corresponding PR soon.