osu-wiki
osu-wiki copied to clipboard
Fix markdown output of remark and run remark over existing content
2022 edit(by clayton): this issue was originally tracking something more like #2275, but is now tracking the status of remark-stringify producing correct output for osu-wiki. see my comments in this thread but leads on improving it, but they may be outdated. use #2275 to track any existing formatting issues or wanted CI checks.
There have been recent PRs attempting to fix markdown inconsistencies, mixed in with other manual changes that ends up being an absolute mess to review and just isn't how commits should be done.
I've added codacy which uses remarkjs for linting. It seems like a good starting point.
Next steps would be for contributors to check through the reported issues and help tweak the inspections / configuration file to match our expectation. For instance, we will need to disable the naked URL rule or see if we can train it to ignore the header content.
Once a configuration is decided on, it should be run on all articles in a single commit, then enforced via CI on pull requests (already enabled but not required for merging just yet).
Next steps would be for contributors to check through the reported issues and help tweak the inspections / configuration file to match our expectation.
Will do.
Are we currently using the markdown-style-guide
preset for remark-lint? Also could we see the configuration file, would have in making further changes.
You can just edit the settings in codacy, I think.
remarkjs/codacy has started to become a pain due to presets being broken. From my investigation it seems that the overrides of the present-lint-markdown-style-guide
preset just plainly don't work at all, likely due to remarkjs/remark-lint#165.
For example, when I added a different line length limit of 40 to the news subconfig, codacy would start to randomly report 40 on some lines and 80 on others (80 being the limit from the preset itself). Locally where I installed rules with npm install -g
(which is apparently what shouldn't be done) I got similar behaviour where I'd get both 40 and 80 reported on some lines. I figure codacy just randomly clips one of them.
I can't and aggressively don't want to try to see if I can go and fix remarkjs, so instead I had a go at setting up markdownlint + github actions on a fork (not without accidentally screwing up and opening a PR to upstream instead of my own fork, gotta love it). I got it to a point where it's working - as can be seen here - but the workflow definition is... somewhat arcane and not as maintainable perhaps? The UX is kind of a downgrade too - a plain text list of lines and errors is less friendly than codacy's nice file view.
The possible upside to actions is however that I might just be able to get markdownlint-cli's -f
flag that auto-fixes basic errors going that way - there are already other github actions out that allow for automatic commits inside of workflow runs. I figured I'd leave this wall o' text first and wait for feedback before going down this path though.
(Oh, and another nice (?) thing is that due to workflows being built in everyone will automatically get CI on their own forks.)
Future goals if I continue with this:
- auto-fixes with
-f
- early-exit run if no markdown files were touched (so don't even touch
npm
) - partial clone & sparse checkout of tree (to avoid heavy fetches of image assets)
is the issue only with the news config?
It's not, it also happens with wiki articles (example - see 80 char warnings in there). As far as I can tell (which is not far as codacy is a black box) it's the same as when I wrongly globally installed remark and its linter rules, and the actual issue boils down to the base preset not being affected by the overrides that come after it.
As a side note I also have a working github actions setup with a proper remark installation and I'm leaning towards that instead as the output is a little bit nicer. Auto-fixes are a no-go unfortunately due to implementation foibles (workflows running on PRs from forks have read-only perms to avoid leaking secrets).
Seems like a feasible solution. Have you looked into getting inline output? I've seen other workflows which can add the errors inline in the source to make it easier to understand.
From what I've seen so far remark will print the files analysed to stdout, but unfortunately it seems not in the way that would be useful here (just spews file contents first and then the linter warnings). I'll poke around some more to see if having that is achievable.
bdach's ci is working well enough to drop codacy @peppy , codacy's just causing confusion with newer contributors
new todos, i'd say only the first two are important for now:
- [ ] use github to make check comments for errors
- [ ] only fail when new errors come from remark, instead of when any errors are in changed files
- [ ] make remark-lint plugins for rules we have in ASC that aren't checked yet
- [ ] see if auto-fixing would be easily possible for individual rules, and if not, make remark-stringify plugins to get the overall output right
- [ ] (if we need rules that check images) see about alternatives to sparse checkout that let us get basic metadata for blobs
the checking-for-new-errors is going to require use of github actions assets, i think (maybe we should be publishing the analysis results for each master build and then diffing).
i'll turn off codacy now
I think I can make the remarkjs plugins for everyone, I also suggest we not only limit ourselves in the confines of GitHub, but also take this linting prowess to contributor's desktop so they catch the problem early on before publishing.
that's how it's been already, npx remark --no-stdout [files/folders...]
For reporting only new errors it seems remark has a thing called vfile - if we persisted that (just the messages, presumably) then we could exclude the pre-existing errors. It's probably easier said than done though.
I'm willing to help with development of custom tools for this, even though it's js.
Starting out development of this in a repo, will disclose details in a few.
Going to extend more of the original PR with local linting powers ( #3533 ). It seems its a very bad idea to run remark on news/
so we should skip that instead.
news should be handled with a local config override as codacy was setup to do, if possible.
Let me see what I can do in the Remark side.
see if auto-fixing would be easily possible for individual rules, and if not, make remark-stringify plugins to get the overall output right
it looks like the answer's gonna be no to the first part, so here's more looking into stringify after #3775
- [ ] there are a few fake-lists throwing off the parser, with formats like
1\. item
or[1] item
(should be made into actual lists) - [ ] stringify prefers to use
'
surrounding link titles if they contain double quotes. rn our lint rules say to always use double quotes, and escape them if they're in the title - [x] headers missing a space like
#Title
are treated as a paragraph, so that ends up as\#Title
. space should be added - [ ] a bunch of hyphens are escaped by stringify for no reason
- [x] brought up an unrelated problem where some news posts are using a hyphen instead of em dash for author
- [ ] matched closing brackets don't always get escapes from stringify, we probably want that but it doesn't break anything to have them removed
- [ ] lines that end paragraphs and also have a hard break should have the hard break removed (mostly in tumblr news)
- [ ] table alignment rows are outputted with
:-
for left and-:
for right, where we use:--
and--:
. I think we should just change how we do it though, need lint rules to comply
should be pretty close to done after that.
I found a VS Code extension that integrates with .remarkrc.js
but not quite well. I'll see if I can redo that so it actually honors our configuration.
skimmed through the issue, and it seems that it can be closed (especially because we can run the check now locally). also, I have an impression that most of the style issues with existing articles were fixed when the articles were touched (in order to get them merged).
re-opening this one to track the status of remark-stringify matching as best as possible with remark-lint, I don't think that was ever finished...