superset icon indicating copy to clipboard operation
superset copied to clipboard

docs: Check markdown files for bad links using linkinator

Open rusackas opened this issue 1 year ago • 13 comments

SUMMARY

Adding a little CI magic to test markdown files for problem links using Linkinator. Hopefully this remedies some of the issues stemming from moving pages around.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • [ ] Has associated issue:
  • [ ] Required feature flags:
  • [ ] Changes UI
  • [ ] Includes DB Migration (follow approval process in SIP-59)
    • [ ] Migration is atomic, supports rollback & is backwards-compatible
    • [ ] Confirm DB migration upgrade and downgrade tested
    • [ ] Runtime estimates and downtime expectations provided
  • [ ] Introduces new feature or API
  • [ ] Removes existing feature or API

rusackas avatar May 10 '24 04:05 rusackas

@mistercrunch seems like the action isn't running :/

I'll investigate in the morning unless you see anything plainly obvious.

rusackas avatar May 10 '24 05:05 rusackas

Screenshot 2024-05-09 at 10 10 47 PM

visible here -> https://github.com/apache/superset/actions/runs/9027384874

mistercrunch avatar May 10 '24 05:05 mistercrunch

https://github.com/marketplace/actions/markdown-link-check

?

mistercrunch avatar May 10 '24 05:05 mistercrunch

I'll open another one with that other action. This one looked simpler when I opened it, but... I guess it's not so simple!

I opened an Infra ticket to see if there's a preferred practice for cases like this: https://issues.apache.org/jira/issues/?filter=-2

rusackas avatar May 10 '24 14:05 rusackas

Actually, that other Action is also on a 3rd party repo, so that won't be any easier.

rusackas avatar May 10 '24 14:05 rusackas

It'd be good to know where the full complex rule lives behind this truncated one -> Screenshot 2024-05-10 at 12 43 14 PM

mistercrunch avatar May 10 '24 19:05 mistercrunch

Actually the truncation in the screenshot above cuts right before fully pointing to https://github.com/gaurav-nelson/github-action-markdown-link-check

mistercrunch avatar May 10 '24 19:05 mistercrunch

Ok, opened a PR using the other action here: https://github.com/apache/superset/pull/28437

This other action, however, is in maintenance mode, pending some replacement with the author. It also doesn't support multiple file extensions in its config, so we have to run it twice. That's a bit silly, but hopefully it'll work.

rusackas avatar May 10 '24 22:05 rusackas

Linkinator has now been added to the ASF allow list... let's see if this is simpler/faster than the other PR :)

rusackas avatar May 13 '24 22:05 rusackas

Thanks for the approval, but this is only the tip of the iceberg :)

This found (if memory serves) something like 34 links that need to be addressed. These will need updates and/or redirects.

It does not find other ones that Google thinks exist, such as one of my perennial google searches leading to this: https://superset.apache.org/docs/contributing/hooks-and-linting/

I'm not sure (yet) if adding frontend (docusaurus) redirects will update search engines, or if .htaccess changes are needed. In either case, this (and others) are strange cases, since they no longer have a direct page equivalent, and will redirect to anchor tags.

Le sigh... much more to do here before this is over.

rusackas avatar May 14 '24 16:05 rusackas

I mean... it seems like it'll let us merge with the test failing as-is, but that seems like bad hygiene for other PRs that follow before I clean up my mess ;)

rusackas avatar May 14 '24 16:05 rusackas

Oh yes, we should sort out the stuff. htaccess fix is easy, lots of precedents -> https://github.com/apache/superset/blob/master/docs/static/.htaccess#L67

But better to have no links than broken links, could just wipe out the ones where finding new link isn't obvious

mistercrunch avatar May 15 '24 02:05 mistercrunch

It does not find other ones that Google thinks exist

yes, it can't know about links we used to have, for that I did a pass on a few pages worth of google results, but may be good to go deeper. Prevention could be good too, I think there might be ways to detect that links get delete, maybe some sort of auto-generated list of all links that docusaurus serves. We can resolve in a future PR

it seems like it'll let us merge with the test failing as-is

yes, could make this a required check if you want https://github.com/apache/superset/blob/master/.asf.yaml#L58, but if/when we do this on conditional ones we need to use the ./.github/actions/change-detector/

mistercrunch avatar May 15 '24 02:05 mistercrunch

Sorry for all the noise. A rebase went quite goofy...

I'll start chipping away at more of the failing links.

rusackas avatar Jul 29 '24 20:07 rusackas