ethereum-org-website icon indicating copy to clipboard operation
ethereum-org-website copied to clipboard

Hook to check Language Attribute in a Markdown file

Open nikkhielseath opened this issue 3 years ago • 12 comments

Description

Create a script to verify whether a markdown file part of the src/content folder contains a non-empty lang attribute as a part of its front matter.

I have used git diff with some regex to see that the Added markdown file changes contains lang.

Caveats

  • This hook will work only for newly created markdown files
  • This hook will perform a rudimentary lang attribute check based on a regex match image

image

Testing

You will need to create some markdown files at random places to test out the same.

  1. Create a markdown file with an invalid or without lang attribute at a folder other than the src/content
  2. Stage the file
  3. Run npm run-script check-markdown-language-tag

It will not be detected by the script and hence, the hook will proceed.

  • Repeat the above steps but, this time move the markdown file to a place under src/content

The script should throw an error

Related Issue

  • related to #4253

nikkhielseath avatar Jul 02 '22 11:07 nikkhielseath

Gatsby Cloud Build Report

ethereum-org-website-dev

:tada: Your build was successful! See the Deploy preview here.

Build Details

View the build logs here.

:clock1: Build time: 35m

Performance

Lighthouse report

Metric Score
Performance :large_orange_diamond: 19
Accessibility :large_orange_diamond: 89
Best Practices :green_heart: 100
SEO :green_heart: 92

:link: View full report

gatsby-cloud[bot] avatar Jul 02 '22 11:07 gatsby-cloud[bot]

@minimalsm

Thank you for assigning this issue. It was fun to learn about Node and create a script. :)

nikkhielseath avatar Jul 02 '22 12:07 nikkhielseath

Hey @SNikhill, thanks again for working on this.

On testing, I am unable to get the hook to throw an error. I tried adding random characters (see screenshot below) and also completely deleting the lang prop from the markdown (in src/content/bridges/index.md) but the checks run green. Am I using it as intended or is there a problem?

Screenshot 2022-07-11 at 14 47 10

minimalsm avatar Jul 11 '22 13:07 minimalsm

Awesome, thanks for this PR!

This makes me think I wonder what we can generalize here in terms of validating markdown files. For instance, every markdown page should (at least) have:

  • title
  • description
  • lang

I also wonder if we could use our schema to help validate this. Curious what you think @pettinarip?

samajammin avatar Jul 11 '22 17:07 samajammin

@minimalsm Thank you for catching that bug with the numbers. The Regex has been fixed to capture just letters.

Make sure that you stage the markdown file after you make changes to it.

nikkhielseath avatar Jul 11 '22 18:07 nikkhielseath

@samajammin

That is a good suggestion and I believe it can be done but, I am not sure if git diff would be the right approach for it.

nikkhielseath avatar Jul 11 '22 18:07 nikkhielseath

@minimalsm

You are right that there is a caveat that if lang for an existing file is removed, no error is thrown.

diff with -G seems to match even if a line with a matching regex pattern was removed (which is basically what is happening in the test case).

I shall create a fix and let you know.

nikkhielseath avatar Jul 11 '22 18:07 nikkhielseath

Hey @SNikhill thanks for the PR!

I had a similar experience where the script is not throwing any error when I remove the lang prop and/or set something like lang: asdasdasd.

For now, I think this is a practical solution to the problem.


However, I think a more robust one would be to dig into using a linter, like ESlint. Would be ideal to have a custom rule which checks for us all these properties.

  • We could enable Gatbsy + ESlint: https://www.gatsbyjs.com/docs/how-to/custom-configuration/eslint/
  • Use something like this -> eslint plugin for md: https://github.com/mdx-js/eslint-mdx/tree/master/packages/eslint-plugin-mdx
  • And figure out how to set a custom rule to validate frontmatter properties.

pettinarip avatar Jul 11 '22 19:07 pettinarip

Hey @SNikhill thanks for the PR!

I had a similar experience where the script is not throwing any error when I remove the lang prop and/or set something like lang: asdasdasd.

For now, I think this is a practical solution to the problem.

However, I think a more robust one would be to dig into using a linter, like ESlint. Would be ideal to have a custom rule which checks for us all these properties.

* We could enable Gatbsy + ESlint: https://www.gatsbyjs.com/docs/how-to/custom-configuration/eslint/

* Use something like this -> eslint plugin for md: https://github.com/mdx-js/eslint-mdx/tree/master/packages/eslint-plugin-mdx

* And figure out how to set a custom rule to validate frontmatter properties.

I agree, that a custom lint rule would definitely be a more robust solution.

I shall update you as I improve the hook and possibly create the rule.

nikkhielseath avatar Jul 16 '22 10:07 nikkhielseath

@pettinarip and @minimalsm

I wanted to try out something and have modified the hook.

  • It will now only consider newly created markdown files.

My previous test scenario was flawed. I failed to consider what would happen if the git diff command produced no output.

Now the hooks will check for the empty output case and list the progress and the files found.


Not the best hook, but fine for a rudimentary lang attribute check.

NB: Let me know if I can work on creating a robust markdown checker using the provided approach or feel free to suggest a package for parsing the frontmatter from a markdown file.

nikkhielseath avatar Jul 16 '22 11:07 nikkhielseath

As discussed @SNikhill, marking this as blocked until #7088 is implemented to see if we can use one of those functions as an export to run a commit hook here.

minimalsm avatar Aug 01 '22 17:08 minimalsm

Understood @minimalsm

I shall keep an eye on the other PR and I shall make the changes as it is reviewed and merged. :)

nikkhielseath avatar Aug 05 '22 19:08 nikkhielseath

As discussed, we no longer really have this issue due to the markdown checker being run automatically on crowdin-import. Closing this out for now but I appreciate your work here as it massively informed the changes we made to improve the productivity of this workflow.

minimalsm avatar Sep 05 '22 12:09 minimalsm