YAML verification for module READMEs

Open lvwerra opened this issue 3 years ago • 3 comments

The Hub performs YAML verification when a repository is pushed. Since we push modules regularly but don't perform that verification on our end, the GitHub Action on main can fail even if the CI on the PR is green. It would be great to add a verification step to the tests so we can be sure the modules can be pushed without issue.

See #295 for example.

cc @mathemakitten @sashavor

lvwerra avatar Sep 21 '22 07:09 lvwerra

~~Good call. It seems like the Hub repo pings the /api/validate-yaml endpoint, which is a Gitaly hook (?)~~

mathemakitten avatar Sep 22 '22 02:09 mathemakitten

If we don't care about exactly matching the way the Hub does it, we can do it in pure Python with pyyaml:

import yaml

with open('README.md') as f_yaml:
    docs = yaml.safe_load_all(f_yaml)
    metadata = next(docs)  # raises yaml.YAMLError if the first YAML document is invalid

mathemakitten avatar Oct 03 '22 15:10 mathemakitten

Yes, that's what I had in mind. We probably need to extract the YAML part of the README first. This can be done with this regex.

Then we just need to iterate over all READMEs of metrics/measurements/comparisons and that should be it 😄
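A rough sketch of those two steps, in case it helps. The regex here is my own guess at a front-matter pattern (a block delimited by `---` lines at the top of the file), not necessarily the one linked above, and `validate_readme_yaml` / `find_invalid_readmes` are hypothetical helper names. It only checks that the block parses with pyyaml, so it may still accept things the Hub rejects.

```python
import re
from pathlib import Path

import yaml

# Assumed pattern: front matter is the block between two `---` lines at the very top.
FRONT_MATTER_RE = re.compile(
    r"\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\Z)", re.DOTALL
)


def validate_readme_yaml(text):
    """Return the parsed front matter, or None if no front-matter block is found.

    Raises yaml.YAMLError when the block is not valid YAML.
    """
    match = FRONT_MATTER_RE.match(text)
    if match is None:
        return None
    return yaml.safe_load(match.group(1))


def find_invalid_readmes(root, module_dirs=("metrics", "measurements", "comparisons")):
    """Collect (path, error message) pairs for READMEs whose front matter fails to parse."""
    failures = []
    for module_dir in module_dirs:
        # Assumes one sub-directory per module, each with its own README.md.
        for readme in sorted(Path(root, module_dir).glob("*/README.md")):
            try:
                validate_readme_yaml(readme.read_text(encoding="utf-8"))
            except yaml.YAMLError as err:
                failures.append((str(readme), str(err)))
    return failures
```

In a test this could be as simple as asserting that `find_invalid_readmes(".")` returns an empty list when run from the repo root.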

Alternatively, we could see if there is interest from the huggingface_hub side to add a validate_yaml method to the repo card class (or as a helper function). cc @Wauplin

lvwerra avatar Oct 04 '22 11:10 lvwerra