mesa added vale to CLI

Summary

I added Vale, a prose linter to CLI as requested in "Add prose linter to CI #1572 " issue. So, each time a user pushes/ creates a PR, the CLI automatically runs, and its suggestions and warnings can be viewed. Currently, I have configured it to only check on .md files, but that can be changed according to use case by altering the " .vale.ini "file.

Screenshots

Currently after running on the existing repo, vale made a lot of recommendation and suggestions. I intentionally created a "bad" .md file titled TESTVALE.md just to test it, the results are attached below: It gave some errors and warnings on the existing files as well. I have attached a log to that below. 4_Run Vale.txt

Closes #1572.

Dec 24 '24 18:12 sanika-n

Thanks for the PR! I appreciate

It's currently over a thousands lines of code spread over 45 files, which is quite challenging to review. I also prefer not to add such amounts of code to our codebase. Are all the files critical for this functionality to function? Anything you might be able to cleanup?

Edit: Going quickly though the log, the warning Prefer 'personal digital assistant' over 'agent'. or (Agent) appears a lot. This is probably an LLM artifact.

Also, could Vale be integrated into our pre-commit config?

Edit 2: It seems Vale supports pre-commit!

https://github.com/errata-ai/vale/pull/558

Dec 24 '24 20:12 EwoutH

@EwoutH, I updated the version of vale and also suppressed vale to allow the use of "agent". I had actually cloned code from this repo to -> https://github.com/errata-ai/Microsoft/tree/master/Microsoft to serve as the style for vale and hence the large amount of files, I tried to find a work around, but the only option seems to be to use it as a submodule... I did integrate vale to pre-commit and it does fail as not all md files are following vale's rules.

However, the reason for failure in the checks above, (pre-commit.ci -pr) says that it failed because it wasn't able to find vale, I am not able to figure out why that is happening

Dec 25 '24 19:12 sanika-n

Maybe it’s easier to start with a minimal implementation, using only pre-commit, and then extend from there.

Dec 25 '24 20:12 EwoutH

@EwoutH, Hi, I think I figured it out it is working now:)

Dec 26 '24 07:12 sanika-n

With 45 files changed, this PR has become impossible to review. I see a lot of value in adding this, however, would it be possible to have only this one change? Next, in follow up pr's we can go throught the code base to fix all issues that are flagged after vale has been activated.

Dec 28 '24 08:12 quaquel

Sure, so from what I understand you would like me to cut down on the number of files right? I think one thing that I could is change the style from Microsoft to Write-Good (https://github.com/errata-ai/write-good/tree/master/write-good) this only has 9 yml files...Would that be fine?

Dec 28 '24 13:12 sanika-n

My bad, I had not carefully checked the PR. It is not 45 files changed, but 1 file changed (pre-commit) and 44 new files required for vale. Can you briefly explain the options and why you went with the Microsoft style?

Dec 28 '24 14:12 quaquel

Sure, so officially these are the styles that vale recognizes: (https://github.com/errata-ai/packages)

I chose Microsoft because it has a wider range of error catching abilities and can focus on a lot more writing elements compared to some other options. Write-good is alright too but it is more basic and won't be able to catch errors that Microsoft can.

Dec 28 '24 17:12 sanika-n

Finally had time to look closer at Vale. It seems they suggest to put the styles and .vale folders inside the .github folder. Can you ensure the layout follows their suggestions? Once that is done, I think this is good to go.

Dec 30 '24 20:12 quaquel

Done :thumbsup:

Dec 31 '24 06:12 sanika-n

I think this looks good, but it would be good if one of the other maintainers also takes a look.

Dec 31 '24 08:12 quaquel

Thanks for working on it.

When I review an PR one of the first things I do is judge the costs-benefit ratio of the PR. Each new feature adds complexity, is something that can break, and is something we need to update and maintain.

With this PR, I'm not yet fully convinced that ratio/proportionality is there.

Benefits: We already have Codespell for spelling/language, a code linter (ruff), a docstring linter (ruff-format) and a few smaller checkers for things like whitespaces. That seems to cover most stuff, I don't see the distinct value of Vale for us.
Costs: This PR adds a huge fileset that probably needs to be updated periodically. Can that be automated? A git submodule might be possible, but since it's the first, we need to update that.

So, while neither on their own are a problem, I currently don't see the ratio here. Either (personally) I need to be convinced of the benefits a prose linter adds, or the complexity needs to come way down (thing of one or at most two files that can be updated automatically periodically). Can we use the Microsoft or Google standard in some way without copying all their files to our repo?

Dec 31 '24 08:12 EwoutH

I completely understand, I did try to find alternatives to copying their repo, but I couldn't find much results... will surely update if I find a workaround

Dec 31 '24 11:12 sanika-n

I'm closing this PR as stale, but feel free to open a new one (or continue the discussion) if new insights pop up!

Nov 07 '25 10:11 EwoutH