Multi-word vocabulary not accepted

Open imnotashrimp opened this issue 10 months ago • 1 comments

Check for existing issues

  • [x] Completed

Environment

  • OS: macOS Sequoia 15.1.1
  • Install method: Homebrew
  • Vale version: 3.9.6

Describe the bug / provide steps to reproduce it

I added the term "Azure CLI" to my accept.txt file. However, a conflicting Google rule says that "CLI" should be replaced with "command-line tool", so every instance of "Azure CLI" is flagged with a warning. I don't see anything in the Vale docs saying that multi-word terms aren't supported in vocabularies. I'm sure there's some small detail in how Vale handles this, but I can't find it.

I'm getting the same results for Azure\sCLI.

My config file is as follows:

StylesPath = styles
MinAlertLevel = suggestion
Packages = Google, proselint, MDX
Vocab = Docs, AWS
IgnoredScopes = code, tt, b, strong

[*.{md,mdx}] # Vale checks only these file extensions
BasedOnStyles = Vale, Google, Eon, AWS, proselint # Eon and AWS are my custom packages, which I use for vocabularies
BlockIgnores = (?s)(<Tabs .+?>), (?s).*(<CodeBlock .+?>.+?</CodeBlock>)
Google.Exclamation = warning
Google.OxfordComma = suggestion
Google.Colons = NO
Google.Passive = NO
Google.Quotes = NO
Google.We = NO

Steps to reproduce:

  1. Using the config above, in the AWS accept.txt, add the term Azure CLI.
  2. In an MD or MDX file, write a sentence that includes "Azure CLI".
  3. Run vale. You'll see a Google.WordList warning for using "CLI".
  4. In accept.txt, change Azure CLI to Azure\sCLI.
  5. Run vale again. You'll see the same warning.
  6. Now, in accept.txt, change Azure\sCLI to CLI.
  7. Run vale again. You'll now not get the warning.

I take this to mean the config file and vocabulary accept.txt files are fine, and that the issue is possibly that Vale doesn't recognize the space character. However, this isn't mentioned explicitly in the docs. I think the expected behavior is for Vale to recognize the multi-word Azure CLI or Azure\sCLI term in the accept.txt file.

imnotashrimp avatar Mar 12 '25 06:03 imnotashrimp

The reason for this, per my understanding, is that the substitution rule uses word boundaries, so it's always going to see CLI on its own before it sees your vocab exception.

The way I solved for this is twofold:

  1. I never included Google or MS packages directly in my style. I use vale sync to vendor them and copy over the files I want. Then I edit word lists appropriately.
  2. Where a "banned word" is also part of a product name, I replace the rule with a regex that uses a negative lookbehind:
extends: substitution
message: Consider using '%s' instead of '%s'
level: warning
ignorecase: true
swap:
  (?<!Azure\s)CLI: Command-line Interface

This largely works, although if you have two instances of CLI on the same line, where the first is "Azure CLI", Vale identifies the "CLI" part of Azure CLI as the column containing the error.
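To make the difference concrete, here is a minimal sketch that compares a plain whole-word match with the lookbehind pattern from the rule above. It uses Python's `re` module purely as a stand-in; Vale runs its own regex engine, so this only illustrates the pattern's logic, not Vale's internals. The `columns` helper and the sample sentence are hypothetical.

```python
import re

# Stand-in for a plain word-list entry: "CLI" as a whole word.
plain = re.compile(r"\bCLI\b")

# The lookbehind variant from the rule above: skip "CLI" when
# "Azure" plus one whitespace character comes directly before it.
guarded = re.compile(r"(?<!Azure\s)\bCLI\b")


def columns(pattern, line):
    """Return the 1-based start column of every match in the line."""
    return [m.start() + 1 for m in pattern.finditer(line)]


# Hypothetical sample line with two occurrences of "CLI".
line = "Install the Azure CLI, not some other CLI."

plain_cols = columns(plain, line)      # matches both occurrences
guarded_cols = columns(guarded, line)  # matches only the second one

print(plain_cols)
print(guarded_cols)
```

In this sketch the plain pattern reports both occurrences, while the lookbehind pattern reports only the later one, which is the behavior the custom rule relies on.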

napcs avatar Apr 17 '25 18:04 napcs