Check `copier.yaml` for hyphenated variables

Open InAnYan opened this issue 10 months ago • 4 comments

Actual Situation

Jinja doesn't allow variable names to contain hyphens (-), but YAML does.

Without thinking much about it, I created templates that use hyphenated variable names, and I used the same names in copier.yaml. But copier (as a whole) treated dataset-name as dataset - name, and I had a hard time catching and understanding what went wrong.
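To illustrate why `dataset-name` ends up as `dataset - name`: the sketch below uses Python's own `ast` parser as a stand-in, on the assumption that Jinja's expression syntax behaves like Python's here. It is an analogy, not a call into Jinja or Copier.

```python
import ast

# Python's parser, used only as a stand-in for Jinja's expression parser.
tree = ast.parse("dataset-name", mode="eval")
expr = tree.body

# The hyphen is read as the subtraction operator between two separate names.
print(type(expr).__name__)          # BinOp
print(type(expr.op).__name__)       # Sub
print(expr.left.id, expr.right.id)  # dataset name
```

So the template ends up looking for two variables, `dataset` and `name`, neither of which was defined in copier.yaml.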

Desired Situation

What if copier could check copier.yaml before generating a project and raise an error if a variable in copier.yaml contains a hyphen?

Well, I guess I'm in the minority, not many people are fans or heavy users of hyphens. IDK if it makes sense to solve this issue just for me. But I believe a "semantic check" of a config file seems like a valid and quite useful architectural step.

Actually, this is a "common" (in some sense of common) user story:

All listed software uses Jinja for templating, and users accidentally used hyphens in variable names and were confused.

Proposed solution

Before generating the project, copier can loop through variables of copier.yaml and check if the name contains -, then raise an error.

Though, it shouldn't be just the ordinary Python exception, as it would halt the execution. Let's not act like the current Zig compiler...

You already have InvalidConfigFileError (https://github.com/copier-org/copier/blob/master/copier/errors.py#L31). Though it is only used when parsing YAML files (https://github.com/copier-org/copier/blob/master/copier/template.py#L102).
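A minimal sketch of the proposed check, assuming copier.yaml has already been parsed into a plain dict. `HyphenatedVariableError` and `check_no_hyphenated_variables` are hypothetical names made up for this example; Copier's real `InvalidConfigFileError` is not used because its constructor isn't shown here.

```python
class HyphenatedVariableError(ValueError):
    """Hypothetical error type, standing in for a Copier config error."""


def check_no_hyphenated_variables(config: dict) -> None:
    """Raise one error listing every top-level key that contains a hyphen."""
    offenders = [key for key in config if isinstance(key, str) and "-" in key]
    if offenders:
        names = ", ".join(repr(key) for key in offenders)
        raise HyphenatedVariableError(
            f"copier.yaml variables must not contain '-': {names}. "
            "Jinja would read such a name as a subtraction; use '_' instead."
        )


# Stand-in for the dict obtained by parsing copier.yaml.
parsed_config = {
    "project_name": {"type": "str"},
    "dataset-name": {"type": "str"},
}

try:
    check_no_hyphenated_variables(parsed_config)
except HyphenatedVariableError as exc:
    # A CLI would print this message instead of a raw traceback.
    message = str(exc)
    print(message)
```

Collecting all offenders before raising means the user sees every bad name in one run rather than fixing them one at a time.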

InAnYan avatar Mar 13 '25 11:03 InAnYan

I think the problem could be solved if we created Pydantic models for the copier.yaml content, with a regex validator for top-level keys without a leading underscore. Pydantic would then raise a validation error for malformed question variables (and catch other errors as well). I've been wanting to do this for a long time, also because we could then generate a JSON schema for copier.yaml. But Copier's question specifications, with templated fields, implicit types (inferable from default), configurable Jinja settings via _envops, etc., make Pydantic modeling not quite straightforward. Still, I fully agree with your request for better validation and error reporting.
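The key rule such a validator might enforce can be sketched with the standard library alone. This leaves out Pydantic itself, and the pattern is an assumption (a conservative ASCII identifier), not something Copier defines; `QUESTION_NAME` and `invalid_question_names` are hypothetical names.

```python
import re

# Assumed rule: a question name is a plain ASCII identifier,
# a form that is safe to use as a Jinja variable name.
QUESTION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def invalid_question_names(config: dict) -> list[str]:
    """Return top-level keys that are neither settings nor valid question names."""
    bad = []
    for key in config:
        key = str(key)
        if key.startswith("_"):
            continue  # settings such as _envops are out of scope for this check
        if not QUESTION_NAME.fullmatch(key):
            bad.append(key)
    return bad


config = {
    "_envops": {"keep_trailing_newline": True},
    "project_name": {"type": "str"},
    "dataset-name": {"type": "str"},
    "2nd_author": {"type": "str"},
}
bad_names = invalid_question_names(config)
print(bad_names)  # ['dataset-name', '2nd_author']
```

Compared with a plain hyphen check, a pattern like this also rejects names with spaces or a leading digit.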

sisp avatar Mar 13 '25 11:03 sisp

Another use case for the semantic check:

The config file has internal settings that start with _: https://copier.readthedocs.io/en/stable/configuring/#available-settings.

What if a user includes a variable like _abc? IDK what copier does currently, but I see 2 ways of handling such variables:

InAnYan avatar Mar 13 '25 12:03 InAnYan

I haven't tried it, but I believe Copier silently ignores any _-prefixed variables that aren't known settings. But a well-designed Pydantic model would catch those unknown settings. Perhaps I can find some time to revise this topic again. 🤞
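Catching unknown _-prefixed keys could look roughly like this, without Pydantic. `KNOWN_SETTINGS` is a deliberately partial, illustrative subset of Copier's settings, and `unknown_settings` is a hypothetical helper, not part of Copier.

```python
# Partial list for illustration only; the linked "available settings" docs
# are the authoritative source.
KNOWN_SETTINGS = {"_envops", "_exclude", "_subdirectory", "_tasks", "_templates_suffix"}


def unknown_settings(config: dict) -> list[str]:
    """Return _-prefixed top-level keys that are not recognised settings."""
    return sorted(
        key
        for key in config
        if isinstance(key, str) and key.startswith("_") and key not in KNOWN_SETTINGS
    )


config = {"_envops": {}, "_abc": "oops", "project_name": {"type": "str"}}
unknown = unknown_settings(config)
print(unknown)  # ['_abc']
```

Whether such keys should be an error or only a warning is the open design question raised above.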

sisp avatar Mar 13 '25 12:03 sisp

I guess this means non-Python people are using Copier, which is good since we've made significant efforts over the years to make it as non-Python-specific as possible.

@sisp if you find the Pydantic stuff too complex, we can do a simple test and check for this use case. Or just drop a line in the docs.

yajo avatar Mar 16 '25 20:03 yajo