Check `copier.yaml` for hyphenated variables
Actual Situation
Jinja doesn't allow variable names to contain hyphens (-), but YAML does.
Without giving it much thought, I created templates that use hyphenated variable names, and I used the same names in copier.yaml. But Copier (as a whole) treated dataset-name as dataset - name, and I had a hard time catching and understanding what went wrong.
Desired Situation
What if Copier checked copier.yaml before generating a project and raised an error if any variable in copier.yaml contains a hyphen?
Well, I guess I'm in the minority; not many people are fans or heavy users of hyphens. IDK if it makes sense to solve this issue just for me. But I believe a "semantic check" of the config file is a valid and quite useful architectural step.
- https://github.com/apache/airflow/issues/8688
- https://stackoverflow.com/questions/52396669/ansible-templating-skips-string-after-a-dash
- https://github.com/ansible/ansible/issues/3907
All of the projects listed above use Jinja for templating, and in each case users accidentally used hyphens in variable names and were confused.
Proposed solution
Before generating the project, copier can loop through variables of copier.yaml and check if the name contains -, then raise an error.
Though, it shouldn't be just the ordinary Python exception, as it would halt the execution. Let's not act like the current Zig compiler...
You already have InvalidConfigFileError (https://github.com/copier-org/copier/blob/master/copier/errors.py#L31), though it is only used when parsing YAML files (https://github.com/copier-org/copier/blob/master/copier/template.py#L102).
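To make the proposal concrete, here is a minimal sketch of such a check. It assumes copier.yaml has already been parsed into a plain dict. `HyphenatedVariableError` and `check_hyphenated_variables` are hypothetical names used only for illustration, not part of Copier; a real implementation would presumably reuse or extend InvalidConfigFileError.

```python
class HyphenatedVariableError(ValueError):
    """Hypothetical error type, standing in for a Copier config error."""


def check_hyphenated_variables(config: dict) -> None:
    """Raise if any top-level key of the parsed config contains a hyphen."""
    bad = [key for key in config if isinstance(key, str) and "-" in key]
    if bad:
        names = ", ".join(sorted(bad))
        raise HyphenatedVariableError(
            f"Variable names must not contain hyphens: {names}. "
            "Jinja parses `a-b` as the subtraction `a - b`."
        )


# Stand-in for the parsed content of copier.yaml (assumed structure).
config = {"project_name": {"type": "str"}, "dataset-name": {"type": "str"}}

try:
    check_hyphenated_variables(config)
except HyphenatedVariableError as exc:
    # A CLI would print this message instead of showing a traceback.
    message = str(exc)
    print(message)
```

Catching the error at the call site, as in the last lines, is one way to report the problem without dumping a raw traceback on the user.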
I think the problem could be solved if we created Pydantic models for copier.yaml content with a regex validator for top-level keys without a leading underscore. Then, Pydantic would raise a validation error when using malformed question variables (and catch other errors as well). I've been wanting to do this for a long time, also because then we could generate a JSON schema for copier.yaml, but Copier's question specifications with templated fields and implicit types (inferable from default), configurable Jinja settings via _envops, etc. make Pydantic modeling not quite straightforward. But I fully agree with your request for better validation and error reporting.
Another use case for the semantic check:
The config file has internal settings that start with _: https://copier.readthedocs.io/en/stable/configuring/#available-settings.
What if a user includes a variable like _abc? IDK what Copier does currently, but I see 2 ways of handling such variables:
- Disallow such variables. Check that if a variable starts with _, it must be one of https://copier.readthedocs.io/en/stable/configuring/#available-settings.
- Or allow _abc to be a variable, but warn users that typically only internal settings start with _.
I haven't tried it, but I believe Copier silently ignores any _-prefixed variables that aren't known settings. But a well-designed Pydantic model would catch those unknown settings. Perhaps I can find some time to revisit this topic. 🤞
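For illustration, the two options for _-prefixed keys could be sketched as follows. This is only a rough sketch and not Copier's actual behavior: `KNOWN_SETTINGS` is a deliberately incomplete sample of the documented settings, and `find_unknown_settings` and `check_settings` are hypothetical helper names.

```python
import warnings

# Illustrative subset only; the full list is in Copier's "available settings" docs.
KNOWN_SETTINGS = {"_envops", "_exclude", "_tasks", "_templates_suffix"}


def find_unknown_settings(config: dict, known=KNOWN_SETTINGS) -> list:
    """Return the _-prefixed top-level keys that are not known settings."""
    return sorted(
        key
        for key in config
        if isinstance(key, str) and key.startswith("_") and key not in known
    )


def check_settings(config: dict, strict: bool = False) -> list:
    """Option 1 (strict=True): reject unknown keys. Option 2: only warn."""
    unknown = find_unknown_settings(config)
    if unknown and strict:
        raise ValueError(f"Unknown settings: {', '.join(unknown)}")
    for key in unknown:
        warnings.warn(f"{key!r} starts with '_' but is not a known setting")
    return unknown


# Stand-in for a parsed copier.yaml (assumed structure).
config = {"_envops": {}, "_abc": "value", "project_name": {"type": "str"}}
unknown = find_unknown_settings(config)
print(unknown)
```

With `strict=False`, the keys are still reported but generation could continue, which matches the "allow but warn" option.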
I guess this means non-Python people are using Copier, which is good, since we've made significant efforts over the years to make it as non-Python-specific as possible.
@sisp if you find the Pydantic stuff too complex, we can do a simple test and check for this use case. Or just drop a line in the docs.