Improve handling of empty command lists in Autoinstall YAML
This fixes an issue with an Autoinstall user-data YAML such as the following:
#cloud-config
autoinstall:
version: 1
late-commands:
# Only entry commented out
#- echo "Hello" > /etc/motd
This is being tracked in the following bug: https://bugs.launchpad.net/subiquity/+bug/2078658
A test case was also added that demostrates the crash and verifies the fix.
- Disabled some prints cluttering up test output
- Added test cases for empty command lists
- Don't attempt to enumerate over an empty command list
What's the motivation for allowing an empty command list section? I would prefer it if we instead improved the validation error messaging by improving the json schema for the CmdListController class to not allow an empty object. An empty section feels unintentional to me (you can always just comment out the whole block).
Unrelated to the code changes: Have you signed the Canonical CLA? The CLA bot says no but sometimes a manual nudge is necessary to get it happy.
@Chris-Peterson444 The reason is that the current behavior causes the installer to simply crash with an ugly backtrace that doesn't indicate anywhere in the output what line of the configuration file it is choking on or even that it was the autoinstall configuration. Even the validator does not tell you more than "NoneType is not an iterable" with no indication of what YAML key or line in the file was bad. That is a bad user experience that even a seasoned Linux veteran, which I consider myself, took more than an hour trying to figure out what was wrong. It was a simple fix, but this two-line addition to the logic makes it merely ignore those empty keys. Even if this isn't considered the correct fix, I would prefer a fix that at least indicated the line and/or key that is causing the validation to fail.
I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?
nitpick: we typically use conventional commits. e.g. "ci: add test case for empty cmdlist" or similar
As a "new to Canonical" contributor, I am happy to accept nitpicks. I can update that commit message in my next rebase.
If you do want this to be changed to a validation error, I can work on that. It will just take a little longer as I need to dig a little deeper into the guts of Subiquity.
Indeed, the error messaging isn't great yet and very non-obvious for some cases, such as this one. The validator needs some work and thank you for spending the time to track this down and make it better!
I would suggest something like this:
diff --git a/subiquity/server/controllers/cmdlist.py b/subiquity/server/controllers/cmdlist.py
index 2729da5c..4f77e7df 100644
--- a/subiquity/server/controllers/cmdlist.py
+++ b/subiquity/server/controllers/cmdlist.py
@@ -55,6 +55,7 @@ class CmdListController(NonInteractiveController):
"type": ["string", "array"],
"items": {"type": "string"},
},
+ "not": {},
}
builtin_cmds: Sequence[Command] = ()
cmds: Sequence[Command] = ()
which will at least result in providing the offending section. Using your CI autoinstall as an example:
$ ./scripts/validate-autoinstall-user-data.py --no-expect-cloudconfig autoinstall.yaml
Malformed autoinstall in 'error-commands' section
Failure: The provided autoinstall config failed validation
or
$ ./scripts/validate-autoinstall-user-data.py -vvv --no-expect-cloudconfig autoinstall.yaml
start: subiquity/load_autoinstall_config:
start: subiquity/load_autoinstall_config/read_config:
finish: subiquity/load_autoinstall_config/read_config: SUCCESS:
start: subiquity/Reporting/load_autoinstall_data:
finish: subiquity/Reporting/load_autoinstall_data: SUCCESS:
start: subiquity/Integrity/load_autoinstall_data:
finish: subiquity/Integrity/load_autoinstall_data: SUCCESS:
start: subiquity/Error/load_autoinstall_data:
finish: subiquity/Error/load_autoinstall_data: FAIL: Malformed autoinstall in 'error-commands' section
finish: subiquity/load_autoinstall_config: FAIL: Malformed autoinstall in 'error-commands' section
Malformed autoinstall in 'error-commands' section
Traceback (most recent call last):
File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 62, in validate_autoinstall
jsonschema.validate(ai_data, self.autoinstall_schema)
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 1080, in validate
raise error
jsonschema.exceptions.ValidationError: [] should not be valid under {}
Failed validating 'not' in schema:
{'items': {'items': {'type': 'string'}, 'type': ['string', 'array']},
'not': {},
'type': 'array'}
On instance:
[]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/cpete/software/subiquity/./scripts/validate-autoinstall-user-data.py", line 170, in verify_autoinstall
app.load_autoinstall_config(only_early=True, context=None)
File "/home/cpete/software/subiquity/scripts/../subiquitycore/context.py", line 159, in decorated_sync
return meth(self, **kw)
^^^^^^^^^^^^^^^^
File "/home/cpete/software/subiquity/scripts/../subiquity/server/server.py", line 750, in load_autoinstall_config
self.controllers.Error.setup_autoinstall()
File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 91, in setup_autoinstall
self.validate_autoinstall(ai_data)
File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 71, in validate_autoinstall
raise new_exception from original_exception
subiquity.server.autoinstall.AutoinstallValidationError: Malformed autoinstall in 'error-commands' section
Failure: The provided autoinstall config failed validation
There is perhaps a more elegant way to say "not empty" but this was my first approach.
I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?
I actually don't know to be honest. I suppose since you're contributing to subiquity you could put "subiquity" down, but I'm not sure this field matters for individual contributors.
I'll also add in the user experience that lead me to this issue.
While attempting to automate my install process, I had a small autoinstall file with a single entry in late-commands. Something like this:
autoinstall:
version: 1
...
late-commands:
- echo "Hello"
The exact command was different. I then made quite a number of changes including commenting out that command with a pound sign, but I did not comment out the key in front of it. After that, I found my installer was crashing and my initial suspicions were related to other changes I had made. It took me some time to realize that the crash was due to this file:
autoinstall:
version: 1
...
late-commands:
#- echo "Hello"
Thank you for sharing! This is exactly the kind of scenarios I want to avoid.
It took me some time to realize that the crash was due to this file
Yeah autoinstall errors can be elusive. We've done some work to make the runtime errors more obvious but the easiest ones to catch are ones like this where we can screen for them at validation time.
I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?
I actually don't know to be honest. I suppose since you're contributing to subiquity you could put "subiquity" down, but I'm not sure this field matters for individual contributors.
For future note, I think that contact comes from this list: https://canonical.com/projects/directory. I have sent Matthieu an email.
If you don't want to merge this until it's changed to a validation error, feel free to change it to a Draft PR.
If you don't want to merge this until it's changed to a validation error, feel free to change it to a Draft PR.
Sure, let's do that then. I think something along the lines I suggested would be the right direction. A change to the jsonschema along with some unittests to verify the behavior would be good. You can use the exact change I suggested, or something different if you prefer.
Some tips: validation happens via a call to a controller's validate_autoinstall method, which is defined here. It's not a perfect example, but this test may be worth a read for an idea of how to setup the test. Although the real tests should land in test_cmdlist.py. If you'd like some more concrete pointers or have any questions, please feel free to ask!