bug: failure to generate causes empty files

Open cxdy opened this issue 3 months ago • 1 comments

INFO[0000] Generating from Prometheus spec               version=v0.13.1 window=30d
error: "generate" command failed: could not generate Prometheus format rules: could not generate prometheus rules: invalid SLO group: Key: 'SLOGroup.SLOs[0].SLI.Events.ErrorQuery' Error:Field validation for 'ErrorQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[0].SLI.Events.TotalQuery' Error:Field validation for 'TotalQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[1].SLI.Events.ErrorQuery' Error:Field validation for 'ErrorQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[1].SLI.Events.TotalQuery' Error:Field validation for 'TotalQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[2].SLI.Events.ErrorQuery' Error:Field validation for 'ErrorQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[2].SLI.Events.TotalQuery' Error:Field validation for 'TotalQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[3].SLI.Events.ErrorQuery' Error:Field validation for 'ErrorQuery' failed on the 'required' tag
Key: 'SLOGroup.SLOs[3].SLI.Events.TotalQuery' Error:Field validation for 'TotalQuery' failed on the 'required' tag
INFO[0000] SLI plugins loaded                            plugins=0 svc=storage.FileSLIPlugin version=v0.13.1 window=30d

We had an invalid Sloth spec file that failed to render, and it seems that when sloth generate hits validation errors like the ones above, it just bails completely.

We have a make target, make sloth, that runs sloth generate against multiple directories. The second it hit this file, it bailed out of that directory and proceeded to generate the other directories.

This also resulted in every "generated" file from sloth generate in that directory being emptied.

We should handle this error better so it doesn't wipe an entire directory. If it can't render a file due to errors, skip it and move on.

fwiw, we are still on https://github.com/linode-obs/sloth v0.13.1, but we haven't (to my knowledge) made any changes to this area.

cxdy • Sep 23 '25 01:09

make sloth btw:

sloth:
	@set -e
	sloth_version=$$(sloth version); \
	if [ "$$sloth_version" != "v0.13.1" ]; then \
		echo "Sloth missing or wrong version installed, need v0.13.1."; \
		echo "https://website/"; \
		exit 1; \
	fi; \
	for sloth_dir in $(SLOTH_INFRA_DIR) $(SLOTH_NETWORK_DIR) $(SLOTH_DEVCLOUD_DIR); do \
		output="../$$(basename $$sloth_dir)"; \
		if [ "$$sloth_dir" == "../sloth/infra" ]; then \
			sloth generate --default-slo-period=10d --slo-period-windows-path=$(SLOTH_WINDOWS_DIR) -i $$sloth_dir/somewhere/system1.yaml -o $$output/somewhere/system1_slo.yaml; \
			sloth generate --default-slo-period=30d --slo-period-windows-path=$(SLOTH_WINDOWS_DIR)/thing -i $$sloth_dir/place/thing/system2.yaml -o $$output/place/thing/system2_slo.yaml; \
		fi; \
		sloth generate -i $$sloth_dir  -o $$output --fs-include='.*\.yaml' --fs-exclude='.*(system1|system2)_slo.yaml'; \
	done
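
A possible stopgap on the Makefile side, until sloth itself skips invalid specs: generate each directory into a scratch directory and copy the result over the real output only when the command succeeds. This is a minimal sketch, not something sloth provides. `generate_dir_safely` is a hypothetical helper, and it assumes that a failed `sloth generate` exits non-zero and that `-o` accepts an existing, empty directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of sloth).
# Usage: generate_dir_safely OUTPUT_DIR CMD [ARGS...]
# Runs "CMD ARGS... -o SCRATCH_DIR" and copies SCRATCH_DIR into OUTPUT_DIR
# only if CMD exits 0. On failure, OUTPUT_DIR is left untouched.
# Note: plain POSIX sh has no "local", so out/tmp/status are global here.
generate_dir_safely() {
    out=$1
    shift
    tmp=$(mktemp -d) || return 1
    if "$@" -o "$tmp"; then
        # Success: overwrite existing files with the freshly generated ones.
        # Stale files that are no longer generated are NOT removed.
        mkdir -p "$out" && cp -R "$tmp"/. "$out"/
        status=$?
    else
        # Failure: keep the previous output and report which directory was skipped.
        status=$?
        echo "skipping $out: generator exited with status $status" >&2
    fi
    rm -rf "$tmp"
    return "$status"
}

# Example call, mirroring the flags used in the loop above (assumed usage):
#   generate_dir_safely "$output" sloth generate -i "$sloth_dir" \
#       --fs-include='.*\.yaml' --fs-exclude='.*(system1|system2)_slo.yaml' \
#       || failed=1
```

Recording the failure (`|| failed=1`) and exiting non-zero after the loop would keep the remaining directories generating while still failing the make target. Separately, `@set -e` on its own recipe line runs in its own shell unless `.ONESHELL` is in effect, so in the recipe above it does not apply to the loop that follows.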

Maybe our fault? tomorrow problem

cxdy • Sep 23 '25 01:09