Error on run otel-collector-migrate-init: jobs.batch "signoz-schema-migrator" not found
Error on run otel-collector-migrate-init:
signoz-otel-collector-migrate-init Error from server (NotFound): jobs.batch "signoz-schema-migrator" not found
There are no more logs; the pod crashes with this message.
Helm chart version: 0.31.1
signoz/signoz-otel-collector version: 0.88.3
@prashant-shahi
Hello, I'm having the same issue and found this discussion: https://knowledgebase.signoz.io/t/issue-with-upgrading-helm-and-schema-migrator-failure/2Ka466
@prashant-shahi I think I've managed to reproduce the issue:
I could not reproduce it when I uninstalled a previously successful installation of the chart and then reinstalled it.
I could reproduce it when I uninstalled the chart, deleted the namespace along with all the resources the chart had failed to delete (the schema-migrator job was also not deleted by the Helm chart), and then installed the chart from scratch.
The workaround that works for me now is to create the schema-migrator job during the installation of the Helm chart.
I assume this is somehow related to the mechanics of pre-install vs post-install Helm hooks.
In my opinion, Helm does not create this job during the installation phase because it is defined as a post-install hook, but at the same time otel-collector depends on this job and cannot start up without it.
Because of that, the Helm chart fails to install, the post-install phase never runs, and so the job is never created.
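The ordering problem described above can be illustrated with a toy model. This is only a sketch of the reasoning in this comment, not Helm's actual implementation: the names `installOutcome` and `simulateInstall` are made up for illustration, and the model assumes that an install with waiting enabled runs post-install hooks only after the regular resources become ready.

```go
package main

import "fmt"

// installOutcome reports how far a simulated install got.
// Hypothetical type, used only for this illustration.
type installOutcome struct {
	MigratorJobCreated bool // did the post-install hook create the Job?
	CollectorReady     bool // did the collector get past its init container?
	Deadlocked         bool // did the wait give up before the hook could run?
}

// simulateInstall models a release whose collector init container blocks
// until the schema-migrator Job exists, while that Job is only created by
// a post-install hook. The wait argument mirrors the "wait" option
// discussed in this thread.
func simulateInstall(wait bool) installOutcome {
	var out installOutcome

	// Step 1: regular resources are applied. The collector pod starts, but
	// its init container cannot finish until the migrator Job exists.
	collectorReady := func() bool { return out.MigratorJobCreated }

	// Step 2: with wait enabled, the install blocks until resources are
	// ready. The collector never becomes ready, so the wait times out and
	// the post-install hook is never reached.
	if wait && !collectorReady() {
		out.Deadlocked = true
		return out
	}

	// Step 3: post-install hooks run and create the migrator Job.
	out.MigratorJobCreated = true

	// Step 4: the init container finds the Job on a later retry.
	out.CollectorReady = collectorReady()
	return out
}

func main() {
	fmt.Printf("wait=true:  %+v\n", simulateInstall(true))
	fmt.Printf("wait=false: %+v\n", simulateInstall(false))
}
```

In this model the install with waiting enabled never reaches the hook, while the install without waiting does, which matches the behaviour reported here.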
@Vladimir-Kuchinskiy can you share the versions of Helm and the SigNoz Helm chart that you are using?
@prashant-shahi I am using the latest Helm Terraform provider, 2.12.1:
https://registry.terraform.io/providers/hashicorp/helm/latest/docs
The signoz chart version is 0.31.1.
I have the same issue.
chart version: 0.31.2
helm v3.13.3
Error from server (NotFound): jobs.batch "signoz-schema-migrator" not found
Here is a temporary fix for the Helm Terraform provider: set wait to false. Thank you @prashant-shahi for pointing out the hooks.
resource "helm_release" "my_signoz" {
  name       = "my-signoz"
  repository = "https://charts.signoz.io"
  chart      = "signoz"
  namespace  = "observability"
  wait       = false
}
I saw the same issue with the 0.34.3 -> 0.35.2 upgrade: the signoz-otel-collector-metrics and signoz-otel-collector pods failed to start because their init containers failed with this "job not found" error. So Helm never advanced to the post-install stage where this job is created.
I'm using Pulumi and also "solved" this by setting skipAwait: true, but this is not great overall, since it could mark the release as healthy when there's a genuine issue.
I'm also seeing this when upgrading from chart 0.37.1 to 0.39.0. Rather than using wait = false, I opted to try @Volodymyr-Kuchinskyi's workaround. I created two Job resources: one called signoz-schema-migrator and another called signoz-schema-migrator-upgrade. This was sufficient to appease the two crashing pods, and the actual upgrade process still appeared to take place after the pods started successfully.
I get this every time I try to change anything, using Pulumi:
❯ kubectl logs signoz-otel-collector-7876b4f447-6mx96 -c signoz-otel-collector-migrate-init
Error from server (NotFound): jobs.batch "signoz-schema-migrator-upgrade" not found
helm.NewRelease(ctx, "signoz", &helm.ReleaseArgs{
	WaitForJobs:     pulumi.Bool(false),
	Chart:           pulumi.String("signoz"),
	Version:         pulumi.String("0.44.0"),
	Name:            pulumi.String("signoz"),
	Namespace:       pulumi.String("signoz"),
	CreateNamespace: pulumi.Bool(true),
	RepositoryOpts: helm.RepositoryOptsArgs{
		Repo: pulumi.String("https://charts.signoz.io"),
	},
	Values: pulumi.Map{
		"otelCollectorMetrics": pulumi.Map{
			"enabled": pulumi.Bool(false),
		},
		"k8s-infra": pulumi.Map{
			"enabled": pulumi.Bool(false),
		},
	},
}, opts...)
Every time I try to update a live installation with new Helm values, I get the error.
I'm experiencing this issue as well using a helm install. It looks like it was caused when I messed up the first install, removed it using helm, then tried to reinstall.