
Error on run otel-collector-migrate-init: jobs.batch "signoz-schema-migrator" not found

Open voriol opened this issue 1 year ago • 11 comments

Error on run otel-collector-migrate-init:

signoz-otel-collector-migrate-init Error from server (NotFound): jobs.batch "signoz-schema-migrator" not found

There are no further logs; the pod crashes with this message.

Helm chart version: 0.31.1
signoz/signoz-otel-collector version: 0.88.3

voriol avatar Dec 15 '23 12:12 voriol

@prashant-shahi

srikanthccv avatar Dec 15 '23 13:12 srikanthccv

@prashant-shahi I think I've managed to reproduce the issue:

I was not able to reproduce the issue when I uninstalled a previously successfully installed chart and then reinstalled it.

I did manage to reproduce it when I uninstalled the chart, dropped the namespace along with all the resources the chart was not able to delete (the schema-migrator job was also not deleted by the Helm chart), and then installed the chart from scratch.

The workaround that works for me now is to create the schema-migrator job during the installation of the Helm chart.

I assume it is somehow related to the mechanics of pre-install vs. post-install Helm hooks. In my opinion, Helm does not create this job during the installation phase because it is defined as a post-install hook, but at the same time otel-collector depends on this job and cannot start up without it. Because of that, the Helm chart fails to install, the post-install phase is never reached, and so the job is never created.
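
To make the suspected ordering problem concrete, here is a minimal Go sketch that only simulates the sequence described above; it does not call Helm or Kubernetes. It assumes that, when waiting is enabled, Helm blocks on pod readiness before running post-install hooks. The names `simulateInstall` and `installResult` are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// installResult describes how a simulated install ended (hypothetical type).
type installResult struct {
	ReleaseFailed  bool // Helm gave up while waiting for pods
	HookJobCreated bool // the post-install hook created the migrator Job
	PodsReady      bool // the collector pods got past their init containers
}

// simulateInstall models the ordering suspected in this thread.
// Assumption: with wait enabled, Helm waits for the chart's pods to become
// ready BEFORE it runs post-install hooks.
func simulateInstall(wait bool) installResult {
	jobExists := false
	// The init container can only finish once the migrator Job exists.
	podsReady := func() bool { return jobExists }

	// Phase 1: regular chart resources are applied and the pods start.
	if wait && !podsReady() {
		// Helm blocks until its timeout and marks the release as failed,
		// so the post-install phase is never reached.
		return installResult{ReleaseFailed: true}
	}

	// Phase 2: post-install hooks run and create the migrator Job.
	jobExists = true

	// Kubernetes restarts the crashing pods, which now find the Job.
	return installResult{HookJobCreated: jobExists, PodsReady: podsReady()}
}

func main() {
	fmt.Printf("wait=true:  %+v\n", simulateInstall(true))
	fmt.Printf("wait=false: %+v\n", simulateInstall(false))
}
```

If this model is right, the release can only succeed when Helm does not wait for the pods, or when the Job already exists before the pods start.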

Volodymyr-Kuchinskyi avatar Dec 15 '23 17:12 Volodymyr-Kuchinskyi

@Vladimir-Kuchinskiy can you share versions of your Helm and signoz helm chart that is being used?

prashant-shahi avatar Dec 15 '23 20:12 prashant-shahi

@prashant-shahi I am using the latest Helm Terraform provider, 2.12.1 (https://registry.terraform.io/providers/hashicorp/helm/latest/docs). The SigNoz chart is version 0.31.1.

Volodymyr-Kuchinskyi avatar Dec 18 '23 22:12 Volodymyr-Kuchinskyi

I have the same issue. Chart version: 0.31.2, Helm v3.13.3.

Error from server (NotFound): jobs.batch "signoz-schema-migrator" not found

4nte avatar Dec 20 '23 01:12 4nte

Here is a temporary fix for the Helm Terraform provider: set wait to false. Thank you @prashant-shahi for pointing out the hooks.

resource "helm_release" "my_signoz" {
  name = "my-signoz"

  repository = "https://charts.signoz.io"
  chart      = "signoz"
  namespace  = "observability"

  wait = false
}

scorpionknifes avatar Jan 18 '24 12:01 scorpionknifes

I saw the same issue with the 0.34.3 -> 0.35.2 upgrade: the signoz-otel-collector-metrics and signoz-otel-collector pods failed to start because their init containers failed with this "job not found" error, so Helm never advanced to the post-install stage where this job is created.

I'm using Pulumi and also "solved" this by setting skipAwait: true, but this is not great overall, since it could mark the release as healthy when there's a genuine issue.

haimgel avatar Feb 18 '24 15:02 haimgel

Also seeing this when upgrading from chart 0.37.1 to 0.39.0. Rather than using wait = false, I opted to try @Volodymyr-Kuchinskyi's workaround. I created two Job resources: one called signoz-schema-migrator and another called signoz-schema-migrator-upgrade. This was sufficient to appease the two crashing pods, and the actual upgrade process still appeared to take place after the pods started successfully.

codekoala avatar Apr 17 '24 12:04 codekoala

I get this every time I try to change anything, using Pulumi:

❯ kubectl logs signoz-otel-collector-7876b4f447-6mx96 -c signoz-otel-collector-migrate-init
Error from server (NotFound): jobs.batch "signoz-schema-migrator-upgrade" not found

helm.NewRelease(ctx, "signoz", &helm.ReleaseArgs{
		WaitForJobs:     pulumi.Bool(false),
		Chart:           pulumi.String("signoz"),
		Version:         pulumi.String("0.44.0"),
		Name:            pulumi.String("signoz"),
		Namespace:       pulumi.String("signoz"),
		CreateNamespace: pulumi.Bool(true),
		RepositoryOpts: helm.RepositoryOptsArgs{
			Repo: pulumi.String("https://charts.signoz.io"),
		},
		Values: pulumi.Map{
			"otelCollectorMetrics": pulumi.Map{
				"enabled": pulumi.Bool(false),
			},

			"k8s-infra": pulumi.Map{
				"enabled": pulumi.Bool(false),
			},
		},
	}, opts...)

Every time I try to update a live installation with new Helm values, I get the error.

afreakk avatar Jun 21 '24 12:06 afreakk

I'm experiencing this issue as well using a helm install. It looks like it was caused when I messed up the first install, removed it using helm, then tried to reinstall.

wfhartford avatar Sep 04 '24 18:09 wfhartford