"SIGSEGV: segmentation violation" while trying to deploy Elemental via fleet

Open e-minguez opened this issue 1 year ago • 7 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

When trying to deploy Elemental via Fleet, the GitRepo fails with:

$ kubectl get gitrepo elemental -n fleet-local -o yaml
...
  - lastUpdateTime: "2023-06-16T15:43:03Z"
    message: "panic: runtime error: invalid memory address or nil pointer dereference\n[signal
      SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x155764b]\n\ngoroutine
      52 [running]:\nhelm.sh/helm/v3/pkg/registry.(*Client).Tags(0x0, {0xc0000a0186?,
      0xc0004753f8?})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/registry/client.go:602
      +0x12b\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).getOciURI(0xc000475910,
      {0xc0000a0180, 0x60}, {0x0, 0x0}, 0xc0008377a0)\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:154
      +0x129\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).ResolveChartVersion(0xc000475910,
      {0xc0000a0180, 0x60}, {0x0, 0x0})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:199
      +0x13df\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).DownloadTo(0xc000475910,
      {0xc0000a0180, 0x60}, {0x0?, 0x20063a0?}, {0xc00083b338, 0x14})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:90
      +0x5b\ngithub.com/rancher/fleet/pkg/bundlereader.downloadOCIChart({0xc0000a0180,
      0x60}, {0x0, 0x0}, {0xc00083b338, 0x14}, {{0x0, 0x0}, {0x0, 0x0}, ...})\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:192
      +0x3d6\ngithub.com/rancher/fleet/pkg/bundlereader.getContent({0x27d1ae0, 0xc00083cc00},
      {0xc0006f2270, 0x29}, {0xc0000a0180, 0x60}, {0x0, 0x0}, {{0x0, 0x0}, ...})\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:74
      +0x1b2\ngithub.com/rancher/fleet/pkg/bundlereader.loadDirectory({0x27d1ae0?,
      0xc00083cc00?}, 0x0, {0xc000510190, 0x47}, {0xc0006f2270?, 0x0?}, {0xc0000a0180?,
      0x0?}, {0x0, ...}, ...)\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:29
      +0x9f\ngithub.com/rancher/fleet/pkg/bundlereader.loadDirectories.func1()\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/resources.go:216
      +0x126\ngolang.org/x/sync/errgroup.(*Group).Go.func1()\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75
      +0x64\ncreated by golang.org/x/sync/errgroup.(*Group).Go\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:72
      +0xa5\n"
    reason: Stalled
    status: "True"
    type: Stalled
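
To make the escaped trace easier to read, here is a minimal sketch that splits a Go panic message into frames and flags pointer-receiver methods called with a `0x0` receiver. The helper names `parse_go_frames` and `has_nil_receiver` are hypothetical, and `SAMPLE` is an abbreviated copy of two frames from the message above with the first source path shortened to `...`. In this trace the first frame is `(*Client).Tags(0x0, ...)`, which suggests `Tags` was called on a nil `*Client`; that reading is an interpretation of the trace, not a confirmed root cause.

```python
import re

# Abbreviated from the Stalled message above; the first source path is shortened.
SAMPLE = (
    "panic: runtime error: invalid memory address or nil pointer dereference\n"
    "[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x155764b]\n"
    "\n"
    "goroutine 52 [running]:\n"
    "helm.sh/helm/v3/pkg/registry.(*Client).Tags(0x0, {0xc0000a0186?, 0xc0004753f8?})\n"
    "\t.../pkg/registry/client.go:602 +0x12b\n"
    "github.com/rancher/fleet/pkg/bundlereader.downloadOCIChart({0xc0000a0180, 0x60}, "
    "{0x0, 0x0}, {0xc00083b338, 0x14}, {{0x0, 0x0}, {0x0, 0x0}, ...})\n"
    "\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:192 +0x3d6\n"
)


def parse_go_frames(message):
    """Return (call, location) pairs from a Go panic message."""
    lines = message.splitlines()
    frames = []
    for call, location in zip(lines, lines[1:]):
        # In a Go traceback, the tab-indented source location follows the call line.
        if location.startswith("\t") and not call.startswith("\t"):
            frames.append((call.strip(), location.strip()))
    return frames


def has_nil_receiver(call):
    """True if a pointer-receiver method shows 0x0 as its first argument word."""
    return re.search(r"\(\*\w+\)\.\w+\(0x0[,)]", call) is not None


if __name__ == "__main__":
    for call, location in parse_go_frames(SAMPLE):
        marker = "  <-- nil receiver" if has_nil_receiver(call) else ""
        print(f"{call}\n    at {location}{marker}")
```

The full message can be fed to `parse_go_frames` once the YAML or JSON string has been decoded, so that `\n` and `\t` are real newlines and tabs.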

The GitRepo object is just:

kind: GitRepo
apiVersion: fleet.cattle.io/v1alpha1
metadata:
  name: elemental
  namespace: fleet-local
spec:
  repo: https://github.com/suse-edge/misc.git
  branch: elemental-fleet
  paths:
  - fleet-examples/fleets/elemental/*

The repo contains the fleet.yaml definitions at https://github.com/suse-edge/misc/tree/elemental-fleet/fleet-examples/fleets/elemental, including:

  • elemental-operator/fleet.yaml
defaultNamespace: cattle-elemental-system
helm:
  releaseName: elemental-operator
  chart: "oci://registry.opensuse.org/isv/rancher/elemental/stable/charts/rancher/elemental-operator-chart"
  repo: ""
  values: {}
  • rancher-ui-plugin/fleet.yaml
defaultNamespace: cattle-ui-plugin-system
helm:
  releaseName: elemental-ui-plugin
  chart: "elemental"
  repo: "https://github.com/rancher/ui-plugin-charts/"
  timeoutSeconds: 0
  values: {}
  • ui-plugin-operator-crd/fleet.yaml
defaultNamespace: cattle-ui-plugin-system
helm:
  releaseName: ui-plugin-operator-crd
  chart: "ui-plugin-operator-crd"
  repo: "https://charts.rancher.io/"
  timeoutSeconds: 0
  values: {}
  • ui-plugin-operator/fleet.yaml
defaultNamespace: cattle-ui-plugin-system
helm:
  releaseName: ui-plugin-operator
  chart: "ui-plugin-operator"
  repo: "https://charts.rancher.io/"
  timeoutSeconds: 0
  values: {}
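
Of the four bundles above, only one references its chart with an `oci://` URL, and the stack trace passes through `downloadOCIChart`. A small sketch of that check follows, assuming the `oci://` prefix on `helm.chart` is what selects the OCI download path. The `BUNDLES` dict simply restates the `chart` and `repo` values from the fleet.yaml files, and `is_oci_chart` and `oci_bundles` are hypothetical helper names.

```python
# chart/repo values copied from the fleet.yaml files listed above.
BUNDLES = {
    "elemental-operator": {
        "chart": "oci://registry.opensuse.org/isv/rancher/elemental/stable/charts/rancher/elemental-operator-chart",
        "repo": "",
    },
    "rancher-ui-plugin": {
        "chart": "elemental",
        "repo": "https://github.com/rancher/ui-plugin-charts/",
    },
    "ui-plugin-operator-crd": {
        "chart": "ui-plugin-operator-crd",
        "repo": "https://charts.rancher.io/",
    },
    "ui-plugin-operator": {
        "chart": "ui-plugin-operator",
        "repo": "https://charts.rancher.io/",
    },
}


def is_oci_chart(helm):
    """True if the helm section points at an OCI registry (assumed: oci:// prefix)."""
    return str(helm.get("chart", "")).startswith("oci://")


def oci_bundles(bundles):
    """Return the sorted names of bundles whose chart is an OCI reference."""
    return sorted(name for name, helm in bundles.items() if is_oci_chart(helm))


if __name__ == "__main__":
    print(oci_bundles(BUNDLES))
```

If the assumption holds, `elemental-operator` is the only bundle that reaches the OCI code path, which would make it the likely source of the panic.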

Expected Behavior

Fleet deploys the Elemental operator and the other components successfully.

Steps To Reproduce

  • Deploy K3s (v1.25.9+k3s1) on top of SLE Micro 5.4 x86_64
master-0:~ # cat /etc/os-release
NAME="SLE Micro"
VERSION="5.4"
VERSION_ID="5.4"
PRETTY_NAME="SUSE Linux Enterprise Micro 5.4"
ID="sle-micro"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sle-micro:5.4"
master-0:~ # k3s --version
k3s version v1.25.9+k3s1 (7502ce6a)
go version go1.19.8
  • Deploy Rancher 2.7.4 on top:
master-0:~ # helm list -n cattle-system
NAME           	NAMESPACE    	REVISION	UPDATED                                	STATUS  	CHART                        	APP VERSION
rancher        	cattle-system	1       	2023-06-13 09:43:19.382439183 +0000 UTC	deployed	rancher-2.7.4                	v2.7.4
rancher-webhook	cattle-system	1       	2023-06-13 09:45:07.259783412 +0000 UTC	deployed	rancher-webhook-2.0.4+up0.3.4	0.3.4
  • Try to deploy Elemental using Fleet with the assets above

Environment

  • Architecture: x86_64 (Linux master-0 5.14.21-150400.24.63-default #1 SMP PREEMPT_DYNAMIC Tue May 2 15:49:04 UTC 2023 (fd0cc4f) x86_64 x86_64 x86_64 GNU/Linux)
  • Fleet Version:
$ kubectl get deploy -n cattle-fleet-system fleet-controller -o jsonpath="{.spec.template.spec.containers[].image}"
rancher/fleet:v0.6.0
  • Cluster:
    • Provider: K3s v1.25.9+k3s1
    • Options:
$ kubectl get nodes
NAME       STATUS   ROLES                       AGE     VERSION
master-0   Ready    control-plane,etcd,master   5d23h   v1.25.9+k3s1
master-1   Ready    control-plane,etcd,master   5d23h   v1.25.9+k3s1
master-2   Ready    control-plane,etcd,master   5d23h   v1.25.9+k3s1
worker-0   Ready    <none>                      5d23h   v1.25.9+k3s1
worker-1   Ready    <none>                      5d23h   v1.25.9+k3s1
  • Kubernetes Version: v1.25.9+k3s1

Logs

$ kubectl get gitrepo -A -o jsonpath='{.items[*].status}' | jq .
{
  "commit": "395e6ee35493b769e2c7c6eeca8777bd3a8e313a",
  "conditions": [
    {
      "lastUpdateTime": "2023-06-16T17:08:22Z",
      "status": "True",
      "type": "Ready"
    },
    {
      "lastUpdateTime": "2023-06-19T08:57:41Z",
      "status": "True",
      "type": "Accepted"
    },
    {
      "lastUpdateTime": "2023-06-16T15:52:52Z",
      "status": "True",
      "type": "ImageSynced"
    },
    {
      "lastUpdateTime": "2023-06-16T17:08:17Z",
      "status": "False",
      "type": "Reconciling"
    },
    {
      "lastUpdateTime": "2023-06-16T17:09:07Z",
      "message": "time=\"2023-06-16T17:08:44Z\" level=info msg=\"updated: fleet-local/elemental-fleet-examples-fleets-elemental\"\npanic: runtime error: invalid memory address or nil pointer dereference\n[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x155764b]\n\ngoroutine 54 [running]:\nhelm.sh/helm/v3/pkg/registry.(*Client).Tags(0x0, {0xc0000a15c6?, 0xc00053d3f8?})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/registry/client.go:602 +0x12b\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).getOciURI(0xc00053d910, {0xc0000a15c0, 0x60}, {0x0, 0x0}, 0xc00081e5a0)\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:154 +0x129\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).ResolveChartVersion(0xc00053d910, {0xc0000a15c0, 0x60}, {0x0, 0x0})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:199 +0x13df\nhelm.sh/helm/v3/pkg/downloader.(*ChartDownloader).DownloadTo(0xc00053d910, {0xc0000a15c0, 0x60}, {0x0?, 0x20063a0?}, {0xc0002cca38, 0x14})\n\t/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/downloader/chart_downloader.go:90 +0x5b\ngithub.com/rancher/fleet/pkg/bundlereader.downloadOCIChart({0xc0000a15c0, 0x60}, {0x0, 0x0}, {0xc0002cca38, 0x14}, {{0x0, 0x0}, {0x0, 0x0}, ...})\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:192 +0x3d6\ngithub.com/rancher/fleet/pkg/bundlereader.getContent({0x27d1ae0, 0xc000427400}, {0xc0003c38c0, 0x32}, {0xc0000a15c0, 0x60}, {0x0, 0x0}, {{0x0, 0x0}, ...})\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:74 +0x1b2\ngithub.com/rancher/fleet/pkg/bundlereader.loadDirectory({0x27d1ae0?, 0xc000427400?}, 0x0, {0xc0003eb180, 0x47}, {0xc0003c38c0?, 0xc0000a13e0?}, {0xc0000a15c0?, 0xc00052d9e0?}, {0x0, ...}, ...)\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/loaddirectory.go:29 
+0x9f\ngithub.com/rancher/fleet/pkg/bundlereader.loadDirectories.func1()\n\t/go/src/github.com/rancher/fleet/pkg/bundlereader/resources.go:216 +0x126\ngolang.org/x/sync/errgroup.(*Group).Go.func1()\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x64\ncreated by golang.org/x/sync/errgroup.(*Group).Go\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:72 +0xa5\n",
      "reason": "Stalled",
      "status": "True",
      "type": "Stalled"
    },
    {
      "lastUpdateTime": "2023-06-19T08:57:41Z",
      "status": "True",
      "type": "Synced"
    }
  ],
  "desiredReadyClusters": 1,
  "display": {
    "readyBundleDeployments": "1/1",
    "state": "GitUpdating"
  },
  "gitJobStatus": "Failed",
  "lastSyncedImageScanTime": null,
  "observedGeneration": 2,
  "readyClusters": 1,
  "resourceCounts": {
    "desiredReady": 0,
    "missing": 0,
    "modified": 0,
    "notReady": 0,
    "orphaned": 0,
    "ready": 0,
    "unknown": 0,
    "waitApplied": 0
  },
  "summary": {
    "desiredReady": 1,
    "ready": 1
  }
}
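
The status above mixes healthy conditions (`Ready`, `Synced`) with the `Stalled` one that carries the panic. A minimal sketch for pulling out just the stalled condition and its `panic:` line is shown here, assuming the status has been saved as JSON. `STATUS_JSON` is an abbreviated copy of the output above with the message shortened, and `stalled_conditions` and `panic_line` are hypothetical helper names.

```python
import json

# Abbreviated copy of the status shown above; the message is shortened.
STATUS_JSON = r"""
{
  "conditions": [
    {"lastUpdateTime": "2023-06-16T17:08:22Z", "status": "True", "type": "Ready"},
    {"lastUpdateTime": "2023-06-16T17:08:17Z", "status": "False", "type": "Reconciling"},
    {"lastUpdateTime": "2023-06-16T17:09:07Z",
     "message": "time=\"2023-06-16T17:08:44Z\" level=info msg=\"updated: fleet-local/elemental-fleet-examples-fleets-elemental\"\npanic: runtime error: invalid memory address or nil pointer dereference\n",
     "reason": "Stalled", "status": "True", "type": "Stalled"}
  ],
  "gitJobStatus": "Failed"
}
"""


def stalled_conditions(status):
    """Return the conditions of type Stalled whose status is "True"."""
    return [
        c for c in status.get("conditions", [])
        if c.get("type") == "Stalled" and c.get("status") == "True"
    ]


def panic_line(message):
    """Return the first line starting with 'panic:', or None if there is none."""
    for line in message.splitlines():
        if line.startswith("panic:"):
            return line
    return None


if __name__ == "__main__":
    for cond in stalled_conditions(json.loads(STATUS_JSON)):
        print(cond["lastUpdateTime"], panic_line(cond.get("message", "")))
```

Note that in the real output `Ready` and `Synced` report "True" while `gitJobStatus` is "Failed", so filtering on the `Stalled` condition is the more reliable signal here.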

Anything else?

Judging from the logs, this may be related to the elemental operator chart being an OCI asset rather than an HTTP one. However, I've also tried the https://github.com/rancher/fleet-examples/blob/master/single-cluster/helm-oci/fleet.yaml example, and it works fine.

This issue also looks similar to https://github.com/rancher/fleet/issues/203, but I don't think it is the same.

e-minguez, Jun 19 '23 08:06