cluster-api-provider-hetzner

Reconcile error for HetznerBareMetalHost and more than one network interface

Open alexkasatikov opened this issue 2 years ago • 8 comments

/kind bug

What steps did you take and what happened: I'm trying to set up a k8s cluster with only one node using the hetzner-baremetal-control-planes flavor. After generating the cluster and adding a HetznerBareMetalHost, I don't see any details about the host hardware when running kubectl describe hetznerbaremetalhost. Here is the log from caph-controller-manager:

Log { "level": "ERROR", "time": "2023-09-21T11:18:53.496Z", "file": "controller/controller.go:324", "message": "Reconciler error", "controller": "hetznerbaremetalhost", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "HetznerBareMetalHost", "HetznerBareMetalHost": { "name": "de1459", "namespace": "de-dev" }, "namespace": "de-dev", "name": "de1459", "reconcileID": "9283fd2c-9da9-4274-aaae-ffbea85dbf64", "error": "failed to reconcile HetznerBareMetalHost de-dev/de1459: action \"registering\" failed: failed to get hardware details: failed to obtain hardware details Nics: failed to unmarshal {\"name\":\"eth0\",\"model\":\"Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)}. Original ssh output name=\"eth0\" model=\"Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)\nIntel Corporation I350 Gigabit Network Connection (rev 01)\" mac=\"f0:2f:74:94:a2:41\" ip=\"162.55.151.48/26\" speedMbps=\"1000\"\nname=\"eth0\" model=\"Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)\nIntel Corporation I350 Gigabit Network Connection (rev 01)\" mac=\"f0:2f:74:94:a2:41\" ip=\"2a01:4f8:262:265f::2/64\" speedMbps=\"1000\": unexpected end of JSON input", "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:226" }

What did you expect to happen: Reconciliation completes successfully.

Anything else you would like to add: I assume that's due to line 144 of ssh_client.go (https://github.com/syself/cluster-api-provider-hetzner/blob/v1.0.0-beta.22/pkg/services/baremetal/client/ssh/ssh_client.go#L144). When executed on the host, the command on that line returns 2 lines:

root@rescue ~ # lspci | grep net | awk '{$1=$2=$3=""; print $0}' | sed "s/^[ \t]*//"
Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
Intel Corporation I350 Gigabit Network Connection (rev 01)

and the script output looks like this:

root@rescue ~ # bash nic-info.sh
name="eth0" model="Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
Intel Corporation I350 Gigabit Network Connection (rev 01)" mac="f0:2f:74:94:a2:41" ip="162.55.151.48/26" speedMbps="1000"
name="eth0" model="Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
Intel Corporation I350 Gigabit Network Connection (rev 01)" mac="f0:2f:74:94:a2:41" ip="2a01:4f8:262:265f::2/64" speedMbps="1000"
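
The failure mode can be reproduced without a server. The sketch below is not code from the provider: it builds one record from a canned copy of the two-line lspci result and counts the lines whose double quotes do not pair up. It assumes, based only on the error message in the log, that the output is parsed line by line; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Canned copy of the two-line `lspci` result quoted above (no hardware needed).
fake_models() {
  printf '%s\n' \
    'Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)' \
    'Intel Corporation I350 Gigabit Network Connection (rev 01)'
}

# Build one record shaped like the reported script output: the whole
# multi-line model string ends up between a single pair of quotes.
build_record() {
  local model
  model="$(fake_models)"
  printf 'name="eth0" model="%s" mac="f0:2f:74:94:a2:41"\n' "$model"
}

# Print how many input lines contain an odd number of double quotes,
# i.e. lines that a line-by-line parser would see as an unterminated string.
count_broken_lines() {
  awk '{ n = gsub(/"/, "\""); if (n % 2 == 1) bad++ } END { print bad + 0 }'
}

build_record | count_broken_lines
```

With the canned data, the single record spans two lines and both have unbalanced quotes, which matches the truncated JSON shown in the error message.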

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-beta.22
  • Kubernetes version: 1.25.12
  • OS (e.g. from /etc/os-release): debian 12

alexkasatikov avatar Sep 21 '23 12:09 alexkasatikov

As an idea, something like this could be used instead: lspci -s $(ethtool -i $iname | grep bus-info | awk '{print $2}') | cut -d ':' -f 3
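
A minimal sketch of that idea, split into two small helpers so it can be tried with canned input. The helper names and the sample driver name are assumptions for illustration. Unlike the one-liner, the sketch keeps everything after the second colon (cut -f 3-) and trims the leading space.

```shell
#!/usr/bin/env bash
# Extract the PCI address from `ethtool -i <iface>` output read on stdin.
bus_info_from_ethtool() {
  awk '$1 == "bus-info:" { print $2 }'
}

# Reduce one `lspci -s <address>` line to the model text that follows
# the device class, without the leading space.
model_from_lspci_line() {
  cut -d ':' -f 3- | sed 's/^[[:space:]]*//'
}

# On a real host the two helpers would be combined like this:
#   addr="$(ethtool -i "$iname" | bus_info_from_ethtool)"
#   lspci -s "$addr" | model_from_lspci_line

# Canned inputs (sample data) so the sketch runs anywhere.
printf 'driver: ixgbe\nbus-info: 0000:05:00.0\n' | bus_info_from_ethtool
printf '05:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)\n' | model_from_lspci_line
```

Because the lookup is keyed on the interface's own PCI address, each interface yields exactly one model line regardless of how many NICs the host has.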

alexkasatikov avatar Sep 21 '23 13:09 alexkasatikov

@guettli please have a look here

batistein avatar Sep 26 '23 23:09 batistein

I ran into a similar error on a RX220 Host with the following versions:

  • cluster-api-provider-hetzner version: v1.0.0-beta.25
  • Kubernetes version: 1.28.2
  • OS (e.g. from /etc/os-release): debian 12

Log:

{"level":"ERROR","time":"2023-10-06T17:33:42.789Z","file":"controller/controller.go:324","message":"Reconciler error","controller":"hetznerbaremetalhost","controllerGroup":"infrastructure.cluster.x-k8s.io","controllerKind":"HetznerBareMetalHost","HetznerBareMetalHost":{"name":"bm-arm-01","namespace":"default"},"namespace":"default","name":"bm-arm-01","reconcileID":"3f8d7595-bc7f-4174-97d0-b7b49efbc96d","error":"failed to reconcile HetznerBareMetalHost default/bm-arm-01: action \"registering\" failed: failed to get hardware details: failed to obtain hardware details Nics: failed to unmarshal {\"name\":\"eth0\",\"model\":\"Intel Corporation I350 Gigabit Network Connection (rev 01)}. Original ssh output name=\"eth0\" model=\"Intel Corporation I350 Gigabit Network Connection (rev 01)\nIntel Corporation I350 Gigabit Network Connection (rev 01)\" mac=\"88:88:88:88:88:88\" ip=\"111.111.111.11/26\" speedMbps=\"1000\"\nname=\"eth0\" model=\"Intel Corporation I350 Gigabit Network Connection (rev 01)\nIntel Corporation I350 Gigabit Network Connection (rev 01)\" mac=\"88:88:88:88:88:88\" ip=\"2a01:2a01:2a01:2a01::2/64\" speedMbps=\"1000\": unexpected end of JSON input","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:226"}

benedikt-bartscher avatar Oct 06 '23 17:10 benedikt-bartscher

Above log line, pretty printed:



❯ xclip -o | yq -P
level: ERROR
time: "2023-10-06T17:33:42.789Z"
file: controller/controller.go:324
message: Reconciler error
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
  name: bm-arm-01
  namespace: default
namespace: default
name: bm-arm-01
reconcileID: 3f8d7595-bc7f-4174-97d0-b7b49efbc96d
error: |-
  failed to reconcile HetznerBareMetalHost default/bm-arm-01: action "registering" failed:
   failed to get hardware details: failed to obtain hardware details Nics: 
   failed to unmarshal {"name":"eth0","model":"Intel Corporation I350 Gigabit Network Connection (rev 01)}. 
   Original ssh output name="eth0" model="Intel Corporation I350 Gigabit Network Connection (rev 01)
  Intel Corporation I350 Gigabit Network Connection (rev 01)" mac="88:88:88:88:88:88" ip="111.111.111.11/26" speedMbps="1000"
  name="eth0" model="Intel Corporation I350 Gigabit Network Connection (rev 01)
  Intel Corporation I350 Gigabit Network Connection (rev 01)" mac="88:88:88:88:88:88" ip="2a01:2a01:2a01:2a01::2/64"
   speedMbps="1000": unexpected end of JSON input
stacktrace: |-
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:265
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /src/cluster-api-provider-hetzner/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:226

guettli avatar Oct 09 '23 07:10 guettli

Is there any work in progress regarding this issue? If not, would you accept a pull request?

benedikt-bartscher avatar Nov 16 '23 18:11 benedikt-bartscher

@benedikt-bartscher yes, a PR is welcome. BTW, do you have an idea how to reproduce this error? Is there a way to create a second (fake) network interface somehow? Then I could validate your PR manually.

guettli avatar Nov 17 '23 14:11 guettli

Hey @guettli thanks for your response. You could alias lspci to echo some test data. I am not aware of any other trick which results in a "fake" NIC appearing in lspci/ethtool. Aren't your e2e tests sponsored by Hetzner? Maybe they can provide you a server with 2 NICs. If not, I could provide you with one of our machines for some coding/testing for free.
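
One way to try this locally, as a rough sketch: shadow lspci with a shell function that prints canned lines for two Ethernet controllers, then run the pipeline from the issue against it. The PCI addresses and model strings are sample data, and the function name nic_models is hypothetical. This only affects the current shell; it does not change what a separate SSH session on the host would run.

```shell
#!/usr/bin/env bash
# Shadow the real `lspci` with a function that prints canned lines in the
# format lspci uses for Ethernet controllers (sample data, two NICs).
lspci() {
  printf '%s\n' \
    '05:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)' \
    '05:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)'
}

# The pipeline reported in the issue, unchanged.
nic_models() {
  lspci | grep net | awk '{$1=$2=$3=""; print $0}' | sed "s/^[ \t]*//"
}

nic_models
```

The pipeline prints one model line per controller, so a host with two NICs yields two lines.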

benedikt-bartscher avatar Nov 18 '23 00:11 benedikt-bartscher

They will, no problem. If you can open a PR, we will be able to test it as well!

janiskemper avatar Nov 20 '23 07:11 janiskemper

Hello guys.

Unfortunately, we encountered the same issue while deploying a Kubernetes cluster on baremetal servers from Hetzner with the cluster-api-provider-hetzner.

We have a server of type AX41-NVMe with a single network interface. Its hardware details are obtained successfully, and the subsequent bootstrap completes without errors.

However, we also have different servers of types EX130-R/EX130-S, which have two network interfaces:

root@rescue ~ # lspci | grep net | awk '{$1=$2=$3=""; print $0}' | sed "s/^[ \t]*//"
Intel Corporation Ethernet Controller X550 (rev 01)
Intel Corporation Ethernet Controller X550 (rev 01)

Similar to the example from @alexkasatikov, we have these logs from caph-controller-manager:

{
  "level": "ERROR",
  "time": "2024-03-10T19:02:52.056Z",
  "file": "controller/controller.go:329",
  "message": "Reconciler error",
  "controller": "hetznerbaremetalhost",
  "controllerGroup": "infrastructure.cluster.x-k8s.io",
  "controllerKind": "HetznerBareMetalHost",
  "HetznerBareMetalHost": {
    "name": "infra-dev-02-worker-bm-2332683",
    "namespace": "default"
  },
  "namespace": "default",
  "name": "infra-dev-02-worker-bm-2332683",
  "reconcileID": "5b2d4c5a-f010-42df-8532-8c1388861c86",
  "error": "failed to reconcile HetznerBareMetalHost default/infra-dev-02-worker-bm-2332683: action
   \"registering\" failed: failed to get hardware details: failed to obtain hardware details Nics: failed to 
   unmarshal {\"name\":\"eth0\",\"model\":\"Intel Corporation Ethernet Controller X550 (rev 01)}. Original ssh 
   output name=\"eth0\" model=\"Intel Corporation Ethernet Controller X550 (rev 01)\\nIntel Corporation 
   Ethernet Controller X550 (rev 01)\" mac=\"a8:a1:59:fb:c4:db\" ip=\"37.27.63.175/26\" 
   speedMbps=\"1000\"\\nname=\"eth0\" model=\"Intel Corporation Ethernet Controller X550 (rev 01)\\nIntel 
   Corporation Ethernet Controller X550 (rev 01)\" mac=\"a8:a1:59:fb:c4:db\" ip=\"2a01:4f9:3081:310e::2/64\" 
   speedMbps=\"1000\": unexpected end of JSON input",
  
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.
  (*Controller).reconcileHandler\\n\\tsigs.k8s.io/controller-
  [email protected]/pkg/internal/controller/controller.go:329\\nsigs.k8s.io/controller-
  runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\tsigs.k8s.io/controller-
  [email protected]/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-
  runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\tsigs.k8s.io/controller-
  [email protected]/pkg/internal/controller/controller.go:227"
}

This turned out to be a significant issue for us, as our production cluster building process encountered this problem. We would greatly appreciate it if you could find a way to fix this problem.

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-beta.30
  • Kubernetes version: 1.29.1

Lenikas avatar Mar 10 '24 19:03 Lenikas

@Lenikas is it possible to schedule a call for further debugging?

@guettli please have a look into this in the upcoming week.

batistein avatar Mar 10 '24 20:03 batistein

@Lenikas can you please post the output of these commands:

ip a
ethtool "*"
lspci

thank you!

guettli avatar Mar 11 '24 09:03 guettli

Hello @guettli, thank you for replying!

This is the output from a server of type EX130-R:

ip a:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 9c:6b:00:45:9c:5e brd ff:ff:ff:ff:ff:ff
    altname eno1
    altname enp5s0f0
    inet 37.27.107.180/26 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2a01:4f9:3070:1e05::2/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::9e6b:ff:fe45:9c5e/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9c:6b:00:45:9c:5f brd ff:ff:ff:ff:ff:ff
    altname eno2
    altname enp5s0f1
4: usb0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ce:b3:36:d9:a6:a2 brd ff:ff:ff:ff:ff:ff
ethtool "*":
Settings for eth0:
	Supported ports: [ TP ]
	Supported link modes:   100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	                        2500baseT/Full
	                        5000baseT/Full
	Supported pause frame use: Symmetric
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Auto-negotiation: on
Settings for eth1:
	Supported ports: [ TP ]
	Supported link modes:   100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	                        2500baseT/Full
	                        5000baseT/Full
	Supported pause frame use: Symmetric
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: on
Settings for usb0:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Half
	Auto-negotiation: off
Settings for eth0:
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	MDI-X: Unknown
Settings for eth1:
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	MDI-X: Unknown
Settings for usb0:
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	MDI-X: Unknown
Settings for eth0:
	Supports Wake-on: umbg
	Wake-on: g
Settings for eth1:
	Supports Wake-on: umbg
	Wake-on: g
Settings for eth0:
        Current message level: 0x00000007 (7)
                               drv probe link
Settings for eth1:
        Current message level: 0x00000007 (7)
                               drv probe link
Settings for usb0:
        Current message level: 0x00000007 (7)
                               drv probe link
Settings for lo:
	Link detected: yes
Settings for eth0:
	Link detected: yes
Settings for eth1:
	Link detected: no
Settings for usb0:
	Link detected: no
lspci:
00:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
00:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
00:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
00:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
00:08.0 PCI bridge: Intel Corporation Device 1bb8 (rev 11)
00:0a.0 PCI bridge: Intel Corporation Device 1bba (rev 11)
00:0f.0 PCI bridge: Intel Corporation Device 1bbf (rev 11)
00:14.0 USB controller: Intel Corporation Device 1bcd (rev 11)
00:14.2 RAM memory: Intel Corporation Device 1bce (rev 11)
00:14.4 Host bridge: Intel Corporation Device 1bfe (rev 11)
00:15.0 System peripheral: Intel Corporation Device 1bff (rev 11)
00:16.0 Communication controller: Intel Corporation Device 1be0 (rev 11)
00:16.1 Communication controller: Intel Corporation Device 1be1 (rev 11)
00:16.4 Communication controller: Intel Corporation Device 1be4 (rev 11)
00:17.0 SATA controller: Intel Corporation Device 1ba2 (rev 11)
00:1a.0 PCI bridge: Intel Corporation Device 1bb4 (rev 11)
00:1f.0 ISA bridge: Intel Corporation Device 1b81 (rev 11)
00:1f.4 SMBus: Intel Corporation Device 1bc9 (rev 11)
00:1f.5 Serial bus controller: Intel Corporation Device 1bca (rev 11)
03:00.0 PCI bridge: ASRock Incorporation Device 1150 (rev 06)
04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
05:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
05:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
16:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
16:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
16:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
16:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
42:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
42:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
42:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
42:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
6e:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
6e:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
6e:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
6e:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
9a:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
9a:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
9a:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
9a:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
c6:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
c6:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
c6:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
c6:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
c6:01.0 PCI bridge: Intel Corporation Device 352a (rev 04)
c6:03.0 PCI bridge: Intel Corporation Device 352b (rev 04)
c6:05.0 PCI bridge: Intel Corporation Device 352c (rev 04)
c6:07.0 PCI bridge: Intel Corporation Device 352d (rev 04)
c7:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
ca:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
f2:00.0 System peripheral: Intel Corporation Ice Lake Memory Map/VT-d (rev 20)
f2:00.1 System peripheral: Intel Corporation Ice Lake Mesh 2 PCIe (rev 20)
f2:00.2 System peripheral: Intel Corporation Ice Lake RAS (rev 20)
f2:00.4 Generic system peripheral [0807]: Intel Corporation Device 0b23
f2:01.0 System peripheral: Intel Corporation Device 0b25
f2:03.0 System peripheral: Intel Corporation Ice Lake MSM
f2:03.1 System peripheral: Intel Corporation Ice Lake PMON MSM
fe:00.0 System peripheral: Intel Corporation Device 3250
fe:00.1 System peripheral: Intel Corporation Device 3251
fe:00.2 System peripheral: Intel Corporation Device 3252
fe:00.3 Host bridge: Intel Corporation Ice Lake IEH
fe:00.5 System peripheral: Intel Corporation Device 3255
fe:05.0 System peripheral: Intel Corporation Device 3245
fe:05.1 System peripheral: Intel Corporation Device 3246
fe:05.2 System peripheral: Intel Corporation Device 3247
fe:06.0 System peripheral: Intel Corporation Device 3245
fe:06.1 System peripheral: Intel Corporation Device 3246
fe:06.2 System peripheral: Intel Corporation Device 3247
fe:07.0 System peripheral: Intel Corporation Device 3245
fe:07.1 System peripheral: Intel Corporation Device 3246
fe:07.2 System peripheral: Intel Corporation Device 3247
fe:0c.0 Performance counters: Intel Corporation Device 324a
fe:0d.0 Performance counters: Intel Corporation Device 324a
fe:0e.0 Performance counters: Intel Corporation Device 324a
fe:0f.0 Performance counters: Intel Corporation Device 324a
fe:1a.0 Performance counters: Intel Corporation Device 2880
fe:1b.0 Performance counters: Intel Corporation Device 2880
fe:1c.0 Performance counters: Intel Corporation Device 2880
fe:1d.0 Performance counters: Intel Corporation Device 2880
ff:00.0 System peripheral: Intel Corporation Device 324c
ff:00.1 System peripheral: Intel Corporation Device 324c
ff:00.2 System peripheral: Intel Corporation Device 324c
ff:00.3 System peripheral: Intel Corporation Device 324c
ff:00.4 System peripheral: Intel Corporation Device 324c
ff:00.5 System peripheral: Intel Corporation Device 324c
ff:00.6 System peripheral: Intel Corporation Device 324c
ff:00.7 System peripheral: Intel Corporation Device 324c
ff:01.0 System peripheral: Intel Corporation Device 324c
ff:01.1 System peripheral: Intel Corporation Device 324c
ff:01.2 System peripheral: Intel Corporation Device 324c
ff:01.3 System peripheral: Intel Corporation Device 324c
ff:01.4 System peripheral: Intel Corporation Device 324c
ff:01.5 System peripheral: Intel Corporation Device 324c
ff:01.6 System peripheral: Intel Corporation Device 324c
ff:01.7 System peripheral: Intel Corporation Device 324c
ff:02.0 System peripheral: Intel Corporation Device 324c
ff:02.1 System peripheral: Intel Corporation Device 324c
ff:02.2 System peripheral: Intel Corporation Device 324c
ff:02.3 System peripheral: Intel Corporation Device 324c
ff:02.4 System peripheral: Intel Corporation Device 324c
ff:02.5 System peripheral: Intel Corporation Device 324c
ff:02.6 System peripheral: Intel Corporation Device 324c
ff:02.7 System peripheral: Intel Corporation Device 324c
ff:0a.0 System peripheral: Intel Corporation Device 324d
ff:0a.1 System peripheral: Intel Corporation Device 324d
ff:0a.2 System peripheral: Intel Corporation Device 324d
ff:0a.3 System peripheral: Intel Corporation Device 324d
ff:0a.4 System peripheral: Intel Corporation Device 324d
ff:0a.5 System peripheral: Intel Corporation Device 324d
ff:0a.6 System peripheral: Intel Corporation Device 324d
ff:0a.7 System peripheral: Intel Corporation Device 324d
ff:0b.0 System peripheral: Intel Corporation Device 324d
ff:0b.1 System peripheral: Intel Corporation Device 324d
ff:0b.2 System peripheral: Intel Corporation Device 324d
ff:0b.3 System peripheral: Intel Corporation Device 324d
ff:0b.4 System peripheral: Intel Corporation Device 324d
ff:0b.5 System peripheral: Intel Corporation Device 324d
ff:0b.6 System peripheral: Intel Corporation Device 324d
ff:0b.7 System peripheral: Intel Corporation Device 324d
ff:0c.0 System peripheral: Intel Corporation Device 324d
ff:0c.1 System peripheral: Intel Corporation Device 324d
ff:0c.2 System peripheral: Intel Corporation Device 324d
ff:0c.3 System peripheral: Intel Corporation Device 324d
ff:0c.4 System peripheral: Intel Corporation Device 324d
ff:0c.5 System peripheral: Intel Corporation Device 324d
ff:0c.6 System peripheral: Intel Corporation Device 324d
ff:0c.7 System peripheral: Intel Corporation Device 324d
ff:1d.0 System peripheral: Intel Corporation Device 344f
ff:1d.1 System peripheral: Intel Corporation Device 3457
ff:1e.0 System peripheral: Intel Corporation Device 3258 (rev 08)
ff:1e.1 System peripheral: Intel Corporation Device 3259 (rev 08)
ff:1e.2 System peripheral: Intel Corporation Device 325a (rev 08)
ff:1e.3 System peripheral: Intel Corporation Device 325b (rev 08)
ff:1e.4 System peripheral: Intel Corporation Device 325c (rev 08)
ff:1e.5 System peripheral: Intel Corporation Device 325d (rev 08)
ff:1e.6 System peripheral: Intel Corporation Device 325e (rev 08)
ff:1e.7 System peripheral: Intel Corporation Device 325f (rev 08)

If it's important, we use custom server versions with various options. If needed, I can probably provide configuration options.

Lenikas avatar Mar 11 '24 10:03 Lenikas

Hello @batistein, @guettli!

If relevant, we can schedule a meeting. Alternatively, we can suggest transitioning our communication to a different platform if it's more convenient for you. Additionally, we can grant you SSH access to the server for debugging purposes.

How long do you think it might take to resolve the issue? It's important for our team to understand this to plan our next steps. Unfortunately, our team lacks sufficient expertise in Go to quickly resolve this issue.

If you need any further information, we're ready to provide it.

Thank you!

Lenikas avatar Mar 11 '24 12:03 Lenikas

@Lenikas please send me an email at: [email protected]

batistein avatar Mar 11 '24 13:03 batistein

@Lenikas we created a draft which should make the error go away.

Do you need the NIC data which gets gathered by the script? Because at the moment the script nic-info.sh does not work reliably. But I guess you don't need these values, and you just want the provisioning to succeed.

guettli avatar Mar 12 '24 14:03 guettli

@guettli Yes, at the moment, we simply need a fix to ensure that provisioning completes successfully.

However, we are unsure where this information may be needed in the future. Do you have any ideas, or is it related to some functionality of the cluster-api-provider-hetzner?

Thank you for the responsive communication!

Lenikas avatar Mar 12 '24 14:03 Lenikas

@Lenikas the PR is merged, you can test the new caph image by updating the caph deployment in your management cluster.

Image: ghcr.io/syself/caph-staging:sha-c6fd5bb

See: https://github.com/syself/cluster-api-provider-hetzner/pkgs/container/caph-staging/190282019?tag=sha-c6fd5bb

Please tell us if this works for you. Thank you.
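
For readers following along, a hedged sketch of how the deployment image might be switched. The namespace, deployment name, and container name below are assumptions and should be checked against the management cluster first; only the image reference comes from this comment. The function prints the kubectl command instead of running it, so it can be reviewed before use.

```shell
#!/usr/bin/env bash
# The first three values are assumptions; verify them, for example with
#   kubectl get deployments --all-namespaces | grep caph
NAMESPACE="caph-system"
DEPLOYMENT="caph-controller-manager"
CONTAINER="manager"
# Image reference quoted in the comment above.
IMAGE="ghcr.io/syself/caph-staging:sha-c6fd5bb"

# Dry run: print the command rather than executing it.
caph_set_image_cmd() {
  printf 'kubectl -n %s set image deployment/%s %s=%s\n' \
    "$NAMESPACE" "$DEPLOYMENT" "$CONTAINER" "$IMAGE"
}

caph_set_image_cmd
```

Once the names are confirmed, the printed line can be pasted into a shell that has access to the management cluster.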

guettli avatar Mar 13 '24 08:03 guettli

@Lenikas we just released a new version of caph. It should now be usable with clusterctl as well.

batistein avatar Mar 13 '24 22:03 batistein

@guettli Hello, I apologize for the delayed response.

Yes, I have checked the built image, it works. The provisioning completes successfully, and the nodes are added to the cluster.

Thank you so much!

Lenikas avatar Mar 14 '24 21:03 Lenikas