trident
Inaccurate TargetPortal in describe pv output with ONTAP SAN serving multiple networks
Describe the bug
describe pv can show an inaccurate TargetPortal with ONTAP SAN (iSCSI) that has data interfaces on both reachable and unreachable networks. For example, if iSCSI Data LIFs are visible on 103.0/24 and 105.0/24 and I log in from the latter network, describe pv will show that I'm using a Target from 103.0/24.
Maybe that's because Targets are sorted in ascending order and only the first Target is shown, but that's still wrong if I don't have any interfaces on that network.
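The ordering hypothesis can be checked against the addresses that appear in this report. This is a minimal sketch, assuming the four Data LIF addresses are the TargetPortal value plus the three Portals entries from the describe pv output, and assuming a plain string sort. Nothing here is taken from Trident's code.

```python
import ipaddress

# The four iSCSI Data LIF addresses seen in this report
# (the TargetPortal value plus the three Portals entries).
lifs = [
    "192.168.105.59",
    "192.168.103.59",
    "192.168.105.159",
    "192.168.103.159",
]

# Plain string sort: "192.168.103.159" sorts before "192.168.103.59"
# because the character "1" compares lower than "5".
by_string = sorted(lifs)

# Numeric sort by address value, for comparison.
by_value = sorted(lifs, key=ipaddress.ip_address)

print("string sort: ", by_string[0], by_string[1:])
print("numeric sort:", by_value[0], by_value[1:])
```

With a string sort the first element is 192.168.103.159 and the remainder is 192.168.103.59, 192.168.105.159, 192.168.105.59, which matches the TargetPortal and Portals values reported here. That is consistent with the hypothesis but does not confirm how the list is actually built.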
Environment
- Trident 21.01.1
- CentOS 7.9
- Container runtime: Docker 1.13.1
- Kubernetes orchestrator: OpenShift v3.11 (free)
To Reproduce
- Set up an SVM with iSCSI Data LIFs on 2 VLANs (I created 4 iSCSI Data LIFs, 1 per controller and VLAN, so I end up with 2 for network 103.0/24 and 2 for network 105.0/24)
- Set up an SC for iSCSI and an ONTAP SAN backend
- From a worker on one of these networks/VLANs (e.g. VLAN 105, iSCSI network 192.168.105.0/24), when iscsiadm discovery is done, ONTAP reports 4 targets; the iSCSI client attempts to log in to all 4 and succeeds with the two that it can access (those on 105.0/24, in my case, as it is on that VLAN). I know this is a separate topic, so let's just ignore that 103.0/24 is not reachable from this worker; the worker will connect to the two target IPs on network 105.0/24.
- Problem: describe pv shows an inaccurate TargetPortal in the output below:
$ oc describe pv default-postgresql-data-be191
Name: default-postgresql-data-be191
Labels: <none>
Annotations: pv.kubernetes.io/provisioned-by=netapp.io/trident
volume.beta.kubernetes.io/storage-class=ontap-iscsi
Finalizers: [kubernetes.io/pv-protection]
StorageClass: ontap-iscsi
Status: Bound
Claim: default/postgresql-data
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 1Gi
Node Affinity: <none>
Message:
Source:
Type: ISCSI (an ISCSI Disk resource that is attached to a kubelet's host machine and then exposed to the pod)
TargetPortal: 192.168.103.159 <=================== THIS =========================
IQN: iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15
Lun: 2
ISCSIInterface default
FSType: xfs
ReadOnly: false
Portals: [192.168.103.59 192.168.105.159 192.168.105.59]
DiscoveryCHAPAuth: false
SessionCHAPAuth: false
SecretRef: <nil>
InitiatorName: <none>
Events: <none>
- Not only am I not using that Portal, I don't even have any interfaces on that network: I'm connected from 105.0/24 and can't ping the Data LIFs on 103.0/24.
$ netstat -ant | grep 103
tcp 0 0 127.0.0.1:39103 0.0.0.0:* LISTEN
$ netstat -ant | grep 105
tcp 0 0 192.168.105.100:53 0.0.0.0:* LISTEN
tcp 0 0 192.168.105.100:60524 192.168.105.59:3260 ESTABLISHED
tcp 0 0 192.168.105.100:45898 192.168.105.159:3260 ESTABLISHED
$ ping 192.168.103.59
PING 192.168.103.59 (192.168.103.59) 56(84) bytes of data.
^C
--- 192.168.103.59 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
- I know that Portals in oc describe pv (which is confusing because it shows only 3 of the 4 Target IPs, but at least what it shows is correct, albeit incomplete) comes from iscsiadm. But TargetPortal could show info from iscsiadm -m session (output below), which is accurate and doesn't include the misleading network 103.0/24:
$ sudo iscsiadm -m session
tcp: [1] 192.168.105.59:3260,1039 iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15 (non-flash)
tcp: [2] 192.168.105.159:3260,1040 iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15 (non-flash)
I don't know where TargetPortal in oc describe pv output comes from (some generic OS or CSI API, or Trident), but if it's up to Trident I hope that information can be accurate.
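To illustrate what "info from iscsiadm -m session" could look like in practice, here is a minimal sketch that extracts the portal IPs of active sessions from that output and checks the reported TargetPortal against them. The function name session_portals and the regular expression are hypothetical, written only against the two IPv4 session lines shown above; this is not how Trident or Kubernetes populates the field.

```python
import re

# Matches session lines of the form shown in this report, e.g.
#   tcp: [1] 192.168.105.59:3260,1039 iqn.... (non-flash)
# Assumption: IPv4 portals only.
SESSION_RE = re.compile(
    r"^(?P<transport>\w+): \[(?P<sid>\d+)\] "
    r"(?P<ip>[0-9.]+):(?P<port>\d+),(?P<tpgt>\d+) (?P<iqn>\S+)"
)


def session_portals(output, iqn):
    """Return the portal IPs that have an active session to the given IQN."""
    portals = []
    for line in output.splitlines():
        match = SESSION_RE.match(line.strip())
        if match and match.group("iqn") == iqn:
            portals.append(match.group("ip"))
    return portals


# Session output copied from this report.
sample = """\
tcp: [1] 192.168.105.59:3260,1039 iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15 (non-flash)
tcp: [2] 192.168.105.159:3260,1040 iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15 (non-flash)
"""
iqn = "iqn.1992-08.com.netapp:sn.cfbbbeea862911eb895f005056a99e8f:vs.15"

active = session_portals(sample, iqn)
print("active portals:", active)
print("reported TargetPortal in use:", "192.168.103.159" in active)
```

On the output from this worker, the active portals are the two 105.0/24 addresses, and the reported TargetPortal 192.168.103.159 is not among them.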
Expected behavior
TargetPortal in describe pv output for iSCSI is accurate (at least showing the correct network, even if just one IP of possibly several is shown).
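One possible reading of "show the correct network" is to prefer a portal that sits on a network the worker is directly attached to. The sketch below assumes that rule; pick_reachable_portal is a hypothetical helper, and being on the same subnet is only a proxy for reachability (a routed network could also be reachable). The worker address 192.168.105.100 is taken from the netstat output above, and the /24 prefix from the iSCSI network given in the reproduction steps.

```python
import ipaddress


def pick_reachable_portal(portals, local_interfaces):
    """Return the first portal inside a directly attached network, else None.

    local_interfaces holds interface addresses in CIDR form,
    e.g. "192.168.105.100/24".
    """
    networks = [ipaddress.ip_network(i, strict=False) for i in local_interfaces]
    for portal in portals:
        address = ipaddress.ip_address(portal)
        if any(address in network for network in networks):
            return portal
    return None


# All four Data LIF addresses, in the order they appear in describe pv
# (TargetPortal first, then the Portals list).
portals = [
    "192.168.103.159",
    "192.168.103.59",
    "192.168.105.159",
    "192.168.105.59",
]

chosen = pick_reachable_portal(portals, ["192.168.105.100/24"])
print(chosen)
```

For this worker the sketch skips both 103.0/24 addresses and returns 192.168.105.159, one of the two portals it actually has a session with.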