kubernetes vm with same network already exist

Open majjihari opened this issue 5 years ago • 4 comments

Description

kubernetes vm with same network already exist

Version information

Ubuntu 18.04

Installation method

js-sdk-development (default)

Steps to reproduce

The error occurs while creating a k8s cluster on a new network. I think the issue is caused by old workloads that are still in state ERROR with nextAction set to DELETE. The expected state of these workloads is DELETED.

JS-NG> for workload in j.sals.zos.get().workloads.list(1500, NextAction.DELETE):
           print("id=", workload.id, ", state=", workload.info.result.state, ", nextAction=", workload.info.next_action)

id= 489109 , state= State.Error , nextAction= NextAction.DELETE
id= 500611 , state= State.Error , nextAction= NextAction.DELETE
id= 504092 , state= State.Error , nextAction= NextAction.DELETE
id= 504105 , state= State.Error , nextAction= NextAction.DELETE
id= 504106 , state= State.Error , nextAction= NextAction.DELETE
id= 504129 , state= State.Error , nextAction= NextAction.DELETE
id= 504150 , state= State.Error , nextAction= NextAction.DELETE
id= 504151 , state= State.Error , nextAction= NextAction.DELETE
id= 504153 , state= State.Error , nextAction= NextAction.DELETE
id= 504174 , state= State.Error , nextAction= NextAction.DELETE
id= 504176 , state= State.Error , nextAction= NextAction.DELETE
id= 504208 , state= State.Error , nextAction= NextAction.DELETE
id= 504209 , state= State.Error , nextAction= NextAction.DELETE
id= 504283 , state= State.Error , nextAction= NextAction.DELETE
id= 504296 , state= State.Error , nextAction= NextAction.DELETE
id= 504343 , state= State.Error , nextAction= NextAction.DELETE
id= 504344 , state= State.Error , nextAction= NextAction.DELETE
id= 504510 , state= State.Error , nextAction= NextAction.DELETE
id= 504512 , state= State.Error , nextAction= NextAction.DELETE
id= 504513 , state= State.Error , nextAction= NextAction.DELETE
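
The condition described in the steps above (nextAction is DELETE but the state never became DELETED) can be written as a small filter. This is only a sketch with stand-in types: `State`, `NextAction`, `Workload` and `find_stuck` are hypothetical names that mirror the shell output, not the js-sdk classes, and the ids 1 and 2 are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Stand-in for the workload result states (assumption, not the js-sdk enum).
    OK = "ok"
    ERROR = "error"
    DELETED = "deleted"


class NextAction(Enum):
    # Stand-in for the explorer's next-action values (assumption).
    DEPLOY = "deploy"
    DELETE = "delete"


@dataclass
class Workload:
    id: int
    state: State
    next_action: NextAction


def find_stuck(workloads):
    """Return workloads marked for deletion whose state never became DELETED."""
    return [
        w for w in workloads
        if w.next_action is NextAction.DELETE and w.state is not State.DELETED
    ]


sample = [
    Workload(489109, State.ERROR, NextAction.DELETE),  # stuck, as reported above
    Workload(1, State.DELETED, NextAction.DELETE),     # deleted as expected (made-up id)
    Workload(2, State.OK, NextAction.DEPLOY),          # healthy workload (made-up id)
]
stuck_ids = [w.id for w in find_stuck(sample)]
print(stuck_ids)
```

Only the first sample entry matches, since it is the only one that is marked for deletion without having reached DELETED.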

Traceback/Logs/Alerts

https://explorer.testnet.grid.tf/api/v1/reservations/workloads/509017

majjihari avatar Nov 23 '20 14:11 majjihari

It seems the nodes are not deleting these workloads.

m-motawea avatar Nov 23 '20 15:11 m-motawea

I'm reopening. It is not normal that these workloads are not deleted.

zaibon avatar Nov 23 '20 19:11 zaibon

The cleaner (janitor) in the provision engine should also check for vm reservations that are not in the "deployed" state on the explorer and deprovision those as well.
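
One way to read this suggestion, as a rough sketch rather than the actual provisiond code: the janitor compares the VMs present on the node with the reservation states reported by the explorer and selects everything that is not "deployed". The function name `select_for_deprovision`, the plain dict of states, and the choice to treat reservations unknown to the explorer as not deployed are all assumptions made for illustration. The ids 10, 11 and 12 are made up.

```python
def select_for_deprovision(local_vm_ids, explorer_states):
    """Pick local VMs whose reservation is not 'deployed' on the explorer.

    local_vm_ids: iterable of reservation ids that have a VM on this node.
    explorer_states: dict mapping reservation id -> state string.
    Assumption: an id missing from explorer_states counts as not deployed.
    """
    return sorted(
        vm_id for vm_id in local_vm_ids
        if explorer_states.get(vm_id) != "deployed"
    )


# Hypothetical data: three VMs on the node, one healthy, one in error,
# one the explorer does not know about.
local_vms = [10, 11, 12]
states = {10: "deployed", 11: "error"}
to_remove = select_for_deprovision(local_vms, states)
print(to_remove)
```

Whether unknown reservations should really be removed is a policy decision; a more conservative janitor could skip them and only act on ids the explorer explicitly reports as not deployed.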

muhamadazmy avatar Nov 27 '20 07:11 muhamadazmy

Waiting until the new provisiond code is merged so we can tackle this more easily.

DylanVerstraete avatar Dec 01 '20 14:12 DylanVerstraete