seldon-core Operator reconciliation fails when setting fractional milli resource requests

Operator reconciliation fails when setting fractional milli resource requests

Open bcvanmeurs opened this issue 1 year ago • 0 comments

trafficstars

Describe the bug

When setting a resource request of 1.5m or any partial millicpu request, the operator manager fails to reconcile the deployment.

To reproduce

Create a deployment with a custom model container image
Set the resource request to a fractional millicpu
Deploy

The operator manager cannot reconcile, and generates millions of logs per day. I know we are not supposed to set fractional millicpu, but k8s does not fail if you do so. It rounds the value up to nearest integer. I suspect this is where the reconciliation error comes from and the operator sees a difference between the desired state: 1.5m and the actual state 2m. And tries to reconcile, without success, generating many logs.

Expected behaviour

Although we were not supposed to set fractional millicpus, it took a while to find the root cause of the many logs. I expect: either the operator to return an error to disallow fractional millicpu values, or also silently round the value up to the nearest integer, like k8s does by default.

Environment

AWS

Mar 14 '24 15:03 bcvanmeurs

seldon-core seldon-core copied to clipboard

Operator reconciliation fails when setting fractional milli resource requests

Describe the bug

To reproduce

Expected behaviour

Environment

seldon-core
seldon-core copied to clipboard