
Weird behavior when editing directly `client.V1Job`

Open Uranium2 opened this issue 2 years ago • 3 comments

I built a framework that creates a default template to deploy a Job with `client.V1Job`. I made a class to wrap my K8s Job object, then added some methods to edit the template after the class is initialized. But when I edit the values, the deployed Job does not match what I set.

import logging

from kubernetes import client

import settings  # project-specific module holding NAMESPACE, etc.

class KubernetesJob:
    def __init__(self, job_name: str = None) -> None:
        """K8s Job class. Holds configuration, can be deployed or deleted.

        Args:
            job_name (str): Job name or Id. Defaults to None.
        """
        if job_name is None:
            error_msg = "job_name is not specified."
            logging.error(error_msg)
            raise TypeError(error_msg)
        if not isinstance(job_name, str):
            error_msg = "job_name must be a string."
            logging.error(error_msg)
            raise TypeError(error_msg)

        self.default_cmd = (
            "source ~/.bashrc"
            " && cd /app/"
            " && . /opt/conda/etc/profile.d/conda.sh"
            " && conda activate ${CONDA_ENVIRONMENT}"
            " && kinit -kt ${KEYTAB_PATH} ${SERVICE_USER}"
        )
        self.job_name = job_name
        self.namespace = settings.NAMESPACE
        self.config = self._create_config()

    def _create_config(self):
        """Create a Job configuration using `kubernetes` objects. It is essentially a YAML manifest in object form.
        Everything is set here: image source, secrets, env variables, job size, entrypoint, and volumes.

        Returns:
            client.V1Job: object representation of a K8s Job
        """
        # Configure Pod template resources
        ressources = client.V1ResourceRequirements(
            requests={"cpu": 0.1, "memory": "0.1Mi"},
            limits={"cpu": 0.2, "memory": "0.2Mi"},
        )
        # Lot of other stuff to build the config
        #
        #
        # Create the specification of deployment
        spec = client.V1JobSpec(template=template, backoff_limit=4)
        # Instantiate the job object
        job = client.V1Job(
            api_version="batch/v1",
            kind="Job",
            metadata=client.V1ObjectMeta(name=self.job_name),
            spec=spec,
        )
        return job

Everything here works correctly: the memory request and limit are correct when I print the config and deploy the Job on the K8s cluster. But when I change the values directly on the object, I sometimes get weird behavior where the new request/limit is converted to terabytes.


    def update_ressources_limits(
        self, cpu_limit: float = 0.2, memory_limit: float = 0.2
    ) -> bool:
        """Update Ressource limits of Job.

        Args:
            cpu_limit (float): Number of cpu for the Job
            memory_limit (float): Number of memory for the job in Mi.

        Returns:
            bool : Config updated else failed to update
        """
        if (
            not isinstance(cpu_limit, float)
            or not isinstance(memory_limit, float)
            or cpu_limit <= 0.0
            or memory_limit <= 0.0
        ):
            logging.error(
                "cpu_limit and memory_limit must be float and greater than 0 : cpu_limit %s, memory_limit %s",
                cpu_limit,
                memory_limit,
            )
            return False
        self.config.spec.template.spec.containers[0].resources.limits["cpu"] = cpu_limit
        self.config.spec.template.spec.containers[0].resources.limits[
            "memory"
        ] = f"{memory_limit}Mi"
        return True

I also have an update_ressources_requests method that does the same thing, but with self.config.spec.template.spec.containers[0].resources.requests["cpu"] = cpu_request

I'm making sure that I have a positive float when using this method.

k8s_job = KubernetesJob(job_name=id)

k8s_job.update_ressources_requests(0.15, 0.15) # 0.15Mi => 157286400m
k8s_job.update_ressources_limits(0.18, 0.18) #  0.18Mi => 188743680m

    resources:
      limits:
        cpu: 180m
        memory: 188743680m
      requests:
        cpu: 150m
        memory: 157286400m

When I change Mi to Gi, only 0.18Gi goes crazy and gives me terabytes. But if I remove the i and use M or G, I always get the desired values in my deployment.

I did not expect that changing the object (json/dict) directly could lead to such weird behavior.
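The reported numbers line up with how Kubernetes canonicalizes quantities: a fractional binary quantity such as 0.15Mi is not a whole number of bytes, so it ends up expressed in millibytes (the `m` suffix). A quick arithmetic check, plain Python with no cluster involved:

```python
MI = 1024 ** 2  # bytes per Mi (binary suffix)

# 0.15Mi is 157286.4 bytes -- not an integer number of bytes,
# so it can only be represented exactly in millibytes.
assert round(0.15 * MI * 1000) == 157286400  # matches "157286400m"
assert round(0.18 * MI * 1000) == 188743680  # matches "188743680m"

# Decimal suffixes divide evenly, which is why plain M behaves:
# 0.15M is exactly 150000 bytes.
assert round(0.15 * 10 ** 6) == 150000
```

This also explains why dropping the `i` helps: 0.15M is an integer byte count, while 0.15Mi is not.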

Environment:

  • Ubuntu 18.04
  • Python 3.6.4
  • kubernetes==23.3.0

Uranium2 avatar Apr 21 '22 07:04 Uranium2

Could you verify if kubectl gives you the expected behavior? We're wondering if this could be changed by the server.

roycaihw avatar Apr 25 '22 16:04 roycaihw

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 24 '22 17:07 k8s-triage-robot

After testing: when I export my Python template to YAML, it outputs the correct values, 0.15Gi and 0.18Gi.

But when I use kubectl to check the status of the Job, it asks for 157286400m and 188743680m in requests and limits. So the issue does not seem to be related to the Python kubernetes client:

https://github.com/kubernetes/kubectl/issues/1250
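A possible workaround (my suggestion, not something proposed in the thread) is to avoid fractional binary quantities entirely by rounding to a whole number of Ki before building the spec; the helper name below is illustrative:

```python
def memory_quantity(memory_mi: float) -> str:
    """Convert a fractional Mi value to a whole number of Ki so the
    quantity is an integer byte count and never canonicalized to 'm'."""
    return f"{round(memory_mi * 1024)}Ki"

print(memory_quantity(0.15))  # 154Ki (0.15Mi = 153.6Ki, rounded)
print(memory_quantity(0.18))  # 184Ki (0.18Mi = 184.32Ki, rounded)
```

The resulting string can be assigned to `resources.limits["memory"]` exactly like the `f"{memory_limit}Mi"` value in update_ressources_limits above.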

Uranium2 avatar Jul 25 '22 14:07 Uranium2

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 24 '22 14:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Sep 23 '22 15:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 23 '22 15:09 k8s-ci-robot