vcluster icon indicating copy to clipboard operation
vcluster copied to clipboard

Job names can have characters not allowed in Pod names, so fails when synced.

Open GrahamDumpleton opened this issue 4 years ago • 4 comments

A Kubernetes Job name appears to need to follow DNS Subdomain Names rules.

  • contain no more than 253 characters
  • contain only lowercase alphanumeric characters, '-' or '.'
  • start with an alphanumeric character
  • end with an alphanumeric character

A Kubernetes Pod name when created directly appears to need to follow RFC 1123 Label Names rules.

  • contain at most 63 characters
  • contain only lowercase alphanumeric characters or '-'
  • start with an alphanumeric character
  • end with an alphanumeric character

When a Kubernetes Job is translated by the syncer to a Pod, it uses the original Job name for the Pod name, meaning that it can fail in the translation.

For example, original Job of:

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job.extra
spec:
  template:
    spec:
      containers:
      - command:
        - date
        image: busybox
        name: my-container
      restartPolicy: Never

can be created, but the Pod created fails with:

$ k get events
LAST SEEN   TYPE      REASON             OBJECT                   MESSAGE
4m53s       Normal    SuccessfulCreate   job/my-job.extra         Created pod: my-job.extra-rzkq8
21s         Warning   SyncError          pod/my-job.extra-rzkq8   Error syncing to physical cluster: Pod "my-job.extra-rzkq8-x-default-x-lab-tce-w07-s004-vcluster" is invalid: spec.hostname: Invalid value: "my-job.extra-rzkq8": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')

The syncer should encode the Pod name created from a Job to ensure that characters are valid and name length not exceeded.

Note that it seems that in an ordinary cluster, even though one can't use a "." in a Pod name, a Pod name generated indirectly via a Job can. Because though the syncer does it directly it must follow the more stringent rule.

GrahamDumpleton avatar Sep 13 '21 05:09 GrahamDumpleton

Very important topic. Thanks for opening this issue. I have thought about the length exceeding issue for any synced resource a few times before. Here is one idea I came up with: Potentially we should check if the sycned name would exceed the character limit and then instead of using the usual [name]-x-[vc-namespace]-[vc-name], we could fallback to [sha256-of-name]-x-[vc-namespace]-[vc-name] which would allow for a much higher character limit and would only be an issue when the namespace or vcluster name is very long which seems like an edge case. Long pod names, however, are very very common. I'm a fan of showing the actual name though, so I don't think the hash-solution should be the default but rather a fallback.

LukasGentele avatar Sep 13 '21 05:09 LukasGentele

Name length is definitely a secondary issue in this case, but the primary reason it failed was that pesky ".". :-(

GrahamDumpleton avatar Sep 13 '21 05:09 GrahamDumpleton

Definitely. Just wanted to throw that in there since it may be a good occasion to generally talk about naming strategy for sync and how to address potential issues.

LukasGentele avatar Sep 13 '21 05:09 LukasGentele

I actually thought you already had some strategy for long names where it would truncate them at a certain point and replace just the trailing part of the name with a hash value. Couldn't you use that approach but apply it to just that segment of the generated name. That way could preserve as much of leading part of the name as possible.

GrahamDumpleton avatar Sep 13 '21 06:09 GrahamDumpleton

It seems like this was fixed in #135 - closing.

matskiv avatar Oct 17 '22 11:10 matskiv