peridot
Incorrect behavior in k8s.bash?
Describe The Bug
Hello everyone,
I am currently trying to set up peridot on a multi-node kubernetes cluster, but I'm stuck where the instructions say to execute hack/setup_base_internal_services.
The output of the command is something like the following.
[...]
parse error: Invalid literal at line 1, column 13
Error from server (NotFound): namespaces "registry-secret" not found
error: no objects passed to apply
Error from server (BadRequest): error when creating "hydra/deploy/public/003-deployment.yaml": Deployment in version "v1" cannot be handled as a Deployment: strict decoding error: unknown field "spec.template.spec.containers[0].ports[0].expose", unknown field "spec.template.spec.containers[0].ports[0].external", unknown field "spec.template.spec.containers[0].ports[1].expose", unknown field "spec.template.spec.containers[0].ports[1].external"
[...]
What caught my eye is that the parse error looks a lot like something jq or yq would print when asked to parse input that is not JSON or YAML, so I dug a bit deeper into the script. It seems that the output is coming from rules_resf/internal/k8s/k8s.bash, which in turn is executed by the first bazel command, bazel run --platforms @io_bazel_rules_go//go/toolchain:linux_"$ARCH" //hydra/deploy/public:public.apply.
The problematic pipe is the following.
COPY_TO_NS=$(echo "{$(cat ${i} | grep "namespace" | head -n 1)}" | jq -r '.namespace' | tr -d '\n')
The value of $i is the path of one of the four YAML files in bazel-bin/hydra/deploy/public, and I'm guessing the call is attempting to parse the namespace from the YAML files. Now, grepping for "namespace" in any of those files will likely return a line like
namespace: "foobar"
which is not valid JSON, even after the script wraps it in braces, because the key is unquoted. The jq call therefore cannot succeed. I simplified the command and changed it to use yq instead, which seems to solve at least one of the problems (there should also be a cleaner solution that does not need grep).
COPY_TO_NS=$(grep -m 1 "namespace:" "$i" | yq -r '.namespace')
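For what it's worth, a grep-free variant could look roughly like the following. This is only a sketch: extract_namespace is a hypothetical helper name, not something that exists in k8s.bash, and it assumes the manifest is plain block-style YAML with the namespace key and its value on the same line. It uses awk rather than jq or yq so that it does not depend on which yq implementation happens to be installed.

```shell
# Hypothetical helper (not part of k8s.bash): print the value of the first
# "namespace:" key found in the YAML file given as $1.
# Assumes block-style YAML with key and value on the same line.
extract_namespace() {
    awk '/^[[:space:]]*namespace:/ {
        value = $2
        gsub(/"/, "", value)   # drop surrounding double quotes, if any
        print value
        exit                   # only the first match is of interest
    }' "$1"
}

# Usage inside the loop, with $i being the path of a generated manifest:
# COPY_TO_NS=$(extract_namespace "$i")
```

Like the grep version, this only looks at the first match, so a file with several documents in different namespaces would still need a real YAML parser.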
However, even with that line fixed, the script does not succeed because it cannot query a secret from kubectl. The problematic line is the following.
kubectl -n "registry-secret${STABLE_STAGE}" get secret registry -o json | jq ".metadata.namespace=\"${COPY_TO_NS}\"" | kubectl apply --force -f -
This command attempts to fetch the secret called registry from a namespace whose name starts with registry-secret. There is no such namespace in my cluster, and there is no secret called registry in any of the other namespaces either. I have a secret called mlbuild-secret in the default namespace. Maybe the script is supposed to query this secret instead? My username is mlbuild, and there is also a namespace called mlbuild-dev, so this would make sense. On the other hand I can't rule out that the namespaces and secrets in my cluster haven't been set up correctly. Could anybody please shed some light on this?
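To make this failure easier to spot, the copy step could check for the source secret before piping anything into kubectl apply. The following is a sketch under assumptions, not the script's actual code: copy_registry_secret is a hypothetical function name, and the source and destination namespaces are passed in as arguments because I don't know which namespace is supposed to hold the registry secret.

```shell
# Hypothetical wrapper around the copy step from k8s.bash.
# $1: namespace expected to contain the secret "registry"
# $2: namespace the secret should be copied into
copy_registry_secret() {
    local src_ns="$1" dst_ns="$2"

    # Fail early with a readable message instead of letting jq and
    # "kubectl apply" run on empty input.
    if ! kubectl -n "$src_ns" get secret registry >/dev/null 2>&1; then
        echo "secret 'registry' not found in namespace '$src_ns'" >&2
        return 1
    fi

    # Same pipeline as the original line, with the target namespace passed
    # to jq via --arg instead of being interpolated into the filter.
    kubectl -n "$src_ns" get secret registry -o json \
        | jq --arg ns "$dst_ns" '.metadata.namespace = $ns' \
        | kubectl apply --force -f -
}

# Usage, mirroring the original call:
# copy_registry_secret "registry-secret${STABLE_STAGE}" "$COPY_TO_NS"
```

This does not fix the missing namespace, but it would replace the "no objects passed to apply" error with a message that names the namespace the script looked in.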
Thank you!
Reproduction Steps
- Set up a kubernetes cluster
- Follow the installation instructions until the step where it says to execute hack/setup_base_internal_services
Expected Behavior
The script completes without errors.
Version and Build Information
HEAD is at 8222ab2f43a330bf200017f9f77205983f46de9c
Additional context
No response
Hi @m10k - Thank you for the report. The setup process is a bit of a pain point right now, but we're working on porting in some changes we use on another project which allow for a single-command setup of the development environment. We're hoping to merge that change in the next couple of months.
However, for now, let's see if we can get your setup running. I think it is complaining that you don't have a secret for hydra. You can create one as follows:
kubectl -n "$USER-dev" create secret generic server --from-literal=hydra-secret="$(export LC_CTYPE=C; cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)" --from-literal=byc-secret="$(export LC_CTYPE=C; cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)"
Hey @NeilHanlon, thank you for your response!
I tried running the command that you posted, but unfortunately setup_base_internal_services still fails with the same error.
I noticed that I already have a secret for hydra in my mlbuild-dev namespace, though. To be honest, I don't quite understand what the script does, but I got the feeling that it is moving secrets from one namespace to another. Is it necessary to copy this secret to the default namespace?
I have the following secrets in mlbuild-dev:
mlbuild@k8s:~/peridot$ kubectl -n mlbuild-dev get secrets
NAME     TYPE     DATA   AGE
env      Opaque   1      7d17h
hydra    Opaque   2      9d
server   Opaque   2      23h
And these are in the default namespace:
mlbuild@k8s:~/peridot$ kubectl get secrets
NAME                               TYPE                                  DATA   AGE
hydra                              Opaque                                2      21h
minio                              Opaque                                3      9d
mlbuild-secret                     kubernetes.io/service-account-token   3      10d
postgres-postgresql                Opaque                                1      9d
sh.helm.release.v1.localstack.v1   helm.sh/release.v1                    1      9d
sh.helm.release.v1.localstack.v2   helm.sh/release.v1                    1      9d
sh.helm.release.v1.minio.v1        helm.sh/release.v1                    1      9d
sh.helm.release.v1.postgres.v1     helm.sh/release.v1                    1      9d
sh.helm.release.v1.temporal.v1     helm.sh/release.v1                    1      9d
temporal-default-store             Opaque                                1      9d
temporal-visibility-store          Opaque                                1      9d
Is there any other information I can provide that might help figure out what's going on?
I'll chime in that I'm hitting this as well while following the instructions for working with docker-desktop, using the latest top-of-tree peridot from git. I ran the command suggested in https://github.com/rocky-linux/peridot/issues/73#issuecomment-1332189196, but it is seemingly not getting picked up by the bazel public or deploy steps.