Copying a file to a container before starting it

Open rcgeorge23 opened this issue 2 years ago • 11 comments

Hi @joyrex2001,

We have a container (debezium) that we would like to copy some files into before starting it.

We were using Testcontainers' withCopyToContainer to copy these files in, which worked locally but did not work in k8s with kubedock. I found this thread where the same issue was being discussed:

https://github.com/joyrex2001/kubedock/issues/1

So I updated our test to use withFileSystemBind instead, and while this worked locally it unfortunately didn't seem to work in k8s either.

While investigating the issue, I noticed that the debezium connector pod has a kubedock sidecar:

NAME↑  PF  IMAGE                                                        READY  STATE             INIT   RESTARTS  PROBES(L:R)  CPU  MEM  CPU/R:L  MEM/R:L  %CPU/R  %CPU/L  %MEM/R  %MEM/L  PORTS             AGE
main   ●   eu.gcr.io/my-company/docker.io/debezium/connect:1.9.6.Final  false  PodInitializing   false  0         off:off      0    0    1000:0   1000:0   0       n/a     0       n/a     kd-tcp-8083:8083  6m32s
setup  ●   joyrex2001/kubedock:0.10.0                                   false  ImagePullBackOff  true   0         off:off      0    0    10:0     128:0    0       n/a     0       n/a                       6m32s

It looks like this is to do with the way the configmap is created.

As you can see the image cannot be pulled -- this is because <my company> is quite a large bank, so we very quickly get rate limited by docker hub when trying to pull images (all traffic from our infrastructure comes from a small range of IP addresses from the wider internet's perspective).

Our solution to being rate limited is to cache images from docker hub in our own image repo (eu.gcr.io/my-company/...), and reference those ones instead of the public ones from docker hub. Is it possible to tell kubedock to use the cached GCR kubedock image when it spins up the sidecar instead of the docker hub one?

Thanks!

rcgeorge23 avatar Oct 19 '23 08:10 rcgeorge23

The --initimage argument allows you to override the image that is used for the init-container. In your case, it should probably be something like --initimage eu.gcr.io/my-company/docker.io/joyrex2001/kubedock:0.10.0, assuming you pushed the image to that registry as well. Note that in this case, any image with tar available should work (as it's only used for copying files to a shared volume).
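
As a rough illustration of the above, here is a minimal sketch in plain Java that derives the mirrored image reference and the matching argument list. The class and method names are hypothetical, and it assumes kubedock is started via its `server` subcommand; only the --initimage flag and the image names come from this thread.

```java
import java.util.List;

/** Sketch: derive a mirrored image reference and the matching kubedock arguments. */
final class InitImageArgs {

    /** Prefixes a Docker Hub image with the path of a mirror registry (hypothetical helper). */
    static String mirrored(String mirrorPrefix, String image) {
        // Tolerate a trailing slash on the prefix so we never emit "//".
        String prefix = mirrorPrefix.endsWith("/")
                ? mirrorPrefix.substring(0, mirrorPrefix.length() - 1)
                : mirrorPrefix;
        return prefix + "/" + image;
    }

    /** Builds the argument list for starting kubedock with a custom init image. */
    static List<String> serverArgs(String initImage) {
        return List.of("kubedock", "server", "--initimage", initImage);
    }

    public static void main(String[] args) {
        String image = mirrored("eu.gcr.io/my-company/docker.io", "joyrex2001/kubedock:0.10.0");
        System.out.println(String.join(" ", serverArgs(image)));
    }
}
```

The resulting list can be handed to whatever launches kubedock in your pipeline (for example a ProcessBuilder or a generated pod spec).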

joyrex2001 avatar Oct 19 '23 17:10 joyrex2001

Thanks, will give this a try tomorrow

rcgeorge23 avatar Oct 19 '23 19:10 rcgeorge23

The good news is that --initimage works as expected: I can now see that we're successfully pulling our cached kubedock image. However, I don't see the configmap being created.

I also can't see the kubedock sidecar logging anything.

Any suggestions for debugging this? I notice that kubedock seems to use logback. I am not very familiar with Go, but perhaps I can add an environment variable to enable debug logging?

rcgeorge23 avatar Oct 21 '23 07:10 rcgeorge23

When withFileSystemBind is used, kubedock will start an init-container (not a sidecar). It will copy the contents to a volume that is shared with the main container, so once the main container is started, the desired contents are available.

You can increase the log verbosity by adding a --verbose <level> argument (or -v as a short alternative) when kubedock is started. Increasing the level to 2, 3 or even 5 will give more verbose logging of what kubedock is doing under the hood. This was not available as an environment variable before, but in https://github.com/joyrex2001/kubedock/commit/bc27ff5f72947bfdb91d24a7dca27d3bcf1c580d I enabled setting the verbosity level via the environment variable VERBOSITY as well.

What might go wrong is that kubedock can't access the files that you want to copy over. For example, if kubedock is running in a sidecar of some pipeline, you have to make sure that whatever you want to copy over is also available at that exact same location in the sidecar. In the tekton-example you can see that it mounts the source in the sidecar as well, ensuring kubedock can access whatever needs to be copied over.
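
To make the two ways of raising the verbosity concrete, here is a small sketch in plain Java that prepares (but does not start) a kubedock process either with the --verbose flag or with the VERBOSITY environment variable. The class and method names are hypothetical, and the `server` subcommand is an assumption about how kubedock is launched in your setup.

```java
import java.util.List;

/** Sketch: two ways of requesting more verbose kubedock logging. */
final class KubedockVerbosity {

    /** Variant 1: pass the level on the command line. */
    static ProcessBuilder withFlag(int level) {
        return new ProcessBuilder(
                List.of("kubedock", "server", "--verbose", Integer.toString(level)));
    }

    /** Variant 2: pass the level via the VERBOSITY environment variable. */
    static ProcessBuilder withEnv(int level) {
        ProcessBuilder pb = new ProcessBuilder(List.of("kubedock", "server"));
        // environment() is a mutable copy of the current environment for the child process.
        pb.environment().put("VERBOSITY", Integer.toString(level));
        return pb;
    }

    public static void main(String[] args) {
        System.out.println(withFlag(3).command());
        System.out.println("VERBOSITY=" + withEnv(3).environment().get("VERBOSITY"));
    }
}
```

The environment variable variant is handy when the command line is fixed, e.g. in a shared pipeline template where only the env section can be overridden.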

joyrex2001 avatar Oct 21 '23 10:10 joyrex2001

Some testcontainers also annoyingly try to copy before starting, e.g.:

  • https://github.com/testcontainers/testcontainers-java/blob/main/modules/kafka/src/main/java/org/testcontainers/containers/KafkaContainer.java#L199
  • https://github.com/testcontainers/testcontainers-dotnet/blob/e3be24f8256c21be4525e8abae623695c8bb7fb2/src/Testcontainers.Kafka/KafkaBuilder.cs#L87
  • https://github.com/testcontainers/testcontainers-dotnet/blob/e3be24f8256c21be4525e8abae623695c8bb7fb2/src/Testcontainers.Redpanda/RedpandaBuilder.cs#L64

I'll try rewriting them to use withFileSystemBind instead, but I wonder if kubedock could somehow do that behind the scenes when asked to copy to a container that hasn't been started. I guess kubedock starts the pod when it gets a docker create command, so no?

mausch avatar Dec 19 '23 08:12 mausch

One thing you can try is to start kubedock with --pre-archive, which will make a configmap for all files that are copied before the container is started.

Looking at the redpanda one: that seems to wait before starting the container until the file is actually present, which is a pattern that should work.

https://github.com/testcontainers/testcontainers-dotnet/blob/e3be24f8256c21be4525e8abae623695c8bb7fb2/src/Testcontainers.Redpanda/RedpandaBuilder.cs#L52C1-L53C1

joyrex2001 avatar Dec 19 '23 17:12 joyrex2001

Thanks, --pre-archive seems to be exactly what I need. Unfortunately it doesn't seem to work for the confluentinc/cp-kafka:7.5.1 image... In the kubedock logs I see:

I1220 10:55:38.422757   69123 copy.go:30] copy archive to 5d04992d0f47:/
I1220 10:55:38.422964   69123 exec.go:59] exec kubedock-elevate-app-tests-kafka-5d04992d0f47:[tar -xf - -C /]
E1220 10:55:38.527455   69123 util.go:17] error during request[500]: command terminated with exit code 2

I see that's what kubedock does to unpack the files within the container: https://github.com/joyrex2001/kubedock/blob/5b305662ad3cbe45beb8d0f73f063327d27e1e47/internal/backend/copy.go#L37

If I exec into the container and run that manually (obviously I don't have the actual stdin input though):

$ tar -xf - -C /
tar: Refusing to read archive contents from terminal (missing -f option?)
tar: Error is not recoverable: exiting now
$ echo $?
2
$ tar --version
tar (GNU tar) 1.30

Also I don't see any configmaps (unless kubedock deletes them right after untar?)

mausch avatar Dec 20 '23 11:12 mausch

Hi, any solution? I have the same problem.

jkurek1 avatar Feb 06 '24 13:02 jkurek1

The error occurs because the user that is executing the tar command in the kafka pod (appuser, uid 1000) does not have permission to write to /. This causes tar to exit with exit code 2.

Unfortunately, this is not something that can be fixed in kubedock. The configuration should be copied to another location instead, which requires a changed (or maybe custom) kafka testcontainer.
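
The underlying constraint is simply that extraction only succeeds in a directory the executing user may write to. As a minimal sketch of that check in plain Java (not kubedock's implementation; the class name, method name and candidate paths are hypothetical), one could pick a copy target like this:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

/** Sketch: choose a copy target that the current user is allowed to write to. */
final class ExtractionTarget {

    /** Returns the first candidate that is an existing directory writable by the current user. */
    static Optional<Path> firstWritable(List<Path> candidates) {
        return candidates.stream()
                .filter(Files::isDirectory)
                .filter(Files::isWritable)
                .findFirst();
    }

    public static void main(String[] args) {
        // "/" is typically not writable for a non-root user such as appuser,
        // so the temp directory is usually the one that gets selected.
        List<Path> candidates =
                List.of(Path.of("/"), Path.of(System.getProperty("java.io.tmpdir")));
        System.out.println(
                firstWritable(candidates).map(Path::toString).orElse("no writable target"));
    }
}
```

Run inside the container as the container's user, a check like this shows quickly whether a given target directory would make the tar extraction fail.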

joyrex2001 avatar Feb 07 '24 16:02 joyrex2001

I had missed this ticket until now, and ended up with this workaround:

withCreateContainerCmdModifier(cmd -> {
  // Force the "root" user, so that STARTER_SCRIPT can be written to / and is then run by "root" as "appuser".
  cmd.withUser("0");
});
/*
 * Override the default command, injecting "su appuser -c",
 * in order to work with kubedock, where copying STARTER_SCRIPT to /
 * needs "root" permissions:
 * while the original Docker API operates as root for the "docker cp" command,
 * its kubedock counterpart (a tar command) runs as the user of the container.
 * OTOH the startup scripts are meant for "appuser".
 */
setCommand("-c", "while [ ! -f " + STARTER_SCRIPT + " ]; do sleep 0.1; done; su appuser -c " + STARTER_SCRIPT);

where STARTER_SCRIPT value is copied from the original (private) constant.

I really don't like it, but at least it works for confluentinc/cp-kafka:7.0.9 both on microk8s and plain docker, along with testcontainers 1.19.5 and 1.19.1.

That said, I'm still looking for a better way to get it working.

davidecavestro avatar Feb 21 '24 09:02 davidecavestro

Thank you @davidecavestro for your temporary solution. I was able to adopt your approach, but had difficulty understanding some parts until I figured them out.

To make it clearer for others: the variable STARTER_SCRIPT refers to https://github.com/testcontainers/testcontainers-java/blob/main/modules/kafka/src/main/java/org/testcontainers/containers/KafkaContainer.java#L42, which currently has the value /testcontainers_start.sh.

Furthermore, your code above:

setCommand("-c", "while [ ! -f " + STARTER_SCRIPT + " ]; do sleep 0.1; done; su appuser -c " + STARTER_SCRIPT);

didn't work for me. I had to do it slightly differently, with the shell executable (/bin/sh) as the first parameter:

setCommand("/bin/sh", "-c", "while [ ! -f " + STARTER_SCRIPT + " ]; do sleep 0.1; done; su appuser -c " + STARTER_SCRIPT);
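
For anyone who wants to keep this in one place, here is a sketch in plain Java that builds the full command array for the variant with the explicit shell. The class and method names are hypothetical; the resulting array is meant to be passed to setCommand as in the snippets above. Naming the shell explicitly presumably avoids relying on whatever entrypoint the container happens to have configured.

```java
/** Sketch: build the "wait for the starter script, then run it as another user" command. */
final class StarterCommand {

    /** Returns {"/bin/sh", "-c", "<wait loop>; su <user> -c <script>"}. */
    static String[] waitThenRunAs(String script, String user) {
        // Poll until the script has been copied into the container.
        String waitLoop = "while [ ! -f " + script + " ]; do sleep 0.1; done; ";
        // Then drop from root to the intended user to run it.
        String run = "su " + user + " -c " + script;
        return new String[] {"/bin/sh", "-c", waitLoop + run};
    }

    public static void main(String[] args) {
        String[] cmd = waitThenRunAs("/testcontainers_start.sh", "appuser");
        System.out.println(String.join(" | ", cmd));
    }
}
```

Usage would then be something like setCommand(StarterCommand.waitThenRunAs(STARTER_SCRIPT, "appuser")), assuming STARTER_SCRIPT holds the value mentioned above.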

Pallau avatar Apr 11 '24 19:04 Pallau