autoware icon indicating copy to clipboard operation
autoware copied to clipboard

feat(ci,docker): change `vcs import` to `COPY src`

Open youtalk opened this issue 1 year ago • 13 comments

Description

  • [x] #4712

This PR modifies the process by running vcs import on outside the Dockerfile and then copying the src to the Dockerfile. This change resolves the issues https://github.com/autowarefoundation/autoware/pull/4730#issuecomment-2116788839 and ensures that docker build is correctly rerun when the source code is modified.

I checked the Autoware Documentation and found that no documentation changes were necessary due to this modification. https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/

Tests performed

Not applicable.

Effects on system behavior

Not applicable.

Interface changes

Pre-review checklist for the PR author

The PR author must check the checkboxes below when creating the PR.

In-review checklist for the PR reviewers

The PR reviewers must check the checkboxes below before approval.

Post-review checklist for the PR author

The PR author must check the checkboxes below before merging.

  • [ ] There are no open discussions or they are tracked via tickets.

After all checkboxes are checked, anyone who has write access can merge the PR.

youtalk avatar May 20 '24 06:05 youtalk

@mitsudome-r @oguzkaganozt @xmfcx Please review this.

youtalk avatar May 20 '24 13:05 youtalk

Also after merging this PR, we would need to test docker-build-and-push with this PR's branch then we can actually test out the build mechanism with this new caching. Otherwise both are looking great thanks @youtalk !

oguzkaganozt avatar May 21 '24 11:05 oguzkaganozt

@oguzkaganozt Thank you for your review! I'm checking docker-build-and-push https://github.com/autowarefoundation/autoware/actions/runs/9174270335.

youtalk avatar May 21 '24 11:05 youtalk

no-cuda was success but cuda was failed.

https://github.com/autowarefoundation/autoware/actions/runs/9174270335/job/25224777460#step:7:16258

ERROR: failed to solve: failed to copy to tar: rpc error: code = Unknown desc = write /tmp/devel.tar: no space left on device

I think it was probably exceeding the disk size when saving the container image as a tar file. I'll consider countermeasures.

youtalk avatar May 22 '24 00:05 youtalk

I hope the problem would be fixed. https://github.com/autowarefoundation/autoware/pull/4738/commits/6a9398d3cf43100aea729c4a3eed61147ecbf9ed I'm checking docker-build-and-push again. https://github.com/autowarefoundation/autoware/actions/runs/9183845791

@oguzkaganozt By the way, why is the Upload Artifact step necessary to verify the docker build? It seems sufficient if Build and Save Artifacts is successful. Since more than 10 GB needs to be uploaded, it also seems to be taking time to transfer.

youtalk avatar May 22 '24 00:05 youtalk

LGTM, but we need to change the documentation of building containers from scratch section because the user would need to dovcs import before building the containers after merging this PR.

This is exactly why you should not merge this PR. The whole idea of containers is to reduce dependencies to streamline build and deployment. The user must be able to build using standard container tooling only.

Please do not introduce more technical debt for nothing.

doganulus avatar May 22 '24 07:05 doganulus

@oguzkaganozt

Please do not introduce more technical debt for nothing.

The key motivation behind this PR is that @youtalk is currently seeking for a method to fasten Docker build process so that we can utilize docker images for developing latest features of Autoware. He's trying to use docker build cache, but since it won't detect the change in the code retrieved from vcs import, he proposed to introduce this workaround.

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

Alternatively, we could use Git submodule to simplify the command itself, but that would make a too big of a impact at this stage compared to the benefits we get.

@doganulus If you can think of any good approach, could you share us your idea?

mitsudome-r avatar May 22 '24 08:05 mitsudome-r

Hi @mitsudome-r, I have shared a snippet of the solution here. It might be missed in a merged PR: https://github.com/autowarefoundation/autoware/pull/4712#issuecomment-2121881833

Regarding this comment:

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

I think otherwise. Autoware should not ask users to install some packages too easily. As a user and developer, I don't want Autoware to mess with my local environment. Hence, the development environment must stay solely inside containers. This PR goes the opposite. That's the whole point.

I oppose setup-dev-env.sh or similar opaque scripts for the same reason. Please evaluate twice when asking users to install things. The same goes for package dependencies. I hope you see what bothers me but it requires an organization-wide understanding rather than individual efforts. I would appreciate it if Autoware became more strict about these matters.

doganulus avatar May 22 '24 10:05 doganulus

I hope the problem would be fixed. 6a9398d I'm checking docker-build-and-push again. https://github.com/autowarefoundation/autoware/actions/runs/9183845791

@oguzkaganozt By the way, why is the Upload Artifact step necessary to verify the docker build? It seems sufficient if Build and Save Artifacts is successful. Since more than 10 GB needs to be uploaded, it also seems to be taking time to transfer.

Uploading of images is necessary because you can test out the produced images without pushing them into the registry. Also it will be the part of whole CI-CD loop in the future by uploading the images into scenario cloud then from there we would test those images with pre-defined scenario simulations.

oguzkaganozt avatar May 22 '24 13:05 oguzkaganozt

@youtalk can rebase your branch with the latest main branch ?

oguzkaganozt avatar May 22 '24 13:05 oguzkaganozt

Hi @mitsudome-r, I have shared a snippet of the solution here. It might be missed in a merged PR: #4712 (comment)

Regarding this comment:

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

I think otherwise. Autoware should not ask users to install some packages too easily. As a user and developer, I don't want Autoware to mess with my local environment. Hence, the development environment must stay solely inside containers. This PR goes the opposite. That's the whole point.

I oppose setup-dev-env.sh or similar opaque scripts for the same reason. Please evaluate twice when asking users to install things. The same goes for package dependencies. I hope you see what bothers me but it requires an organization-wide understanding rather than individual efforts. I would appreciate it if Autoware became more strict about these matters.

@doganulus the thing is, as we are importing all of the source files with an opaque vcs import then doing rosdep install on top of the imported src, that means that there is no direct way of achieving actual docker-caching mechanism if we keep vcs import inside the dockerfile.

So with the new src-imported layer we now have seperate layer just for vcs-import, but the source changes can't be detected again, so now @youtalk completely excluded the vcs-import and copy source files as directory which effectively enables docker-caching.

I don't really know if this solution of yours would still detect source changes, so if you can create another branch with your solution then we can test out and see if excluding vcs import really needed ?

oguzkaganozt avatar May 22 '24 14:05 oguzkaganozt

if you can create another branch with your solution then we can test out and see if excluding vcs import really needed ?

Please describe how you see a user or a developer to use these containers. How frequently? Then we can understand the thing you try to improve makes sense or not.

  • Will the user ever download source code?
  • Will the developer use bind mounts instead of downloading?

What would you need to produce after these efforts? I think these points are not clear.

doganulus avatar May 22 '24 15:05 doganulus

docker image expectations

@doganulus in the first paragraph here you can see the use cases:

  1. The devel image enables you to develop Autoware without setting up the local development environment.
  2. The runtime image contains only runtime executables and enables you to try out Autoware quickly.

I would assume, the devel image workflow should include cloning the autoware repositories and users should be expected to mount the autoware directory. This is to have the changes persistent.

vcstool on the host machine

To me, vcs tool is an alternative to git submodules. Also relatively easy to install: pip install --no-cache-dir vcstool so I see no issues having this much on the host side. git, docker, vcstool

I'm not good at docker caching mechanisms so I cannot comment on if this is the best approach.

At least, it was the 2nd thing that was recommended by chatgpt, I didn't like its first proposal.

What about just changing that run step?

Instead of moving vcs out, would this approach work @youtalk -san?

RUN --mount=type=ssh \
  rm -rf src \
  && mkdir src \
  && vcs import src < autoware.repos \
  && apt-get update \
  && rosdep update \
  && DEBIAN_FRONTEND=noninteractive rosdep install -y --ignore-src --from-paths src --rosdistro "$ROS_DISTRO" \
  && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/* "$HOME"/.cache

Unrelated topic

image

@oguzkaganozt I was looking into https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/#prerequisites and noticed the setup-dev-env.sh doesn't have any "docker" references, so is this documentation wrong or outdated?

xmfcx avatar May 22 '24 16:05 xmfcx

Instead of moving vcs out, would this approach work @youtalk -san?

@xmfcx It has the same cache problem. The RUN command will not be re-executed unless autoware.repos is modified. Note that it will also be re-executed if BASE_IMAGE is changed or if the contents of the COPY files before the RUN command are modified.

youtalk avatar May 23 '24 00:05 youtalk

When executing ./setup-dev-env.sh, by removing the --download-artifacts option, there was enough image size to pass the CI. Instead, we will download and mount the artifacts when executing /docker/run.sh. I'm testing on https://github.com/youtalk/autoware/pull/18

youtalk avatar May 23 '24 00:05 youtalk

@oguzkaganozt https://github.com/autowarefoundation/autoware/pull/4738#issuecomment-2124790959 I understood. Thank you for your explanation.

https://github.com/autowarefoundation/autoware/pull/4738#issuecomment-2124802226 I've merge the latest main branch https://github.com/autowarefoundation/autoware/pull/4738/commits/da2960b0f0a1aa3d265d6e60d60957ec3cfcb68a and restart the docker-build-and-push-main. https://github.com/autowarefoundation/autoware/actions/runs/9200680271

youtalk avatar May 23 '24 01:05 youtalk

Both of no-cuda and cuda were finally succeeded to run docker-build-and-push-main. https://github.com/autowarefoundation/autoware/actions/runs/9200680271/job/25307638386 https://github.com/autowarefoundation/autoware/actions/runs/9200680271/job/25307638224 So, let's merge!

youtalk avatar May 23 '24 04:05 youtalk