dev_container

Open fsx950223 opened this issue 3 years ago • 10 comments

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow version and how it was installed (source or binary):
  • TensorFlow-Addons version and how it was installed (source or binary):
  • Python version:
  • Is GPU used? (yes/no): no

Describe the bug

Why is tfaddons/dev_container:latest-cpu so big (6.3GB)? It contains some CUDA layers and many apt-update layers, which increase the image size.
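One way to check where the size comes from is to total the layer sizes reported by `docker history`. The following is a minimal Python sketch, not part of the Addons tooling. It assumes the input was produced with `docker history --human=false --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' tfaddons/dev_container:latest-cpu`; the helper name `summarize_layers`, the keyword list, and the sample rows are made up for illustration.

```python
from collections import defaultdict

# Hypothetical keyword buckets; adjust to the layers you care about.
KEYWORDS = ("cuda", "apt-get update")


def summarize_layers(history_text, keywords=KEYWORDS):
    """Sum layer sizes (bytes) per keyword found in the creating command.

    Expects one layer per line: size in bytes, a tab, then the command.
    Layers matching no keyword are counted under "other".
    """
    totals = defaultdict(int)
    for line in history_text.splitlines():
        if not line.strip():
            continue
        size_str, _, command = line.partition("\t")
        size = int(size_str)
        command = command.lower()
        matched = [k for k in keywords if k in command]
        # A layer matching several keywords is attributed to the first one only.
        totals[matched[0] if matched else "other"] += size
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample rows, not real data from the image.
    sample = (
        "1000\tRUN apt-get update && apt-get install -y cuda-toolkit\n"
        "200\tRUN apt-get update && apt-get install -y git\n"
        "50\tCOPY setup.sh /setup.sh\n"
    )
    print(summarize_layers(sample))
```

Attributing each layer to a single bucket keeps the totals from double counting, at the cost of hiding layers that do both things at once.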

Code to reproduce the issue

Create a codespace.

Other info / logs

https://hub.docker.com/layers/tfaddons/dev_container/latest-cpu/images/sha256-e97c0a51c9da13134b9e4f2a27aeee662def8e77ced84224f4dcd90e00cc18d3?context=explore

2021-11-25T06:01:15: [2021-11-25T06:01:15.292Z] failed to register layer: ApplyLayer exit status 1 stdout:  stderr: write /usr/local/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: no space left on device
Error: Command failed: docker build -f /var/lib/docker/codespacemount/workspace/addons/.devcontainer/Dockerfile -t vsc-addons-7dc239d633fc90a0907165f6f5d2c6fb /var/lib/docker/codespacemount/workspace/addons/.devcontainer
    at A7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:7786)
    at async T7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:6090)
    at async pF (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:2407)
    at async o7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:312:10911)
    at async n3 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:3255)
    at async dae (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:22780)
    at async fae (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:22376)
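The `no space left on device` error above means the image layers did not fit on the disk available to the build. As a rough pre-flight check, the sketch below compares free disk space with the image size using Python's `shutil.disk_usage`. The function name `has_room_for_image`, the headroom factor, and the decimal reading of the 6.3GB figure are assumptions for illustration; the real requirement depends on how Docker stores downloaded and unpacked layers.

```python
import shutil

GB = 10**9  # decimal gigabyte; an assumption about how the 6.3GB figure is counted


def has_room_for_image(free_bytes, image_bytes, headroom=2.0):
    """Return True if free space covers the image size times a headroom factor.

    The headroom factor is a guess: during a pull, downloaded and
    extracted layers can occupy disk at the same time.
    """
    return free_bytes >= image_bytes * headroom


if __name__ == "__main__":
    free = shutil.disk_usage("/").free  # free bytes on the root filesystem
    image = int(6.3 * GB)               # image size reported in this issue
    print(f"free: {free / GB:.1f} GB, room for image: {has_room_for_image(free, image)}")
```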


fsx950223 avatar Nov 25 '21 06:11 fsx950223

Yes, with free/eval Codespaces the disk space is limited, but the main issues are:

https://github.com/tensorflow/addons/pull/2598#issuecomment-969821170

And https://github.com/tensorflow/addons/pull/2515

https://discuss.tensorflow.org/t/adopting-open-source-dockerfiles-for-official-tf-nightly-ci/6050/4

/cc @seanpmorgan

bhack avatar Nov 25 '21 11:11 bhack

IMO, the codespace container should use a different image instead of the devops image.

fsx950223 avatar Nov 25 '21 12:11 fsx950223

Why do we need multipython in the codespace?

fsx950223 avatar Nov 25 '21 12:11 fsx950223

The point is to have the same developer container as the one we use in CI, so that developing TF Addons and automatically validating it in CI happen in almost the same environment, without much risk of the two getting out of sync. This kind of drift seems to happen quite often, sooner or later, when you have two independent envs.

But we no longer have any CPU image, including in the new TF Docker refactoring effort.

.devcontainer is not only about Codespaces but also about development in VS Code on the developer's own resources, which is why I've added some commented-out lines to enable GPU options. But we don't have a valid upstream CPU image, and the custom-op TF images are unmaintained (see the forum thread mentioned above).

bhack avatar Nov 25 '21 12:11 bhack

Could we specify a different .devcontainer for different envs? We could separate the latest-cpu and latest-gpu Docker images. We could run CI/CD without .devcontainer before it was added.

fsx950223 avatar Nov 25 '21 13:11 fsx950223

"We could separate latest-cpu and latest-gpu docker images."

As you can see, the image type is already an arg controlled by .devcontainer:

https://github.com/tensorflow/addons/blob/41eaa27d49025c02bfe9520d5e63e1f01a782ddf/.devcontainer/Dockerfile#L1

The problem is that the image on our (Addons) Docker Hub registry has been de facto a GPU one since https://github.com/tensorflow/addons/pull/2598#issuecomment-969821170 was merged.

I've prepared an upstream PR to start to separate baseline (CPU) and CUDA layers: https://github.com/tensorflow/build/pull/47

We still need to work with comments in the same single .devcontainer, as VS Code/Codespaces still only really works with the default .devcontainer.

See more at: https://github.com/microsoft/vscode-remote-release/issues/1165 https://github.com/microsoft/vscode-remote-release/issues/3279

bhack avatar Nov 25 '21 14:11 bhack

TF doesn't want to accept the contribution of an intermediate CPU target based on a small refactoring of their own new recipe: https://github.com/tensorflow/build/pull/47#issuecomment-981990104

So when we merge @seanpmorgan's (and my) https://github.com/tensorflow/addons/pull/2515 we will still have all the CUDA layer overhead.

I will accept any suggestion, but on my side I don't want to maintain multiple diverging Dockerfile recipes between the Addons devel env and the Addons CI.

bhack avatar Nov 29 '21 21:11 bhack

/cc @yarri-oss

bhack avatar Dec 01 '21 23:12 bhack

Discussed this in the grooming meeting. It's certainly something we want supported for Addons, but we're not willing to build our own containers given that the custom-op image is no longer supported. Let's bring this up at the next SIG Build meeting to see if we can get any traction.

seanpmorgan avatar Dec 16 '21 19:12 seanpmorgan

We discussed this in today's SIG Build meeting, but it seems that the review of https://github.com/tensorflow/build/pull/47#issuecomment-981990104 could not go ahead.

bhack avatar Jan 11 '22 22:01 bhack

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision: TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA: Keras, Keras-CV, Keras-NLP

seanpmorgan avatar Mar 01 '23 04:03 seanpmorgan