torchx icon indicating copy to clipboard operation
torchx copied to clipboard

moved image diff logging to their respective workspace

Open ccharest93 opened this issue 1 year ago • 1 comments

#809 was flawed and didn't change the programming logic. I tried to think of any way for build_workspace_and_update_role() to return whether a change in the built image is detected. There isn't one, so the logging message needs to be printed at the workspace level:

  • tmpDirworkspace -> always creates a directory using workspace
  • dirworkspace -> has no logic to be able to determine if the directory is changed
  • dockerworkspace -> used docker_client to determine whether a new image is built

The alternative would be to not test for changes to the image and always print the same thing, or modify build_workspace_and_update_role() abstract function and its implementations to return a BOOL of whether a change happened.

In docker workspace we add a test whether a cfg.image_repo is provided because local_docker does not define it/ use it (it is accepted as an argument because its a runopts for dockerworkspace)

ccharest93 avatar Feb 11 '24 20:02 ccharest93

After reviewing, the easiest way to not affect other implementations would just be to print an extra message within the docker_workspace. You would then end up with contradictory message when reusing an image however. E.g

torchx 2024-03-28 20:38:29 INFO     Building workspace docker image (this may take a while)...
torchx 2024-03-28 20:38:29 INFO     Reused image with tag `sha256:4aad7a6e24d58bb66404dead0c5b2f91e97acb76875c69e238dbabb9b8cd3be0`
torchx 2024-03-28 20:38:29 INFO     Built new image `sha256:4aad7a6e24d58bb66404dead0c5b2f91e97acb76875c69e238dbabb9b8cd3be0` based on original image `nvcr.io/nvidia/pytorch:23.10-py3` and changes in workspace `#######`

Personally, i no longer require knowing through the console whether or not my image tag was updated or reused. Implementing the log message properly seems to be more likely to cause harm then good. Going forward, i can either:

  1. Modify the PR to just be a change within docker workspace that prints an extra message within it.
  2. I can close this PR, but it would be good to rollback #809 in that case since it added complexity without actually fixing the problem.

ccharest93 avatar Mar 28 '24 20:03 ccharest93