pimod

Feature request: checkpoints

Open zabealbe opened this issue 2 years ago • 3 comments

The build-try-fix-iterate approach does not really work when building a .pimod file: rebuilding an image from scratch because of a typo takes far too much time. A very cool feature would be Dockerfile-like checkpoints, or a CHECKPOINT macro to use in the .pimod file.

zabealbe avatar Dec 06 '21 20:12 zabealbe

I see your point, but I don't know how one could implement this without it becoming too complex for this specific issue.

Back then, we addressed this problem through an iterative approach. Similar to, e.g., Ansible Roles, we had a "basic" Pifile extending the vendor's image. Other Pifiles then consumed the already altered image and performed further changes. This is sketched in Figure 3 on page 7 of the paper.
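
As a rough illustration of this chaining, the two Pifiles below split the work into a slow base step and a fast follow-up step. The file names, the placeholder URL, the size passed to PUMP, and the package name are made up for this sketch; only the FROM, TO, PUMP, and RUN commands are taken from pimod.

```sh
# Base.Pifile (hypothetical name): extend the vendor's image once.
# The URL is a placeholder, not a real download location.
FROM https://example.org/vendor-image.img
TO base.img
PUMP 100M
RUN apt-get update
RUN apt-get install -y some-package
```

```sh
# App.Pifile (hypothetical name): consume the already altered image.
# A typo here only costs a rebuild of this short file, not of Base.Pifile.
FROM base.img
TO app.img
RUN systemctl enable some-service
```

Only the second file needs to be rebuilt while iterating, as long as base.img is kept around.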

Of course I am open for better ideas.

oxzi avatar Dec 07 '21 19:12 oxzi

In this case, creating a checkpoint at every RUN would be very expensive, as we can't implement the layer mechanism Docker has. My idea is to add a CHECKPOINT directive that saves the image in the .cache and restores it as needed.

So basically it's like having multiple pimod files daisy-chained, but done automatically and marked explicitly with a CHECKPOINT directive.
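
A minimal sketch of how such a directive could key its cache, assuming the key is a hash of all Pifile lines up to the checkpoint. None of these function names exist in pimod; `checkpoint_key`, `checkpoint_restore`, `checkpoint_save`, and the `CACHE_DIR` variable are hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch: none of these names exist in pimod.
CACHE_DIR="${CACHE_DIR:-.cache}"

# Hash every Pifile line up to and including the n-th CHECKPOINT line.
# Any edit above the checkpoint changes the key; edits below it do not.
checkpoint_key() {
    pifile="$1"; n="$2"
    awk -v n="$n" '{ print } /^CHECKPOINT/ { if (++seen == n) exit }' "$pifile" \
        | sha256sum | cut -d' ' -f1
}

# Copy a cached image over the working image if one exists for this key.
# Returns 0 on a cache hit and non-zero on a miss.
checkpoint_restore() {
    key="$1"; image="$2"
    [ -f "$CACHE_DIR/$key.img" ] && cp "$CACHE_DIR/$key.img" "$image"
}

# Store the current working image under this key.
checkpoint_save() {
    key="$1"; image="$2"
    mkdir -p "$CACHE_DIR" && cp "$image" "$CACHE_DIR/$key.img"
}
```

Note that this key only covers the text of the Pifile. Changes to the base image or to files copied into the image would go unnoticed unless they are hashed as well.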

zabealbe avatar Dec 09 '21 12:12 zabealbe

Having experimented a bit, I can say that this doesn't work well with the current architecture.

The fundamental difference between Docker's and pimod's design is how the data is stored. Docker uses multiple overlays, one for each of the linear steps within the Dockerfile. In contrast, pimod does not execute its commands in the order they appear, but within their stage. Between those stages, there may be functions that are executed when entering or leaving a stage. Commands of the 30-chroot and 40-postprocess stages might be executed within the guest or the image, respectively, resulting in the image being mounted while pimod is within this stage. Thus, sadly, determinism is difficult to achieve.
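
The stage model can be pictured with a toy shell sketch. This is not pimod's code: the stage names `prepare` and `chroot`, the assignment of PUMP and RUN to those stages, and the helpers `run_stage` and `run_all_stages` are assumptions for illustration.

```sh
#!/bin/sh
# Toy model, not pimod's actual code: the Pifile is evaluated once per stage,
# and each command only does work in the stage that owns it.
run_stage() {
    stage="$1"; pifile="$2"
    # Default: every command is a no-op.
    PUMP() { :; }
    RUN() { :; }
    # Activate only the commands that belong to the current stage.
    case "$stage" in
        prepare) PUMP() { echo "prepare: PUMP $*"; } ;;
        chroot)  RUN()  { echo "chroot: RUN $*"; } ;;
    esac
    # The Pifile is plain shell, so sourcing it calls the functions above.
    . "$pifile"
}

# Evaluate the same Pifile once for each stage, in stage order.
run_all_stages() {
    for s in prepare chroot; do
        run_stage "$s" "$1"
    done
}
```

With a Pifile that lists `RUN first`, `PUMP 100M`, `RUN second` in that order, this model runs the PUMP before either RUN. A linear, Docker-style layer after each line therefore has no direct equivalent.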

I tried implementing a simple hash-based cache for those pimod commands that should alter the image. While doing so, I experienced unexpected hash changes, e.g., after entering the chroot. It seems like just mounting the image introduces non-determinism.
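
One way to observe such changes is to hash the whole image file before and after a step. The helpers below are a hypothetical sketch, not taken from the checkpoint branch. A content hash covers every byte of the image, including filesystem metadata, so a mount that writes metadata (for example, ext4 recording the mount time in its superblock on a read-write mount) would be enough to change it. Whether that is the cause observed here is an assumption.

```sh
#!/bin/sh
# Hypothetical helpers, not part of pimod.

# Hash an image file's full contents (sha256sum is from GNU coreutils).
image_hash() {
    sha256sum "$1" | cut -d' ' -f1
}

# Run a command and report whether it changed the image's content hash.
# Usage: hash_after IMAGE COMMAND [ARGS...]
# Prints "changed" or "unchanged".
hash_after() {
    image="$1"; shift
    before=$(image_hash "$image")
    "$@"
    after=$(image_hash "$image")
    if [ "$before" = "$after" ]; then
        echo unchanged
    else
        echo changed
    fi
}
```

Wrapping a mount and unmount of the image in `hash_after` would show whether mounting alone alters the file.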

As we are working with complete images, not just overlays, both saving and loading create I/O load. When using a CoW filesystem like btrfs, this can be reduced by telling cp to use the CoW feature. However, on, e.g., an ext4 filesystem, the caching takes longer than simply re-executing the RUNs.
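
For reference, the CoW copy can be requested with GNU cp's `--reflink` option. The wrapper name `cow_copy` is made up for this sketch.

```sh
#!/bin/sh
# Copy an image, sharing data blocks on CoW filesystems where possible.
# --reflink=auto (GNU cp) falls back to a regular full copy on filesystems
# without reflink support, such as ext4.
cow_copy() {
    cp --reflink=auto -- "$1" "$2"
}
```

On btrfs the copy completes almost immediately because no data blocks are duplicated. On ext4 the fallback is a full copy, which is where the I/O cost comes from.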

At the moment, I feel that with both the current architecture and the limitations and complexity of Bash, this would be better left unimplemented. However, please feel free to have a look at the non-working checkpoint branch: https://github.com/Nature40/pimod/compare/checkpoint.

oxzi avatar Dec 12 '21 17:12 oxzi