action-cache-http wish: support input directory as an alternative to lockfile+install command

A common pattern for web apps is to have a folder of frontend assets which then get processed, for example by compiling TypeScript files to JavaScript.

It would be nice to cache these built assets as well, even though they don't have a lock file.

Instead of a lockfile, we can calculate a SHA checksum of the input directory. This Super User post shows an example syntax:

https://superuser.com/questions/458326/sha1sum-for-a-directory-of-directories

For example, I have a node_modules directory that is about 1 GB in size, and I'm able to compute its SHA checksum in about a second:

 find node_modules/ -type f -print0 | xargs -0 sha1sum | awk '{print $1}' | sha1sum | awk '{print $1}'

sha1sum and sha256sum are part of the coreutils package, so they are almost certainly installed if the other dependencies are met.
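A slightly more defensive variant of that one-liner, as a sketch only: find does not guarantee a stable file order, and the one-liner drops file names, so a rename would not change the hash. The helper name dir_hash is hypothetical, and the sketch assumes GNU find, sort, and coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a single SHA-256 digest for a directory tree.
# Assumes GNU find/sort/xargs and coreutils' sha256sum are available.
dir_hash() {
  local dir="$1"
  (
    # Work from inside the directory so the hashed paths are relative.
    cd "$dir" || exit 1
    # Sort the NUL-separated file list so the result does not depend on
    # the order in which find happens to return entries.
    find . -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum \
      | sha256sum \
      | awk '{print $1}'
  )
}
```

Usage would look like `dir_hash src`. Because each per-file line (digest plus relative path) feeds the final hash, both content changes and renames produce a new value.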

May 03 '22 21:05 markstos

Hmm, that is fine, but it won't work for the following reasons. We decide whether a cache exists based on the lock file. If the lock file is not there, the shasum of package.json can be used instead. If we take the shasum of node_modules, then node_modules has to be present in the first place, which requires an install, so there would never be a cache hit. Am I missing something?

May 04 '22 08:05 kevincobain2000

I'm thinking of a flow like this.

Let's say we have src and build directories. The goal is to not rebuild the project if the inputs haven't changed.

1. We create a SHA representing the contents of src.
2. We check the cache to see if there's a related build artifact.
3. If we get a cache hit, we unpack the result into ./build.
4. If we get a cache miss, we run the build, populate ./build, create a tar file and add it to the cache.

In practice, we may need additional inputs to create the cache key. For example, if a dependency used in the build process changes, it could produce a different build output. So a file that lists dependencies, such as package.json, would also need to be included in the cache key.
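To make the flow concrete, here is a minimal sketch of it. It is not how action-cache-http works: a local directory (CACHE_DIR, a hypothetical name) stands in for the remote cache, and the function names cache_key and build_with_cache are made up for illustration. It assumes the src and build directories and package.json from the description above, plus GNU coreutils and tar.

```shell
#!/usr/bin/env bash
# Sketch only: a local directory stands in for the remote cache server.
CACHE_DIR="${CACHE_DIR:-/tmp/build-cache}"   # hypothetical location

# Cache key = hash of every file under src/ plus package.json, so a change
# to either the sources or the declared dependencies yields a new key.
cache_key() {
  {
    find src -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum
    sha256sum package.json
  } | sha256sum | awk '{print $1}'
}

# Usage: build_with_cache <build command and arguments...>
build_with_cache() {
  local key archive
  key="$(cache_key)" || return 1
  archive="$CACHE_DIR/$key.tar"
  mkdir -p "$CACHE_DIR"
  if [ -f "$archive" ]; then
    # Cache hit: restore ./build from the stored archive, skip the build.
    echo "cache hit: $key"
    rm -rf build
    tar -xf "$archive"
  else
    # Cache miss: run the build, then store ./build under the new key.
    echo "cache miss: $key"
    "$@" || return 1
    tar -cf "$archive" build
  fi
}
```

A call such as `build_with_cache npm run build` would only run the build when src or package.json changed since the last stored archive (the build command here is just a placeholder).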

May 04 '22 14:05 markstos

I think what I really want is for the generic actions/cache behavior to work on self-hosted runners. This issue comment suggests it might work now, with the caveat that the cache is stored on GitHub's servers, not my own server.

Considering the build product is derived from source code stored on GitHub, that might be OK in my case. https://github.com/github/docs/issues/2271#issuecomment-1069373652

May 04 '22 14:05 markstos

I think* I understand it now. You would like to cache any folder of your choice based on a condition of your choice. This can be done right now without implementing it as a new feature, no? Sample below:

    - name: Cache Key
      run: |
        # Hash every file in the directory and write the result to a lock file
        MYVAR=$(find my_build_dir/ -type f -print0 | xargs -0 sha1sum | awk '{print $1}' | sha1sum | awk '{print $1}')
        echo "$MYVAR" > "$MYVAR.lock"
        # Shell variables do not survive between steps; export via GITHUB_ENV
        echo "MYVAR=$MYVAR" >> "$GITHUB_ENV"
    - name: Build (with cache)
      uses: kevincobain2000/action-cache-http@v3
      with:
        version: ${{ matrix.node-versions }}
        lock_file: ${{ env.MYVAR }}.lock
        install_command: npm build -o my_build_dir
        destination_folder: my_build_dir
        # Speed up caching at the cost of more storage space
        disable_compression: true
        operating_dir: "./" # optional
        cache_http_api: "https://yourdomain.com/path/to/installation/cache-http"
        http_proxy: ""

May 05 '22 12:05 kevincobain2000
