Inspecting the cache

Open FacetGraph opened this issue 5 years ago • 14 comments

Is there a way to view a project's current caches and download them anywhere for inspecting the contents or debugging purposes?

FacetGraph avatar Nov 18 '19 18:11 FacetGraph

@FacetGraph, no there's no current way to inspect or view a project's caches. We're working on some UI and APIs to allow that, but I don't have a timeline for that work.

joshmgross avatar Nov 18 '19 21:11 joshmgross

I think removing them would also be nice!

boredland avatar Nov 22 '19 12:11 boredland

yup.. managing the caches such as removing/downloading would be nice

rashedmyt avatar Nov 29 '19 20:11 rashedmyt

This issue is stale because it has been open for 365 days with no activity. Leave a comment to avoid closing this issue in 5 days.

github-actions[bot] avatar Dec 30 '21 08:12 github-actions[bot]

@joshmgross , any updates to share?

ruffsl avatar Dec 31 '21 06:12 ruffsl

@ruffsl Listing/deleting the cache is being worked upon but there is no expected ship date for that for now. Will update this issue when there is. About downloading the cache, I don't see such a use-case, and there is a possibility of cache being misused as a temporary storage service if that option is provided, which is not the intended use. Can you give more details about where downloading the cache would be useful?

vsvipul avatar Jan 03 '22 09:01 vsvipul

About downloading the cache, I don't see such a use-case, and there is a possibility of cache being misused as a temporary storage service if that option is provided, which is not the intended use.

That sounds like an ordinary user policy problem; addressed via private token access restrictions, conventional resource rate limiting and data caps. The existing actions caching infrastructure is already subject to such use. Having read and write access to this temporary storage service via local or remote code execution already seems to be the intended use here.

Can you give more details about where downloading the cache would be useful?

Having easy access (even if only read access) to the cache would really help accelerate the collaborative development process, allowing maintainers or drive-by contributors to quickly bootstrap development environments from prior CI resources. Some examples:

  1. You wish to make a small change to an existing repo, so you open a PR but then it fails CI an hour later. You're a novice contributor and the project is fairly large and complex, with a lengthy build and test process, but you have no access to what the CI compiled to debug it yourself, and the time you've volunteered is limited. So you give up, walk away, and your PR languishes unfinished.

  2. You wish to make a small change to an existing repo, so you open a PR but then it fails CI an hour later. The notification includes instructions on how to download and debug the artifacts generated by the CI in a replicable environment (e.g. the same container image or online IDE). Instead of spending over an hour or more repeating the same setup and compilation already completed by the CI, you are immediately dropped into the same cached environment populated by the CI, including precompiled binaries from the PR, ready to rerun tests.

  3. You are a maintainer and your repo receives a PR from a drive-by-contributor. The PR would be nice to merge, but it needs a little polish and your time is split across other priorities. Instead of tossing it into an endless backlog, you open the PR with codespaces or gitpod or whatever that then automatically pre-populates the workspace with the cache downloaded from the PR's most recent CI workflow. You quickly clean up the linter and logic errors and recompile, but because your workspace includes the same cache used by the PR's CI, you benefit from the same incremental compilation the CI does. Thus this only takes seconds rather than an hour of your time or cloud IDE hosting credits.

I could go on, but I think you get the picture I'm trying to paint. Being able to share computational artifacts across CI and development environments seems like an obvious use case here and a logical next feature given GitHub's product offering of Codespaces and the prevalent use of self-hosted runners and container runtimes with GitHub Actions.

One thing I'd please ask: that this feature not be siloed off to only Codespaces, limiting its availability from local hardware development or open source alternatives like Gitpod.

  • https://github.com/features/codespaces
  • https://www.gitpod.io

To provide a real-world example, I help maintain a navigation software stack for mobile robotics. Given the project's size, scope and dependencies, it's a bit intimidating for new contributors, despite the copious documentation, and non-trivial to build even for maintainers. For reference, it takes roughly 40 min/1 hr to build in release/debug mode from scratch, and about another 30 min to test. However, with incremental compilation via caching over containerised CI workflows, these build and test times shrink by over 95%.

  • https://discourse.ros.org/t/ros-docker-pro-tips-and-20-20-hindsights/17274
  • https://vimeo.com/649647161/5b0c278e6c

I almost prefer "developing" on the CI, given the CI is so much faster than my laptop. In fact, I'd like to introduce our team to using Codespaces or Gitpod with GitHub Actions, reusing the same CI Docker image for consistency. But given the overhead of rebuilding the project for each PR we review, every time one of us re-opens Codespaces, it seems wasteful of resource credits and of our own dev time not to be able to leverage the computation already expended by prior CI workflows.

ruffsl avatar Jan 09 '22 18:01 ruffsl

@ruffsl Appreciate the discussion! Please correct me if I'm wrong, but would upload-artifacts be better for the example use cases you mention? A few points:

  1. Each artifact is tied to an Actions run whereas caches are shared across runs. Unless you set up the cache key to be per-run, you won't necessarily know which CI run created the cache.
  2. If you're specifically interested in failures, you can conditionally save artifacts. Include the cache, build, and test paths in the upload and you should have everything needed to debug.
  3. Caches are very volatile (due to the 7 days / 10 GB eviction policy) so using it to inspect old CI outputs won't be reliable. Artifacts are retained for 90 days (this is also configurable if you need shorter or longer retention).
  4. Artifacts have a published API for getting content - https://docs.github.com/en/rest/reference/actions#artifacts

The one place where caching could be beneficial is example 3, where you could restore dependencies on Codespaces to get the dev environment set up faster. But artifacts should also work? Find the last CI build for that branch/PR and download the artifacts using the API. If using the cache, you'd also have to be mindful of partial matches (if using restore-keys).
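As a rough illustration of the "find the run, then fetch its artifacts via the API" suggestion, here is a minimal Python sketch. It assumes the REST endpoint for listing a workflow run's artifacts and the `artifacts`, `name`, `expired`, `created_at` and `archive_download_url` fields of its response; the helper names (`run_artifacts_url`, `fetch_json`, `pick_artifact`) are made up for this example, and the token handling is only a placeholder.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def run_artifacts_url(owner: str, repo: str, run_id: int) -> str:
    """URL of the 'list workflow run artifacts' endpoint for one run."""
    return f"{API_ROOT}/repos/{owner}/{repo}/actions/runs/{run_id}/artifacts"


def fetch_json(url: str, token: str) -> dict:
    """GET a GitHub API URL and decode the JSON body (hypothetical helper)."""
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def pick_artifact(listing: dict, name: str) -> Optional[dict]:
    """Return the newest non-expired artifact with the given name, if any."""
    candidates = [
        artifact for artifact in listing.get("artifacts", [])
        if artifact["name"] == name and not artifact.get("expired", False)
    ]
    # ISO 8601 timestamps in the same format sort correctly as strings.
    return max(candidates, key=lambda a: a["created_at"], default=None)
```

A caller would pass the result of `fetch_json(run_artifacts_url(owner, repo, run_id), token)` to `pick_artifact` and then download the zip from the chosen entry's `archive_download_url`. Note that the run ID has to be known first, which is the per-run coupling described in point 1.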

dhadka avatar Jan 09 '22 19:01 dhadka

Hi @dhadka, happy to further the discussion. Don't mean to hijack the thread, so we could start a different ticket if desired.

  1. Each artifact is tied to an Actions run whereas caches are shared across runs. Unless you set up the cache key to be per-run, you won't necessarily know which CI run created the cache.

We do indeed set up the cache key to be 'per-run' specific, as far as denoting the arch/branch/PR#. Our keys also include a checksum of a reproducibly generated lockfile that is derived from a repeatable environmental setup. Lastly, when saving we also append an epoch timestamp, which uniquely bumps the cache key so that it becomes the latest match for the key prefix being restored. Forgive the config syntax, as it still uses CircleCI, but it demonstrates the use case as described:

        - restore_cache:
            name: Restore Cache << parameters.key >>
            keys:
              - "<< parameters.key >>-v6\
                -{{ arch }}\
                -{{ .Branch }}\
                -{{ .Environment.CIRCLE_PR_NUMBER }}\
                -{{ checksum  \"<< parameters.workspace >>/lockfile.txt\" }}"
              - "<< parameters.key >>-v6\
                -{{ arch }}\
                -main\
                -<no value>\
                -{{ checksum  \"<< parameters.workspace >>/lockfile.txt\" }}"

https://github.com/ros-planning/navigation2/blob/14051a6971efca0aaf51a5ef3df01b24e83b1098/.circleci/config.yml#L33-L45

        - save_cache:
            name: Save Cache << parameters.key >>
            key: "<< parameters.key >>-v6\
              -{{ arch }}\
              -{{ .Branch }}\
              -{{ .Environment.CIRCLE_PR_NUMBER }}\
              -{{ checksum  \"<< parameters.workspace >>/lockfile.txt\" }}\
              -{{ epoch }}"
            paths:
              - << parameters.path >>/.ccache
              - << parameters.path >>/build
              - << parameters.path >>/install
              - << parameters.path >>/log
              - << parameters.path >>/test_results
            when: << parameters.when >>

https://github.com/ros-planning/navigation2/blob/14051a6971efca0aaf51a5ef3df01b24e83b1098/.circleci/config.yml#L59-L73

Note that the list of keys to restore from includes a string variant pointing to main branch as a fallback. This is beneficial in bootstrapping the cache for fresh PRs, and I believe would be just as applicable when opening a codespace for a new branch.
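The fallback behaviour described above can be modelled in a few lines of Python. This is only a simplified sketch of prefix-based key resolution as this comment describes it (try each restore key in order, take the most recently created match), not the actual implementation used by CircleCI or actions/cache; the function name and the shortened keys in the usage note are invented for illustration.

```python
from typing import Iterable, Optional, Sequence, Tuple


def resolve_restore_key(
    restore_keys: Sequence[str],
    stored: Iterable[Tuple[str, int]],
) -> Optional[str]:
    """Pick the cache key to restore from.

    `stored` holds (key, created_epoch) pairs. Restore keys are tried in
    order; for the first one that is a prefix of any stored key, the most
    recently created matching key wins.
    """
    stored = list(stored)
    for prefix in restore_keys:
        matches = [(created, key) for key, created in stored if key.startswith(prefix)]
        if matches:
            return max(matches)[1]
    return None
```

With stored keys such as `ws-v6-amd64-main-none-abc-200` and `ws-v6-amd64-feat-7-abc-150`, a fresh PR branch whose own prefix has no match falls through to the `main` prefix and picks up the newest main-branch cache.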

  2. If you're specifically interested in failures, you can conditionally save artifacts. Include the cache, build, and test paths in the upload and you should have everything needed to debug.

Hmm, I was under the impression that artifacts were more for consumable assets, like readable log files or executable binaries; selected and itemized files appropriate for archiving and aggregating a job's results, rather than whole directory trees or recursive folder paths.

For instance, while caches are inherently intended to be restored over existing filesystems, artifacts don't seem to share that same functional purpose, as there are no similar restore (as opposed to just save) actions for artifacts across multiple paths in place. I suppose it could be replicated via a prefix and postfix of tarball commands, but it seems silly to re-implement what the cache action does here already.

For example, from the snippet above, both the ccache and build directories are included among the cached paths. Both ccache and cmake are notorious for the number of auto-generated files they produce, easily in the hundreds or thousands; perhaps inappropriate to be individually uploaded as separate artifacts.
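To make the "prefix and postfix of tarball commands" idea concrete, here is a minimal Python sketch using the standard `tarfile` module. It packs several workspace-relative paths into one archive (which could then be uploaded as a single artifact) and later unpacks them over an existing workspace. The function names are hypothetical, and this is only an approximation of what a cache action does, without key lookup or compression tuning.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def pack_paths(archive: Path, root: Path, relative_paths: Iterable[str]) -> None:
    """Bundle several directories under `root` into one gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        for rel in relative_paths:
            # arcname keeps paths relative, so they can be restored elsewhere.
            tar.add(root / rel, arcname=rel)


def restore_paths(archive: Path, root: Path) -> None:
    """Unpack the tarball over `root`, overwriting files that already exist."""
    with tarfile.open(archive, "r:gz") as tar:
        # Only do this with archives you produced yourself; extracting an
        # untrusted tarball can write outside the target directory.
        tar.extractall(root)
```

For the workspace in the snippet above, the save step would call something like `pack_paths(Path("ws.tar.gz"), Path("/opt/overlay_ws"), [".ccache", "build", "install"])` before uploading, and the restore step would call `restore_paths` after downloading.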

  3. Caches are very volatile (due to the 7 days / 10 GB eviction policy) so using it to inspect old CI outputs won't be reliable. Artifacts are retained for 90 days (this is also configurable if you need shorter or longer retention).

I think the limited durability of cache saves is to be expected, especially for free public storage services. If a maintainer follows up on a CI job the next day and can retrieve the cache, that's awesome. But if they stall for more than a week and the cache gets evicted before they respond, then that's on them, with extended cache durability being a justifiable premium feature.

# Release Mode
Found a cache from build 24502 at overlay_ws-v6-arch1-linux-amd64-6_85-main-<no value>-cW3a07dAyXqQu7OKkkuviZJfehF2oGR7lpqfR6kHceI=
Size: 102 MiB
Cached paths:
  * /opt/overlay_ws/.ccache
  * /opt/overlay_ws/build
  * /opt/overlay_ws/install
  * /opt/overlay_ws/log
  * /opt/overlay_ws/test_results

Downloading cache archive...
Validating cache...

Unarchiving cache...

# Debug Mode
Creating cache archive...
Uploading cache archive...
Stored Cache to overlay_ws-v6-arch1-linux-amd64-6_85-main-<no value>-Ftk0f3CMUUeRCCzHcRj723YJ6mLiSIm6zO9hzKWNEpM=-1641677419
  * /opt/overlay_ws/.ccache
  * /opt/overlay_ws/build
  * /opt/overlay_ws/install
  * /opt/overlay_ws/log
  * /opt/overlay_ws/test_results
Total size uploaded: 930 MiB

For example, our cache save size per job is about 100 MiB for Release builds, or about 1 GiB for Debug builds (wish I could shrink the latter but the debug symbols for code coverage add a lot to the binaries). We already use a nightly CI job for the main branch, so regardless of the eviction window or capacity, there's almost always a main cache to bootstrap from.

Artifacts have a published API for getting content - https://docs.github.com/en/rest/reference/actions#artifacts If using the cache, you'd also have to be mindful of partial matches (if using restore-keys).

It would be nice to have such a matching API for caches, but I suppose if one could restore artifacts much in the same manner and ease as caches, then that could alleviate the desire for external cache access. Although, the key look up behavior with caching would still be greatly missed, as it would be wonderful to have the same consistent restoration logic for Codespaces environments as in CI environments. Indexing caches using keys is a proven and accustomed technique for developers, while remaining flexible for most restoration use cases.

ruffsl avatar Jan 11 '22 01:01 ruffsl

Just to chime in here with my use case, I'm currently trying to debug some weird CI failures that I think are due to bad data being cached. Currently I'm fairly stuck with the near-complete lack of visibility into caches. It would be very helpful to have the following operations (UI would be great, but I'd settle for an API):

  • For a given cache key, identify which run published it
  • For a given cache key, download it locally (so I can see if the issue is actually related to the cache)

kevpar avatar Mar 24 '22 17:03 kevpar

Would be helpful to be able to see the cache, or at the very least be able to turn up the verbosity. I just had an issue where path: ~/vcpkg didn't work but path: vcpkg did and this would have saved me a bit of time, at least while learning Actions.

voltagex avatar May 06 '22 06:05 voltagex

Good news, folks: we have shipped the list caches API, which lets you list the caches for a repository. Go ahead and try it out, and share any feedback in this issue. Here are the official docs: https://docs.github.com/en/rest/actions/cache#list-github-actions-caches-for-a-repository

We have also shipped APIs for delete. Do check them out as well.

vsvipul avatar Jun 27 '22 12:06 vsvipul

@vsvipul I checked the API and it works nicely, but it does not let me get any information about what is included inside the cache archive, nor download it.

Let's take a look:

    {
      "id": 1217,
      "ref": "refs/pull/536/merge",
      "key": "Linux-test-830c369d75c9f00b28ece026875d0a17f81b1aba030c60ea6677dc55d8541304",
      "version": "57f734843592bf0b28ccd05b278c3da55be62dce0115b462306771e77f965b04",
      "last_accessed_at": "2022-06-28T09:15:01.600000000Z",
      "created_at": "2022-06-27T15:24:21.870000000Z",
      "size_in_bytes": 684074638
    },

Ok so we know the key, size and date, but what is inside? Without knowing the list of files with their sizes from the archive we cannot really say we inspected the archive. We need info like this in order to decide what is missing from the cache or what needs to be removed.
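For reference, the metadata that the list caches API does expose can at least be summarised per ref. The sketch below assumes the response wraps entries like the one above in an `actions_caches` array; the function names are made up for this example, and it says nothing about the files inside each archive, which is exactly the gap described here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def total_size_by_ref(listing: dict) -> Dict[str, int]:
    """Sum `size_in_bytes` of all caches per git ref."""
    totals: Dict[str, int] = defaultdict(int)
    for cache in listing.get("actions_caches", []):
        totals[cache["ref"]] += cache["size_in_bytes"]
    return dict(totals)


def largest_caches(listing: dict, count: int = 5) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, key) for the biggest caches, largest first."""
    entries = [
        (cache["size_in_bytes"], cache["key"])
        for cache in listing.get("actions_caches", [])
    ]
    return sorted(entries, reverse=True)[:count]
```

Fed with the decoded JSON body of a list caches response, this shows which refs and keys consume the most of the storage quota, but not which files are responsible.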

ssbarnea avatar Jun 28 '22 10:06 ssbarnea

@ssbarnea Hi! Unfortunately, downloading or viewing files inside the cache archive is something we are not targeting in the near future.

For your use case, I would suggest that you SSH into the runner and directly inspect the directory paths that you list in the cache step. There should not be any case in which the files on the runner differ from what we cache in our action (unless you change them after the cache step).
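If an interactive session on the runner is not practical, a similar inspection could be scripted as a workflow step placed right after the cache step. The following Python sketch walks a cached directory and reports every file with its size, largest first; the function name is hypothetical and the path in the usage note is only an example.

```python
import os
from pathlib import Path
from typing import List, Tuple


def list_files_with_sizes(root: str) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, relative_path) for every file under `root`."""
    base = Path(root)
    entries = []
    for dirpath, _dirnames, filenames in os.walk(base):
        for name in filenames:
            path = Path(dirpath) / name
            # lstat does not follow symlinks, so broken links do not raise.
            entries.append((path.lstat().st_size, str(path.relative_to(base))))
    return sorted(entries, reverse=True)
```

Printing the result of `list_files_with_sizes("/opt/overlay_ws/build")` in the job log gives a file-level listing of what the cache step restored, without needing to download the archive.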

Please let me know if your case is different from this, and how inspecting the cache will still be helpful. I'd be happy to help take this forward.

vsvipul avatar Jun 28 '22 10:06 vsvipul

We've released UI to manage your caches from the web interface as well. https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/ Do check it out.

I've also renamed this issue to specify that it is for downloading the cache contents.

vsvipul avatar Oct 21 '22 10:10 vsvipul

Making the cache public allows developers to reuse this cache locally too. For example, Gradle provides a (remote) build cache feature to skip cached tasks, reducing computing time and providing a better developer experience. Enabling and storing the cache on CI is nice, but from a developer's point of view it is somewhat unexpected not to be able to use the cache locally.

This also affects GitHub Codespaces which does not reuse the action cache at the moment, so reusing the cache reduces (your) costs.

I think the current APIs don't require a change, you only need to make the ACTION_CACHE_URL public and stable: https://github.com/nektos/act/issues/329 (Of course, keep the authentication to prevent broken/malicious cache updates)

Implementing the actual Gradle/build system cache logic should/could be handled by the community.

See also https://github.com/orgs/community/discussions/44567

hfhbd avatar Jan 17 '23 11:01 hfhbd

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

github-actions[bot] avatar Oct 18 '23 08:10 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

github-actions[bot] avatar Oct 23 '23 08:10 github-actions[bot]