circt
circt copied to clipboard
[Nightly CI] CCache hit rates have become <1%
The nightly ccache hit rates used to be -- on average -- pretty good, but they've since dropped to <1%. This has led to every nightly build being 1 to 2 hours.
Any idea when this was better? Looking through action logs it seems they've taken 1-2 hours for few months at least.
Anyway, looks like we're getting cache "misses" when restoring the various ccache data. I think it's likely they're being evicted regularly, here's the current cache state (gh-action-cache list --limit 100, doesn't seem to be a way to access w/o a tool/REST API):
Total caches size 11.32 GB
Showing 37 of 37 cache entries in llvm/circt
gcc-Release-OFF-OFF-135ca3edfa80b384b9bff70e6ea9193a654c8cae 169.48 MB refs/heads/main 3 minutes ago
clang-Release-ON-OFF-135ca3edfa80b384b9bff70e6ea9193a654c8cae 78.11 MB refs/heads/main 5 minutes ago
ccache-2022-08-26T13:28:24.241Z 98.47 MB refs/heads/main 6 minutes ago
ccache-2022-08-26T09:09:49.993Z 98.45 MB refs/heads/main 8 minutes ago
Linux-llvm-fe0f72d5c55a9b95c5564089e946e8f08112e995-55ca506cba6bd9ba7c19cd1fe0021dc5-gcc 4.88 GB refs/heads/main 8 minutes ago
Linux-llvm-fe0f72d5c55a9b95c5564089e946e8f08112e995-55ca506cba6bd9ba7c19cd1fe0021dc5-clang 241.38 MB refs/heads/main 8 minutes ago
Windows-llvm-relobj-fe0f72d5c55a9b95c5564089e946e8f08112e995 1.15 GB refs/heads/main 8 minutes ago
gcc-Release-OFF-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 199.21 MB refs/heads/main 10 minutes ago
ccache-2022-08-26T13:22:20.098Z 99.26 MB refs/pull/3781/merge 12 minutes ago
gcc-Release-ON-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 199.28 MB refs/heads/main 12 minutes ago
ccache-2022-08-26T13:21:44.952Z 99.21 MB refs/pull/3781/merge 12 minutes ago
ccache-2022-08-26T13:18:37.324Z 99.21 MB refs/pull/3781/merge 16 minutes ago
clang-Debug-OFF-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 235.05 MB refs/heads/main 20 minutes ago
clang-Release-ON-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 93.92 MB refs/heads/main 24 minutes ago
clang-Release-OFF-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 93.94 MB refs/heads/main 28 minutes ago
clang-Release-OFF-OFF-135ca3edfa80b384b9bff70e6ea9193a654c8cae 78.12 MB refs/heads/main 28 minutes ago
clang-Debug-ON-ON-135ca3edfa80b384b9bff70e6ea9193a654c8cae 233.75 MB refs/heads/main 29 minutes ago
clang-Debug-OFF-OFF-135ca3edfa80b384b9bff70e6ea9193a654c8cae 242.52 MB refs/heads/main 32 minutes ago
clang-Debug-ON-OFF-135ca3edfa80b384b9bff70e6ea9193a654c8cae 242.37 MB refs/heads/main 33 minutes ago
ccache-2022-08-26T07:47:29.700Z 98.22 MB refs/heads/main 4 hours ago
ccache-2022-08-25T23:30:43.687Z 98.21 MB refs/heads/main 6 hours ago
ccache-2022-08-26T07:01:24.806Z 98.45 MB refs/pull/3780/merge 7 hours ago
ccache-2022-08-25T18:03:14.765Z 98.16 MB refs/heads/main 14 hours ago
ccache-2022-08-25T23:06:56.008Z 97.91 MB refs/pull/3765/merge 14 hours ago
ccache-2022-08-25T00:47:11.782Z 96.19 MB refs/pull/3765/merge 15 hours ago
ccache-2022-08-25T22:12:34.486Z 98.53 MB refs/pull/3779/merge 15 hours ago
ccache-2022-08-25T20:27:51.313Z 98.47 MB refs/pull/3779/merge 15 hours ago
ccache-2022-08-25T15:08:33.008Z 96.62 MB refs/heads/main 20 hours ago
ccache-2022-08-25T17:40:06.141Z 98.07 MB refs/pull/3742/merge 20 hours ago
ccache-2022-08-25T16:07:34.412Z 97.94 MB refs/pull/3742/merge 20 hours ago
macos-11-llvm-fe0f72d5c55a9b95c5564089e946e8f08112e995-b72a1f4828f4cbefeb1c0be0f9bc9fb3e701500c-clang 414.13 MB refs/tags/sifive/1/14/0 21 hours ago
ubuntu-20.04-llvm-fe0f72d5c55a9b95c5564089e946e8f08112e995-b72a1f4828f4cbefeb1c0be0f9bc9fb3e701500c-clang 464.00 MB refs/tags/sifive/1/14/0 21 hours ago
ccache-2022-08-25T13:05:32.498Z 97.65 MB refs/pull/3742/merge 21 hours ago
ubuntu-18.04-llvm-fe0f72d5c55a9b95c5564089e946e8f08112e995-b72a1f4828f4cbefeb1c0be0f9bc9fb3e701500c-clang 470.70 MB refs/tags/sifive/1/14/0 a day ago
ccache-2022-08-25T14:44:50.345Z 96.50 MB refs/heads/main a day ago
ccache-2022-08-25T14:42:43.785Z 98.83 MB refs/pull/3766/merge a day ago
ccache-2022-08-25T00:03:09.605Z 96.46 MB refs/heads/main a day ago
There are some rather large entries, maybe we can trim them down? The cache limit is 10GB (although apparently not immediately/strictly enforced). The ccache- entries might not be ideal with the date in the key string (shortIntegrationTests.yml), since I don't think we'll ever use anything but the most recent. :thinking:
Looking at the latest run all ccache caches were restored (!), causing 8/16 configurations to complete in about or less than 5 minutes. These were the Release configurations, and showed 34-50% hit rates. For the ones that still took ~2hours, despite the cache being found they exhibit poor hit rates (< 1%).
Weekends are slow, so we may not be out of the woods yet re:ensuring nightly caches are available but at least they sometimes are now :smile:-- bouncing through past weekends I don't see a time the cache was found in recent history.
Looks like maybe something re:debug build is interfering/causing all caches entries to miss? Having the cache available (limited to 300MB) has negligible impact on hit rate-- in at least some cases no restored cache at all has the same hit rate as finding it successfully, for example: https://github.com/llvm/circt/runs/8050184803?check_suite_focus=true <-- no cache found, 0.87% https://github.com/llvm/circt/runs/8057440984?check_suite_focus=true <-- cache found, 0.87%
IIRC, this is mostly consistent with what @mikeurbach and I were seeing when we initially set up the ccache. We think it implies that the wrong ccache is being restored, but didn't look into it further.
Yes this is what we saw before: cache hits on release but not debug, and the release half of the build matrix completing fast while the debug half takes a whole rebuild because there are no hits. I spent a good deal of time reading and debugging the cache github action, and I didn't see any bug or issue where the wrong cache was restored. I think more likely, the large debug artifacts are overflowing the cache and making it essentially useless. Just looking at a local build, the debug mlir-opt on macos is 417MB, and lots of libraries are in the 5-10MB range.