HDDS-10273. Intermittent build failure while downloading nodejs

Open adoroszlai opened this issue 9 months ago • 2 comments

What changes were proposed in this pull request?

Download from nodejs.org seems to be unreliable.

This PR fixes intermittent failures in:

  • compile (8, macos-12), which previously had to download Node.js for each run, since the binary tarball is platform-specific, and we only cached the one for Linux
  • various checks, in the rare case that populate-cache runs into a download problem

Changes:

  1. Download both Linux and Mac versions of Node.js. Use curl, with retry enabled.
  2. Indicate a partially built cache by creating a marker file, which is deleted only if all downloads succeed.
  3. Log cache contents even in case of cache hit, making it easier to check contents.
  4. Allow triggering populate-cache workflow manually for easier testing.
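
The first two changes can be illustrated with a minimal sketch. It is not the workflow code from this PR, which uses curl in a GitHub Actions step; it only models the same logic in Python. The marker file name `.partial`, the function names, and the `fetch` callback are hypothetical and chosen for illustration.

```python
import time
from pathlib import Path
from typing import Callable, Iterable

MARKER_NAME = ".partial"  # hypothetical marker file name, not taken from the PR


def fetch_with_retry(fetch: Callable[[str, Path], None], url: str, dest: Path,
                     attempts: int = 3, delay: float = 0.0) -> None:
    """Call fetch(url, dest), retrying on OSError up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            fetch(url, dest)
            return
        except OSError:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the failure
            time.sleep(delay)


def populate_cache(cache_dir: Path, urls: Iterable[str],
                   fetch: Callable[[str, Path], None], attempts: int = 3) -> bool:
    """Download every URL into cache_dir; return True if the cache is complete."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    marker = cache_dir / MARKER_NAME
    marker.touch()  # the cache counts as partial until every download succeeds
    try:
        for url in urls:
            dest = cache_dir / url.rsplit("/", 1)[-1]  # file name from the URL
            fetch_with_retry(fetch, url, dest, attempts)
    except OSError:
        return False  # marker stays behind, flagging the partial cache
    marker.unlink()  # all downloads succeeded, so the cache is complete
    return True
```

A consumer of the cache could then treat the presence of the marker file as a signal to ignore or rebuild the cache. The `fetch` parameter is injected so the logic can be exercised without network access; any callable that takes a URL and a destination path and raises `OSError` on failure would fit.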

https://issues.apache.org/jira/browse/HDDS-10273

How was this patch tested?

Triggered the workflow manually in my fork: https://github.com/adoroszlai/ozone/actions/workflows/populate-cache.yml

Node.js tarballs from latest run: https://github.com/adoroszlai/ozone/actions/runs/9018931552/job/24780630992#step:8:300

adoroszlai, May 09 '24 14:05

Overall idea makes sense, but why do we do so much manual cache management? It seems like https://github.com/actions/setup-java and https://github.com/actions/setup-node can handle it for us. Are there extra requirements we have?

errose28, May 09 '24 21:05

> Overall idea makes sense, but why do we do so much manual cache management? It seems like https://github.com/actions/setup-java and https://github.com/actions/setup-node can handle it for us. Are there extra requirements we have?

  1. setup-java didn't have this feature when we first started using cache. Its simple config seems to correspond to our initial cache implementation, so had it been available at the time, we could have chosen it.
  2. setup-java does not give control over building a cache vs. only using it. Ozone has many concurrent checks which execute different Maven plugins. Basic checks like checkstyle can benefit from using the cache, but shouldn't build it, since they do not download many of the dependencies. So we rely on cache/save and cache/restore.
  3. Building the cache from scratch in a separate workflow:
    • can produce a smaller cache, by not having to carry old dependencies forever,
    • helps avoid cache expiry, which may be a problem in forks with little activity.

adoroszlai, May 10 '24 08:05

Thanks @errose28, @smengcl for the review.

adoroszlai, May 14 '24 06:05

It turns out cache/save cannot overwrite an existing cache, so I reverted this. Will post a new PR.

adoroszlai, May 14 '24 09:05