
[WIP] Implement reproducible out/ folder contents across different filesystem layouts

Open rahat2134 opened this issue 4 months ago • 0 comments

This PR addresses issue #3660 by making the contents of the out/ folder more reproducible and independent of the filesystem layout. The out/ folder can then be reused as a build cache across machines, supporting both coarse-grained caching (e.g., transferring a zip file) and fine-grained caching (via the Bazel remote cache protocol).

Key Changes:

  • Updated PathRef to normalize paths relative to the workspace, the Coursier cache, and the home directory
  • Implemented NonDeterministicFiles to handle non-deterministic file content
  • Updated JsonFormatters to use the new path normalization methods
  • Added integration test ReproducibleOutTest to verify reproducibility
  • Updated build.mill in the integration test to define the necessary project structure
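
The path normalization idea can be illustrated with a minimal sketch. This is not the PR's PathRef code: the class name `PathNormalizer` and the placeholder tokens `$WORKSPACE`, `$COURSIER_CACHE`, and `$HOME` are hypothetical, chosen only for illustration. It assumes that a serialized path should start with a placeholder for whichever known root contains it, so that the same logical file serializes to the same string on every machine.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Mill's API: maps machine-specific absolute
// paths to placeholder-prefixed strings and back.
final class PathNormalizer {
    // Roots are checked in insertion order. The workspace and the Coursier
    // cache often live inside the home directory, so they are listed first.
    private final Map<String, Path> roots = new LinkedHashMap<>();

    PathNormalizer(Path workspace, Path coursierCache, Path home) {
        roots.put("$WORKSPACE", workspace.toAbsolutePath().normalize());
        roots.put("$COURSIER_CACHE", coursierCache.toAbsolutePath().normalize());
        roots.put("$HOME", home.toAbsolutePath().normalize());
    }

    // Replace the first matching root with its placeholder token.
    String normalize(Path path) {
        Path abs = path.toAbsolutePath().normalize();
        for (Map.Entry<String, Path> root : roots.entrySet()) {
            if (abs.startsWith(root.getValue())) {
                String rel = root.getValue().relativize(abs).toString().replace('\\', '/');
                return rel.isEmpty() ? root.getKey() : root.getKey() + "/" + rel;
            }
        }
        // Paths outside every known root are left absolute.
        return abs.toString().replace('\\', '/');
    }

    // Expand a placeholder-prefixed string against this machine's roots.
    Path denormalize(String serialized) {
        for (Map.Entry<String, Path> root : roots.entrySet()) {
            String token = root.getKey();
            if (serialized.equals(token)) {
                return root.getValue();
            }
            if (serialized.startsWith(token + "/")) {
                return root.getValue().resolve(serialized.substring(token.length() + 1));
            }
        }
        return Paths.get(serialized);
    }
}
```

With this scheme, a file under the workspace serializes to the same `$WORKSPACE/...` string whether the checkout lives in a user's home directory or on a CI machine, and each machine expands the string against its own roots when reading it back.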

Reproducibility is achieved by:

  1. Normalizing paths in serialized PathRefs
  2. Handling non-deterministic files (e.g., mill-profile.json, worker JSONs)
  3. Zeroing out modification times for zip and jar files
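
Step 3 can be sketched as follows. This is an illustrative example rather than the code in this PR: the class `ZipTimestamps` and its methods are hypothetical, and it assumes that rewriting an archive with sorted entries and a fixed modification time of 0 is an acceptable way to zero the timestamps. Only standard `java.util.zip` calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not Mill's API.
final class ZipTimestamps {
    // Write the files as a zip, in sorted name order, stamping every entry
    // with the same modification time (milliseconds since the Unix epoch).
    static byte[] build(Map<String, byte[]> files, long mtimeMillis) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            for (Map.Entry<String, byte[]> file : new TreeMap<>(files).entrySet()) {
                ZipEntry entry = new ZipEntry(file.getKey());
                entry.setTime(mtimeMillis);
                out.putNextEntry(entry);
                out.write(file.getValue());
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read an existing archive and rewrite it with all entry times set to 0.
    static byte[] zeroTimestamps(byte[] zip) {
        Map<String, byte[]> files = new TreeMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // readAllBytes (Java 9+) stops at the end of the current entry.
                files.put(entry.getName(), in.readAllBytes());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return build(files, 0L);
    }
}
```

Two archives with identical contents but different build times differ byte-for-byte before this rewrite and are identical after it. Sorting the entries matters as much as the timestamp, because entry order is also part of the archive bytes.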

Testing:

  • Added new integration test ReproducibleOutTest
  • Existing tests have been updated and pass

Documentation:

  • Updated relevant comments in the code

Performance Impact:

  • Initial testing shows minimal impact on build times; more extensive performance testing may be needed for large projects

rahat2134 • Oct 17 '24 12:10