bazel-lib icon indicating copy to clipboard operation
bazel-lib copied to clipboard

Rewrite write_source_files into a fast language

Open dzbarsky opened this issue 6 months ago • 3 comments

On a write_source_files tree with ~20 chained targets and a few files each (so total 90 files + directories) the runtimes goes from 5-7s to 20ms.

We will want to precompile the helper binary like the rest of the helpers.

Putting this up to get feedback first.

dzbarsky avatar Jul 02 '25 19:07 dzbarsky

@alexeagle I updated the PR with a sketch of how some code can be shared. There's a few things bothering me:

  1. I don't see how the existing implementation of copy_directory handles the executable bit (I'm not sure how Clonefile works but the fallback implementation just copies the bytes and doesn't seem to create the file properly?)
  2. I think the implementation I came up with does fewer syscalls (for example it only issues MkdirAll when writes fail, it first attempts to copy a file and then fallsback to directory handling instead of issuing upfront Stat, etc)
  3. The existing io.Copy does not reuse buffers, we should likely allocate a single buffer in the worker and reuse that with io.CopyBuffer
  4. It feels like we are promoting the copy_directory walker as the "common" one, which creates a bit of an oddness for copy_to_directory
  5. CopyFile requires an upfront Stat on srcPath to get the info, and we also need that to do that to decide whether to issue a CopyFile or CopyDirectory. In the current draft this happens serially and then execution of all copy actions is done by the worker pool, but ideally there wouldn't be a serial portion at all. This would mean doing the "try to copy as file and fallback to directory" inside the worker itself, are we ok with that? (It would likely slightly speedup existing copy_directory as well...)

Overall I can't tell if going down this path will create a mess or result in improvements to the existing implementations, but if we do want to unify we will probably want some prep PRs

dzbarsky avatar Jul 03 '25 18:07 dzbarsky

I'm sorry I disappeared here! TBH I'm not a great reviewer for this, I don't think I ever reviewed earlier versions (@gregmagolan wrote it originally and I think @thesayyn made changes).

If your analysis leads you to think that sharing code isn't beneficial here, I'm on board with that. These are small programs, and distributing them is easy. I think the highest-order bit is to make it easy to maintain and avoid bugs since we don't have any funding for this org.

alexeagle avatar Jul 21 '25 15:07 alexeagle

It would be great if write_source_files also didn't touch the destination file if there is no diff. Otherwise it unnecessarily triggers lsp:s to reload files and such.

r0bobo avatar Sep 07 '25 10:09 r0bobo