
bsdtar is locale-sensitive

Open alexeagle opened this issue 1 year ago • 1 comments

Callers of libarchive's tar (bsdtar) may inherit a locale from their system environment.

libarchive is sensitive to LC_ALL (https://github.com/libarchive/libarchive/blob/819a50a0436531276e388fc97eb0b1b61d2134a3/tar/bsdtar.c#L191) and we do enable the preprocessor directive in the BCR entry.

As an example of a problem this causes: Aspect customers have observed errors extracting NPM packages, but only locally on a Mac or in a Linux devbox, not on CI. See https://github.com/aspect-build/rules_js/issues/2039

Currently users of the tar rule, as well as custom rules that use our tar toolchain, are subject to ambiguity:

  • If the action has use_default_shell_env=True, then a setting like --action_env=LC_ALL=C breaks the ability to work with archives that contain Unicode characters in a file path. The tar rule's action does not set that flag.
  • Otherwise, the result depends on the user's system, meaning it's not hermetic.

Perhaps we should document that all users of the tar toolchain should hard-code a value like LC_ALL=C.UTF-8, and we should add that value to the env of actions created by our tar rule?
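
To make the proposal concrete outside of Bazel, here is a minimal Python sketch of a caller that spawns tar with a pinned locale instead of whatever it inherits. The helper names (pinned_tar_env, extract) and the PINNED_LOCALE constant are hypothetical and exist only for illustration. It also assumes the C.UTF-8 locale is available on the host, which may not hold on every platform.

```python
import os
import subprocess

# Assumption: C.UTF-8 is the value floated above; not every platform ships it.
PINNED_LOCALE = "C.UTF-8"


def pinned_tar_env(base_env=None):
    """Return a copy of base_env (default: os.environ) with LC_ALL pinned."""
    env = dict(os.environ if base_env is None else base_env)
    # LC_ALL takes precedence over LANG and the individual LC_* variables,
    # so overriding it alone is enough to fix the locale the child sees.
    env["LC_ALL"] = PINNED_LOCALE
    return env


def extract(tar_binary, archive, dest, base_env=None):
    """Hypothetical helper: extract `archive` into `dest` with a pinned locale."""
    subprocess.run(
        [tar_binary, "-xf", archive, "-C", dest],
        env=pinned_tar_env(base_env),
        check=True,  # raise CalledProcessError if tar exits non-zero
    )
```

With this shape, an inherited LC_ALL=C (for example from --action_env) is overridden rather than passed through, and the input mapping is copied, not mutated. A Bazel rule would express the same idea through the env of the action it creates.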

alexeagle · Dec 18 '24 00:12

@fmeum I'm sure you have informed guidance on what's the correct approach here.

alexeagle · Dec 18 '24 00:12

Fixed by #1053. However, that fix assumes that callers of the extract mode propagate the new default_env when spawning a tar action.

alexeagle · Mar 04 '25 01:03