Regression in 20250526.2 when unpacking non-ASCII file names
Description of the bug:
admin@ip-172-31-7-56:~/vc/ef2$ rm -rf ~/.cache/bazel*
admin@ip-172-31-7-56:~/vc/ef2$ bazelisk build -k $REDACTED/...
2025/06/11 09:49:36 Downloading https://releases.bazel.build/9.0.0/rolling/9.0.0-pre.20250526.2/bazel-9.0.0-pre.20250526.2-linux-x86_64...
2025/06/11 09:49:36 Skipping basic authentication for releases.bazel.build because no credentials found in /home/admin/.netrc
Downloading: 59 MB out of 59 MB (100%)
Extracting Bazel installation...
Starting local Bazel server (9.0.0-pre.20250526.2) and connecting to it...
INFO: Repository aspect_rules_js+ instantiated at:
<builtin>: in <toplevel>
Repository rule http_archive defined at:
/home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/bazel_tools/tools/build_defs/repo/http.bzl:394:31: in <toplevel>
ERROR: /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/bazel_tools/tools/build_defs/repo/http.bzl:139:45: An error occurred during the fetch of repository 'aspect_rules_js+':
Traceback (most recent call last):
File "/home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/bazel_tools/tools/build_defs/repo/http.bzl", line 139, column 45, in _http_archive_impl
download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error extracting /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/temp11524386281555661874/rules_js-v2.3.7.tar.gz to /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/temp11524386281555661874: [unix_jni.cc:281] /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/js/private/test/image/non_ascii/empty empty.?? (No such file or directory)
WARNING: errors encountered while analyzing target '//infra/bento_runner/insert:insert_lib', it will not be built.
no such package '@@aspect_rules_js+//npm': java.io.IOException: Error extracting /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/temp11524386281555661874/rules_js-v2.3.7.tar.gz to /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/temp11524386281555661874: [unix_jni.cc:281] /home/admin/.cache/bazel/_bazel_admin/fd974fe89a796f9c35a0944ed09934ee/external/aspect_rules_js+/js/private/test/image/non_ascii/empty empty.?? (No such file or directory)
WARNING: errors encountered while analyzing target 'REDACTED'
admin@ip-172-31-7-56:~/vc/ef2$ cat .bazelversion
9.0.0-pre.20250526.2
admin@ip-172-31-7-56:~/vc/ef2$ vi .bazelversion
admin@ip-172-31-7-56:~/vc/ef2$ bazelisk build -k REDACTED/...
2025/06/11 09:52:33 Downloading https://releases.bazel.build/9.0.0/rolling/9.0.0-pre.20250516.2/bazel-9.0.0-pre.20250516.2-linux-x86_64...
2025/06/11 09:52:33 Skipping basic authentication for releases.bazel.build because no credentials found in /home/admin/.netrc
Downloading: 59 MB out of 59 MB (100%)
Extracting Bazel installation...
Starting local Bazel server (9.0.0-pre.20250516.2) and connecting to it...
INFO: Analyzed 9 targets (369 packages loaded, 13407 targets and 37 aspects configured).
INFO: Found 9 targets...
INFO: Elapsed time: 219.457s, Critical Path: 70.04s
INFO: 989 processes: 263 internal, 726 linux-sandbox.
INFO: Build completed successfully, 989 total actions
admin@ip-172-31-7-56:~/vc/ef2$
Which category does this issue belong to?
regression
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No full repro yet; the last release appears to work normally on dev workstations. The above is from an AWS VM running Debian 12, which I occasionally use.
Which operating system are you running Bazel on?
Linux
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
I'm sorry; I don't have time for this right now.
Any other information, logs, or outputs that you want to share?
Maybe some kind of encoding issue determined by my locale settings?
$ set|grep -v CRED|grep '^[A-Z]'
BASH=/bin/bash
BASHOPTS=checkwinsize:cmdhist:complete_fullquote:expand_aliases:extglob:extquote:force_fignore:globasciiranges:globskipdots:histappend:interactive_comments:login_shell:patsub_replacement:progcomp:promptvars:sourcepath
BASH_ALIASES=()
BASH_ARGC=([0]="0")
BASH_ARGV=()
BASH_CMDS=()
BASH_COMPLETION_VERSINFO=([0]="2" [1]="11")
BASH_LINENO=()
BASH_LOADABLES_PATH=/usr/local/lib/bash:/usr/lib/bash:/opt/local/lib/bash:/usr/pkg/lib/bash:/opt/pkg/lib/bash:.
BASH_REMATCH=()
BASH_SOURCE=()
BASH_VERSINFO=([0]="5" [1]="2" [2]="15" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu")
BASH_VERSION='5.2.15(1)-release'
COLUMNS=119
COMP_WORDBREAKS=$' \t\n"\'><=;|&(:'
DIRSTACK=()
EUID=1000
GROUPS=()
HISTCONTROL=ignoreboth
HISTFILE=/home/admin/.bash_history
HISTFILESIZE=2000
HISTSIZE=1000
HOME=/home/admin
HOSTNAME=ip-172-31-7-56
HOSTTYPE=x86_64
IFS=$' \t\n'
LANG=C.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_PAPER=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LINES=36
LOGNAME=admin
LS_COLORS='rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=00:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.avif=01;35:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:*~=00;90:*#=00;90:*.bak=00;90:*.old=00;90:*.orig=00;90:*.part=00;90:*.rej=00;90:*.swp=00;90:*.tmp=00;90:*.dpkg-dist=00;90:*.dpkg-old=00;90:*.ucf-dist=00;90:*.ucf-new=00;90:*.ucf-old=00;90:*.rpmnew=00;90:*.rpmorig=00;90:*.rpmsave=00;90:'
MACHTYPE=x86_64-pc-linux-gnu
MAILCHECK=60
MOTD_SHOWN=pam
OLDPWD=/home/admin/vc
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
PIPESTATUS=([0]="0" [1]="0")
PPID=300743
PS1='\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
PS2='> '
PS4='+ '
PWD=/home/admin/vc/ef2
SHELL=/bin/bash
SHELLOPTS=braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor
SHLVL=1
SSH_CLIENT='62.216.209.72 12456 22'
SSH_CONNECTION='62.216.209.72 12456 172.31.7.56 22'
SSH_TTY=/dev/pts/0
TERM=xterm-256color
UID=1000
USER=admin
XDG_RUNTIME_DIR=/run/user/1000
XDG_SESSION_CLASS=user
XDG_SESSION_ID=19
XDG_SESSION_TYPE=tty
I saw a similar failure with the Go toolchain, which also contains an oddly named file.
What are the exact bytes in this filename? In particular, is it valid UTF-8? Which locale is the Bazel server running under?
(I suspect this might be https://github.com/bazelbuild/bazel/commit/9d83d95336721d2dd27a10dba03ed564155eb9d1 but I'd like to understand why this happens before rolling back.)
@tjgq https://github.com/aspect-build/rules_js/tree/main/js/private/test/image/non_ascii
If this is valid UTF-8 we should be able to get this to work without a rollback.
the Go SDK one was /home/admin/.cache/bazel/_bazel_admin/7779369af6aa65e0252b23aaeea35f7a/external/rules_go++go_sdk+go_sdk/test/fixedbugs/issue27836.dir/�foo.go
which appears on my console as
$ ls -l ./test/fixedbugs/issue27836.dir
total 8
-rw-rw-r-- 1 hanwen hanwen 352 Sep 24 2024 Þfoo.go
-rw-rw-r-- 1 hanwen hanwen 363 Sep 24 2024 Þmain.go
$ tar tvfz ~/Downloads/go1.24.4.linux-amd64.tar.gz | grep 27836.*foo | od -xc
0000000 722d 2d77 2d72 722d 2d2d 3020 302f 2020
- r w - r - - r - - 0 / 0
0000020 2020 2020 2020 2020 2020 3320 3235 3220
3 5 2 2
0000040 3230 2d35 3530 322d 2039 3132 333a 2037
0 2 5 - 0 5 - 2 9 2 1 : 3 7
0000060 6f67 742f 7365 2f74 6966 6578 6264 6775
g o / t e s t / f i x e d b u g
0000100 2f73 7369 7573 3265 3837 3633 642e 7269
s / i s s u e 2 7 8 3 6 . d i r
0000120 c32f 669e 6f6f 672e 0a6f
/ 303 236 f o o . g o \n
0000132
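For reference, the octal bytes 303 236 in the dump are 0xC3 0x9E, which is the UTF-8 encoding of U+00DE (Þ). The following is a minimal sketch for confirming that the name is valid UTF-8 but not representable in ASCII. It assumes an `iconv` binary is installed; the variable names are illustrative only.

```shell
# Bytes taken from the od dump above: octal 303 236 followed by "foo.go".
name_bytes=$(printf '\303\236foo.go')

# iconv exits non-zero if the input is not valid in the source encoding.
if printf '%s' "$name_bytes" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  utf8_valid=yes
else
  utf8_valid=no
fi

# The same bytes cannot be converted to plain ASCII.
if printf '%s' "$name_bytes" | iconv -f UTF-8 -t ASCII >/dev/null 2>&1; then
  ascii_ok=yes
else
  ascii_ok=no
fi

echo "valid UTF-8: $utf8_valid, representable in ASCII: $ascii_ok"
```

If the first check succeeds and the second fails, the archive entry itself is fine and the problem lies in how the extracting process interprets file names.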
My dev workstation has LANG=en_US.UTF8; I tried setting that on the AWS VM too, but it didn't make a difference.
Which locale is the Bazel server running under?
If it is not $LANG, how do I determine this?
(note: I edited the top comment to provide more info)
Could you share the output of bazel info character-encoding and locale -a?
admin@ip-172-31-7-56:~/vc/engflow$ locale -a
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
C
C.utf8
POSIX
admin@ip-172-31-7-56:~/vc/engflow$ bazel info character-encoding
file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1, sun.jnu.encoding = ANSI_X3.4-1968
sun.jnu.encoding = ANSI_X3.4-1968
This is the problem. Bazel tries to set this to en_US.ISO-8859-1 if available, but otherwise shouldn't touch it. Since your default locale is determined by C.UTF-8, that should really be what Bazel ends up using instead.
But locale prints errors since you request de_DE.UTF-8 for some aspects of locales without that locale being installed. Perhaps that messes up locale detection within Bazel?
Could you also post the output of locale charmap?
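As background for the locale question, POSIX resolves the character-type locale from LC_ALL first, then LC_CTYPE, then LANG, and falls back to "C" if none is set. The sketch below spells out that precedence. The function name `effective_ctype` is hypothetical, and this is not how Bazel itself detects the locale.

```shell
# Resolve the locale name that applies to LC_CTYPE, following the POSIX
# precedence: LC_ALL, then LC_CTYPE, then LANG, then the "C" default.
# The three values are passed as arguments so the function does not depend
# on the caller's environment.
effective_ctype() {
  lc_all=$1
  lc_ctype=$2
  lang=$3
  if [ -n "$lc_all" ]; then
    printf '%s\n' "$lc_all"
  elif [ -n "$lc_ctype" ]; then
    printf '%s\n' "$lc_ctype"
  elif [ -n "$lang" ]; then
    printf '%s\n' "$lang"
  else
    printf 'C\n'
  fi
}

# Values from the environment dump in this thread: LC_ALL and LC_CTYPE do
# not appear in it, so only LANG applies.
effective_ctype "" "" "C.UTF-8"
```

The resolved name only takes effect if that locale is installed (see `locale -a`). If it is not, programs typically stay in the "C" locale, whose charmap is ASCII.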
https://github.com/bazelbuild/bazel/pull/26261 reproduces the error.
$ locale charmap
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
ANSI_X3.4-1968
While Bazel could do a better job at forcing a UTF-8 locale, this does look like a setup error. This does print UTF-8 on default Debian, Ubuntu and macOS installations. If it doesn't, many programs (not just Bazel) will run into encoding issues.
It would be helpful if bazel printed this in an error message.
IIRC this is running the image for our remote execution setup, and it's probably not designed for logging in and doing user-things.
(I have no idea how to fix this; I can find out, but if you have a quick hint, that would be cool.)
Thanks for the rapid feedback!
admin@ip-172-31 sudo dpkg-reconfigure locales
$ export LC_ALL=en_US.UTF-8
admin@ip-172-31-7-56:~/vc/engflow$ bazel shutdown
admin@ip-172-31-7-56:~/vc/engflow$ bazel info character-encoding
Starting local Bazel server (9.0.0-pre.20250526.2) and connecting to it...
file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1, sun.jnu.encoding = UTF-8
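To check this from a script, the `sun.jnu.encoding` value can be pulled out of that line. This is a rough sketch that assumes the output keeps the comma-separated format shown in this thread; the helper name `jnu_encoding` is made up for illustration.

```shell
# Extract the sun.jnu.encoding value from a line in the format shown above.
jnu_encoding() {
  printf '%s\n' "$1" | sed -n 's/.*sun\.jnu\.encoding = \([^,]*\).*/\1/p'
}

# The two lines reported in this thread, before and after fixing the locale.
before='file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1, sun.jnu.encoding = ANSI_X3.4-1968'
after='file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1, sun.jnu.encoding = UTF-8'

jnu_encoding "$before"
jnu_encoding "$after"

# Against a live server this would be:
#   jnu_encoding "$(bazel info character-encoding)"
```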
@tjgq What do you think of showing a warning when the configured locale is ASCII only?
@hanwen-flow That looks good and should avoid these errors. I would recommend unsetting all locale variables except for an explicit LC_CTYPE=C.UTF-8. That should give you the most portable environment.
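A minimal sketch of that recommendation for a login shell follows. The list of variables is the usual set of POSIX and glibc locale variables, and it assumes C.UTF-8 is installed (it shows up as C.utf8 in the `locale -a` output earlier in this thread).

```shell
# Clear every standard locale variable so that none of them can point at a
# locale that is not installed.
unset LANG LANGUAGE LC_ALL LC_ADDRESS LC_COLLATE LC_IDENTIFICATION \
  LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER \
  LC_TELEPHONE LC_TIME

# Request UTF-8 character handling only.
export LC_CTYPE=C.UTF-8

# The thread restarts the Bazel server before re-checking the encoding:
#   bazel shutdown
#   bazel info character-encoding
```

Putting these lines in the shell profile of the VM image would make the setting persistent; whether that is appropriate for a remote execution image is a separate question.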
@tjgq What do you think of showing a warning when the configured locale is ASCII only?
Sounds good - would you like to send a ~~CL~~ PR?
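Until such a warning exists in Bazel itself, a wrapper script could do a similar check. The sketch below is a shell approximation, not Bazel's implementation; the function name and message text are hypothetical, and treating `US-ASCII` as an alias is an assumption.

```shell
# Print a warning when the given charmap name denotes plain ASCII.
# ANSI_X3.4-1968 is the name that `locale charmap` printed in this thread;
# US-ASCII is included as a common alias (assumption).
warn_if_ascii_only() {
  case $1 in
    ANSI_X3.4-1968|US-ASCII)
      echo "WARNING: locale charmap is $1; non-ASCII file names may fail to extract" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Typical use: warn_if_ascii_only "$(locale charmap 2>/dev/null)"
warn_if_ascii_only "ANSI_X3.4-1968" || echo "ASCII-only locale detected"
warn_if_ascii_only "UTF-8" && echo "locale is fine"
```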
A fix for this issue has been included in Bazel 8.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=8.4.0rc1. Thanks!