bat
bat copied to clipboard
so slow at alpine (linux/386)
What steps will reproduce the bug?
➜ ~ time bat helloWorld.c
───────┬───────────────────────────────────
│ File: helloWorld.c
───────┼───────────────────────────────────
1 │ #include<stdio.h>
2 │
3 │ int main(){
4 │ printf("Hello World!\n");
5 │ return 0;
6 │ }
───────┴───────────────────────────────────
bat helloWorld.c 5.26s user 0.07s system 99% cpu 5.383 total
➜ ~ time cat helloWorld.c
#include<stdio.h>
int main(){
printf("Hello World!\n");
return 0;
}
cat helloWorld.c 0.01s user 0.00s system 82% cpu 0.012 total
What happens? bat take so much time
What did you expect to happen instead? Be fast then 5s....
How did you install bat
?
Alpine:
echo "@testing http://dl-cdn.alpinelinux.org/alpine/v3.16/community" >> /etc/apk/repositories
apk add bat@testing
bat version and environment
➜ ~ bat --diagnostic
Software version
bat 0.18.1 (v20210212-7674-g0e2220d5bb)
Operating system
Operating system type
Command-line
bat --diagnostic
Environment variables
SHELL=/bin/zsh
PAGER=less
LESS=-R
BAT_PAGER=<not set>
BAT_CACHE_PATH=<not set>
BAT_CONFIG_PATH=<not set>
BAT_OPTS=<not set>
BAT_STYLE=<not set>
BAT_TABS=<not set>
BAT_THEME=<not set>
XDG_CONFIG_HOME=<not set>
XDG_CACHE_HOME=<not set>
COLORTERM=<not set>
NO_COLOR=<not set>
MANPAGER=<not set>
Config file
Could not read contents of '/root/.config/bat/config': No such file or directory (os error 2).
Compile time information
- Profile: release
- Target triple: i586-alpine-linux-musl
- Family: unix
- OS: linux
- Architecture: x86
- Pointer width: 32
- Endian: little
- CPU features: fxsr,sse,sse2
- Host: i586-alpine-linux-musl
Less version
> less --version
less 581 (POSIX regular expressions)
Copyright (C) 1984-2021 Mark Nudelman
less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Home page: https://greenwoodsoftware.com/less
-->
That is slow indeed :-)
Could you please run a few tests? Is it also slow if you run with bat --pager=never
?
If so, can you please run strace -f -c bat --pager=never helloWorld.c
to see if there are any syscalls that take a long time?
If possible, could you run perf record --call-graph=dwarf bat --pager=never helloWorld.c
and provide us the generated perf.data
file?
Unfortunately bat --pager=never
still slow, also both strace
and pref
show me sys call error.
➜ ~ strace -f -c bat --pager=never helloWorld.c
strace: test_ptrace_get_syscall_info: PTRACE_SETOPTIONS: Invalid argument
➜ ~ perf record --call-graph=dwarf bat --pager=never helloWorld.c
[1] 108 invalid system call perf record --call-graph=dwarf bat --pager=never helloWorld.c
Can you also try with the latest version of bat please?
Can you also try with the latest version of bat please?
I try build latest version, works but:
I build it like this: https://hub.docker.com/r/messense/rust-musl-cross/tags
x7mdnet@cloudshell:~/$ docker pull messense/rust-musl-cross:i586-musl
x7mdnet@cloudshell:~/$ alias rust-musl-builder='docker run --rm -it -v "$(pwd)":/home/rust/src messense/rust-musl-cross:i586-musl'
x7mdnet@cloudshell:~/$ git clone --recursive https://github.com/sharkdp/bat
x7mdnet@cloudshell:~/$ cd bat
x7mdnet@cloudshell:~/bat/$ rust-musl-builder cargo build --release --bins
x7mdnet@cloudshell:~/bat/$ file target/i586-unknown-linux-musl/release/bat
target/i586-unknown-linux-musl/release/bat: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
Can you also try with the latest version of bat please?
I try build latest version, works but:
![]()
I build it like this: https://hub.docker.com/r/messense/rust-musl-cross/tags
x7mdnet@cloudshell:~/$ docker pull messense/rust-musl-cross:i586-musl
x7mdnet@cloudshell:~/$ alias rust-musl-builder='docker run --rm -it -v "$(pwd)":/home/rust/src messense/rust-musl-cross:i586-musl'
x7mdnet@cloudshell:~/$ git clone --recursive https://github.com/sharkdp/bat x7mdnet@cloudshell:~/$ cd bat x7mdnet@cloudshell:~/bat/$ rust-musl-builder cargo build --release --bins
x7mdnet@cloudshell:~/bat/$ file target/i586-unknown-linux-musl/release/bat target/i586-unknown-linux-musl/release/bat: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
It's faster btw xD... but It's missing up the terminal without color.
./bat helloWorld.c 0.25s user 0.01s system 87% cpu 0.302 total
Fork it and build it using GitHub Action CI/CD
https://github.com/X7md/bat/actions/runs/5096205185?notification_referrer_id=NT_kwDOAzsSmbM2NTgzNzY3NDI5OjU0MjAzMDMz
still the same:
which terminal emulator are you using?
which terminal emulator are you using?
https://ish.app, but I don't think it's because of it
I tested in (TinyEMU alpine x86 a WebBase Linux Emulator) same results...
which terminal emulator are you using?
https://ish.app, but I don't think it's because of it
I tested in (TinyEMU alpine x86 a WebBase Linux Emulator) same results...
It's works and look pretty perfect (no weird output) through SSH btw in my pi zero aarch64
... so don't think it's from the terminal emulator
Seeing this issue, I wanted to see how musl (Alpine's libc implementation) stacked up against glibc.
Testing Methodolgy
Unfortunately, I don't have a non-64-bit x86 machine laying around. I used Docker's --platform=linux/i386
to pull in 32-bit libc versions, though.
- On a single machine, run two Docker instances.
- Instance A is using
alpine:latest
. - Instance B is using
debian:latest
.
- Instance A is using
- Install the latest version of hyperfine (1.18.0) using packages provided under the GitHub releases.
- Instance A used the
tgz
-packaged musl version. - Instance B used the
tgz
-packaged gnu version.
- Instance A used the
- Install the latest version of bat from the GitHub releases.
- Instance A used the
tgz
-packaged musl version. - Instance B used the
tgz
-packaged gnu version.
- Instance A used the
- Use
cat
to create yourhelloWorld.c
file. - Run
./hyperfine -N -w 100 -r 500 "./bat helloWorld.c --paging=never --terminal-width=80 --force-colorization"
on both.- 100 warm-up runs.
- 500 measurement runs.
- Does not create a new shell instance to run bat.
- Colorization is forced to prevent bat's simple printer from being used.
- The pager is disabled.
- The terminal width is assumed to be 80 characters.
Results
Alpine:
Time (mean ± σ): 15.7 ms ± 1.7 ms [User: 13.3 ms, System: 2.3 ms]
Range (min … max): 11.9 ms … 21.2 ms 500 runs
Debian:
Time (mean ± σ): 26.9 ms ± 2.6 ms [User: 16.6 ms, System: 10.1 ms]
Range (min … max): 20.6 ms … 39.1 ms 500 runs
The distro using glibc
is faster than musl
for small files, but it's nothing on the order of seconds different. I know I don't have the most ideal hardware to test this (since the kernel is still amd64
), but I feel like we can at least rule out the possibility of it being related to which libc implementation is being used.
Speculation
It's possible that building for i586
is disabling optimizations that use various SIMD extensions, including SSE (which was added with the i686/P6 architecture according to Wikipedia). SSE added 8 new registers and a bunch of instructions, and a quick and dirty dissasembly of the i686 release in this tarball shows that those features are used about 32000 times:
objdump --disassembler-color=extended -d bat-v0.24.0-i686-unknown-linux-gnu/bat | grep -E 'xmm|movss|movaps|movups|movlps|movhps|movlhps|movmskps|addss|subss|mulss|divss|rcpss|sqrtss|maxss|minss|rsqrtss|addps|subps|mulps|divps|sqrtps|maxps|minps|rsqrtps|cmpss|comiss|ucomiss|cmpps|shufps|unpckhps|unpcklps|cvtsi2ss|cvtss2si|cvttss2si|cvtpi2ps|cvtps2pi|cvttps2pi|andps|orps|xorps|andnps|pmulhuw|psadbw|pavgb|pavgw|pmaxub|pminub|pmaxsw|pminsw|pextrw|pinsrw|pmovmskb|pshufw|mxcsr|ldmxcsr|stmxcsr|movntq|movntps|maskmovq|prefetch0|prefetch1|prefetch2|prefetchnta|sfence' | wc -l
That's only about 3.5% of the total instruction count of that executable, though. I'd love to know if it makes that significant of a difference, but I don't have a toolchain to build for i586 set up at the moment.