
ape loader fails with busybox dd

Open mischief opened this issue 2 years ago • 3 comments

in a dockerfile like so:

FROM alpine:3

RUN apk add git make bash zip
RUN git clone --depth=1 https://github.com/jart/cosmopolitan.git /build/cosmopolitan
RUN make -C /build/cosmopolitan -j -l4 -O O=o//tool/net/redbean.com

we see:

o//third_party/quickjs/qjsc.com.tmp.68677: exec: line 13: /build/cosmopolitan/o/tmp/ape-loader: not found

`make MODE= -j16 o//third_party/quickjs/qjscalc.c` exited with 127:
o//third_party/quickjs/qjsc.com -fbignum -o o//third_party/quickjs/qjscalc.c -c third_party/quickjs/qjscalc.js
consumed 634µs wall time
ballooned to 1,088kb in size
needed 595us cpu (0% kernel)
caused 133 page faults (100% memcpy)
7 context switch (100% consensual)
performed 160 read and 0 write i/o operations

make: *** [third_party/quickjs/quickjs.mk:134: o//third_party/quickjs/qjscalc.c] Error 127
make: Entering directory '/build/cosmopolitan'
make: Leaving directory '/build/cosmopolitan'
The command '/bin/sh -c make -C /build/cosmopolitan -j -l4 -O O=o//tool/net/redbean.com' returned a non-zero code: 2

poking around the build dir by hand we can uncover:

o//test/libc/release/smoke-nms.com.tmp.70227: exec: line 13: /build/cosmopolitan/o/tmp/ape-loader: not found

`make MODE= -j16 o//test/libc/release/smoke-nms.com.runs` exited with 127:
o//test/libc/release/smoke-nms.com
consumed 634µs wall time
ballooned to 1,100kb in size
needed 546us cpu (0% kernel)
caused 132 page faults (100% memcpy)
7 context switch (100% consensual)
performed 160 read and 0 write i/o operations

make: *** [build/rules.mk:76: o//test/libc/release/smoke-nms.com.runs] Error 127
make: Leaving directory '/build/cosmopolitan'

stracing the build with strace -f -o /tmp/strace.txt make -C /build/cosmopolitan/ MODE= -j16 o//test/libc/release/smoke-nms.com.runs shows:

/ # grep dd /tmp/strace.txt | tail -n 10
70257 set_tid_address(0x7ffb9a458f90)   = 70257
70257 stat("/usr/local/sbin/dd", 0x7ffc5f877c40) = -1 ENOENT (No such file or directory)
70257 stat("/usr/local/bin/dd", 0x7ffc5f877c40) = -1 ENOENT (No such file or directory)
70257 stat("/usr/sbin/dd", 0x7ffc5f877c40) = -1 ENOENT (No such file or directory)
70257 stat("/usr/bin/dd", 0x7ffc5f877c40) = -1 ENOENT (No such file or directory)
70257 stat("/sbin/dd", 0x7ffc5f877c40)  = -1 ENOENT (No such file or directory)
70257 stat("/bin/dd", {st_mode=S_IFREG|0755, st_size=824984, ...}) = 0
70259 execve("/bin/dd", ["dd", "if=o//test/libc/release/smoke-nm"..., "of=/build/cosmopolitan/o/tmp/ape"..., "skip=    3200", "count=      28", "bs=64"], 0x7ffb9a3be2f8 /* 12 vars */) = 0
70259 set_tid_address(0x7f58fed0bf90)   = 70259
70259 write(2, "dd: invalid number '    3200'\n", 30) = 30

if i apk add coreutils, it succeeds.

i looked at the code in ape/ape.S, but i do not understand where the .shstub assembly directive comes from. it seems that busybox dd doesn't like spaces in number arguments, for example:

/ # dd 'skip=       1'
dd: invalid number '       1'

can we fix the script in such a way that it is compatible with busybox dd?

mischief, Mar 20 '22 22:03
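
To make the numbers in the execve line of the strace output above concrete: dd's skip= and count= are measured in blocks of bs bytes, so the two arguments translate to a byte offset and a byte length. The sketch below only does that arithmetic; the helper names are made up for illustration and are not part of Cosmopolitan or BusyBox.

```c
#include <stdint.h>

/* Hypothetical helper: byte offset where dd starts reading.
   skip= counts input blocks, each bs bytes long. */
static uint64_t dd_start_offset(uint64_t skip, uint64_t bs) {
  return skip * bs;
}

/* Hypothetical helper: number of bytes dd copies.
   count= also counts blocks of bs bytes. */
static uint64_t dd_length(uint64_t count, uint64_t bs) {
  return count * bs;
}
```

With skip=3200, count=28 and bs=64 as in the trace, the loader would sit at byte offset 204800 of the .com file and be 1792 bytes long.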

I run Alpine and I've never had any problems. Maybe that's an old version. The C standard mandates removing spaces. If we padded with zeros then some implementations would interpret it as octal. I tried a long time ago. BusyBox is at fault here.

jart, Apr 02 '22 18:04
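
The two parsing behaviours mentioned in the comment above can be illustrated with the standard strtol function, which skips leading whitespace and, when called with base 0, reads a leading 0 as an octal prefix. The sketch below contrasts that with a stricter parser that refuses anything not starting with a digit. The strict parser is a hypothetical stand-in written for this example; it is not BusyBox's actual code, and the thread does not show how BusyBox parses numbers.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* strtol-based parse: returns 1 and stores the value if the whole
   string is a number in the given base, 0 otherwise. strtol itself
   skips leading whitespace before the digits. */
static int parse_lenient(const char *s, int base, long *out) {
  char *end;
  errno = 0;
  long v = strtol(s, &end, base);
  if (end == s || *end != '\0' || errno == ERANGE) return 0;
  *out = v;
  return 1;
}

/* Hypothetical strict parser (an assumption, not BusyBox code):
   the first character must already be a decimal digit. */
static int parse_strict(const char *s, long *out) {
  if (!isdigit((unsigned char)s[0])) return 0;
  return parse_lenient(s, 10, out);
}
```

Under these assumptions, parse_lenient("    3200", 10, &v) succeeds with 3200, parse_lenient("0100", 0, &v) yields 64 because of the octal prefix, and parse_strict("    3200", &v) fails, which is the shape of the error reported in the issue.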

i just re-tested today with a fresh alpine vm (no docker) and it fails a similar way. i used alpine-standard-3.15.3-x86_64.iso for alpine, and it uses busybox 1.34.1 which is a stable release from sept 2021. in the vm, i installed only the same packages listed above, git make bash zip.

[screenshot: 2022-04-03-131413_976x1078_scrot]

mischief, Apr 03 '22 20:04

Hello all,

I'm still new here, trying to figure this all out, but thought to comment (if only to help me understand how this works) - so I have a couple comments/questions:

I presume that redbean.com (or the libc test suites?) requires a non-modifying version of ape.o (ape-no-modify-self.o) for some reason? Because it seems that if it were allowed to modify itself (no APE_LOADER= define), the dd requirement for running the external ape-loader would go away.

Otherwise, looking at ape.S, it seems that the two uses of dd, each with its own .shstub macro, are disjoint. According to a comment in ape/macros.internal.h under BCD support, only macOS has the problem of interpreting leading 0's as octal. If that's true, and busybox can't be fixed properly internally, it would seem that the following code in ape.lds could be changed to use a new SHSTUB02 macro (producing leading zeros) instead:

SHSTUB02(ape_loader_dd_skip, RVA(ape_loader) / 64);
SHSTUB02(ape_loader_dd_count, (ape_loader_end - ape_loader) / 64);      

Then add the following defines to ape/macros.internal.h, where the BCD0(X) and BCD010K(X) macros emit leading zeros:

#define SHSTUB02(SYM, X)           \
  HIDDEN(SYM##_bcs0 = BCD010K(X)); \
  HIDDEN(SYM##_bcs1 = BCD0(X))
#define BCD0(X)                                                                 \
  ((X) == 0                                                                    \
       ? 0x30303030                                                            \
       : (X) < 10 ? 0x30303030 + (((X) % 10) << 24)                            \
                  : (X) < 100 ? 0x30303030 + (((X) % 10) << 24) +              \
                                    (((X) / 10 % 10) << 16)                    \
                              : (X) < 1000 ? 0x30303030 + (((X) % 10) << 24) + \
                                                 (((X) / 10 % 10) << 16) +     \
                                                 (((X) / 100 % 10) << 8)       \
                                           : 0x30303030 + (((X) % 10) << 24) + \
                                                 (((X) / 10 % 10) << 16) +     \
                                                 (((X) / 100 % 10) << 8) +     \
                                                 (((X) / 1000 % 10) << 0))
#define BCD010K(X)                                                          \
  ((X) < 10000                                                             \
       ? 0x30303030                                                        \
       : (X) < 100000                                                      \
             ? 0x30303030 + (((X) / 10000 % 10) << 24)                     \
             : (X) < 1000000                                               \
                   ? 0x30303030 + (((X) / 10000 % 10) << 24) +             \
                         (((X) / 100000 % 10) << 16)                       \
                   : (X) < 10000000                                        \
                         ? 0x30303030 + (((X) / 10000 % 10) << 24) +       \
                               (((X) / 100000 % 10) << 16) +               \
                               (((X) / 1000000 % 10) << 8)                 \
                         : (X) < 100000000                                 \
                               ? 0x30303030 + (((X) / 10000 % 10) << 24) + \
                                     (((X) / 100000 % 10) << 16) +         \
                                     (((X) / 1000000 % 10) << 8) +         \
                                     (((X) / 10000000 % 10) << 0)          \
                               : 0xffffffffffffffff)

This code would never be executed on macOS, as it's protected by a preceding "if [ ! -d /Applications ]" check. macOS's use of dd would remain as currently coded; only the non-macOS (busybox) path would change.

This would likely not be a great idea for a production release of Cosmopolitan, just a thought to get a hacked busybox version running for @mischief.

Thank you!

ghaerr, Apr 03 '22 23:04
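
One way to sanity-check the BCD0 and BCD010K macros proposed in the comment above is to compile them in an ordinary C file and decode the two 32-bit words back into text. The sketch below copies the macros unchanged in their arithmetic and adds a hypothetical helper, shstub02_text, that lists each word's bytes lowest byte first. Two assumptions are baked in and not confirmed by the thread: that the words are stored little-endian, and that the _bcs0 word is emitted immediately before the _bcs1 word.

```c
#include <stdint.h>

/* Macros as proposed in the comment above. */
#define BCD0(X)                                                                 \
  ((X) == 0                                                                    \
       ? 0x30303030                                                            \
       : (X) < 10 ? 0x30303030 + (((X) % 10) << 24)                            \
                  : (X) < 100 ? 0x30303030 + (((X) % 10) << 24) +              \
                                    (((X) / 10 % 10) << 16)                    \
                              : (X) < 1000 ? 0x30303030 + (((X) % 10) << 24) + \
                                                 (((X) / 10 % 10) << 16) +     \
                                                 (((X) / 100 % 10) << 8)       \
                                           : 0x30303030 + (((X) % 10) << 24) + \
                                                 (((X) / 10 % 10) << 16) +     \
                                                 (((X) / 100 % 10) << 8) +     \
                                                 (((X) / 1000 % 10) << 0))
#define BCD010K(X)                                                          \
  ((X) < 10000                                                             \
       ? 0x30303030                                                        \
       : (X) < 100000                                                      \
             ? 0x30303030 + (((X) / 10000 % 10) << 24)                     \
             : (X) < 1000000                                               \
                   ? 0x30303030 + (((X) / 10000 % 10) << 24) +             \
                         (((X) / 100000 % 10) << 16)                       \
                   : (X) < 10000000                                        \
                         ? 0x30303030 + (((X) / 10000 % 10) << 24) +       \
                               (((X) / 100000 % 10) << 16) +               \
                               (((X) / 1000000 % 10) << 8)                 \
                         : (X) < 100000000                                 \
                               ? 0x30303030 + (((X) / 10000 % 10) << 24) + \
                                     (((X) / 100000 % 10) << 16) +         \
                                     (((X) / 1000000 % 10) << 8) +         \
                                     (((X) / 10000000 % 10) << 0)          \
                               : 0xffffffffffffffff)

/* Hypothetical helper: write the 8 characters the two words would
   occupy in the file, assuming little-endian storage and bcs0
   (high digits) placed before bcs1 (low digits). */
static void shstub02_text(unsigned x, char out[9]) {
  uint32_t hi = (uint32_t)BCD010K(x);
  uint32_t lo = (uint32_t)BCD0(x);
  for (int i = 0; i < 4; i++) {
    out[i] = (char)((hi >> (8 * i)) & 0xff);     /* bytes of bcs0 */
    out[4 + i] = (char)((lo >> (8 * i)) & 0xff); /* bytes of bcs1 */
  }
  out[8] = '\0';
}
```

Under those assumptions, 3200 decodes to "00003200" and 28 to "00000028", so the skip and count values from the strace output would reach dd without leading spaces.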

I'm looking at this too. do these integer arguments really need to be quoted? sure, the macro expressions are a little smaller when right-justified, but it should suffice to left-justify the digits and have the user's shell consume the spaces to the right. for example:

# before
skip="      76" count="     128" bs=64
# after
skip=76       count=128     bs=64

I'm trying this out right now and I'll make a PR if no one / no tests have any objections.

notwa, Sep 06 '22 21:09
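
The left-justification idea in the comment above can be mimicked in plain C: render each number in a fixed 8-column field padded on the right, then split the resulting line on spaces the way an unquoted shell command line would be split. This is only an illustration. The function names are invented here, strtok is a crude stand-in for real shell field splitting, and the actual stub is produced by linker-script arithmetic rather than snprintf.

```c
#include <stdio.h>
#include <string.h>

/* Render x left-justified in an 8-character field, space padded.
   Values of 100000000 or more would be truncated by the buffer. */
static void render_left(unsigned x, char out[9]) {
  snprintf(out, 9, "%-8u", x);
}

/* Build an unquoted argument line such as "skip=76       count=...". */
static void build_dd_args(unsigned skip, unsigned count, char *out, size_t n) {
  char s[9], c[9];
  render_left(skip, s);
  render_left(count, c);
  snprintf(out, n, "skip=%s count=%s bs=64", s, c);
}

/* Crude stand-in for shell field splitting: split on spaces in place
   and return the number of words found (at most max). */
static int split_words(char *line, char *words[], int max) {
  int n = 0;
  for (char *w = strtok(line, " "); w && n < max; w = strtok(NULL, " "))
    words[n++] = w;
  return n;
}
```

Splitting the line built for skip 76 and count 128 gives three words, "skip=76", "count=128" and "bs=64", with the padding gone, so dd would never see the spaces.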

If you can compute the math to do left alignment then it'd be nice to save a few bytes in the bootloader.

jart, Sep 07 '22 01:09
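
As a rough sketch of "the math to do left alignment", the functions below compute, with integer arithmetic only, the two 32-bit words whose bytes spell a number left-justified in 8 columns. Everything here is hypothetical: the names are invented, values below 100000000 and little-endian byte order are assumed, and a linker script cannot loop, so a real version would have to unroll this into nested conditionals. Whether that ends up saving bytes in the bootloader is not something this sketch establishes.

```c
#include <stdint.h>

/* Number of decimal digits in x (1 for x == 0). */
static int digit_count(uint32_t x) {
  int n = 1;
  while (x >= 10) {
    x /= 10;
    n++;
  }
  return n;
}

/* 10 raised to a small non-negative power. */
static uint32_t pow10u(int e) {
  uint32_t p = 1;
  while (e-- > 0) p *= 10;
  return p;
}

/* Character at column i (0 = leftmost) when x is left-justified:
   a digit while i is within the number, a space afterwards. */
static uint32_t left_char(uint32_t x, int i) {
  int n = digit_count(x);
  if (i >= n) return ' ';
  return '0' + (x / pow10u(n - 1 - i)) % 10;
}

/* Pack columns first..first+3 into one word, lowest byte first. */
static uint32_t left_word(uint32_t x, int first) {
  uint32_t w = 0;
  for (int i = 0; i < 4; i++) w |= left_char(x, first + i) << (8 * i);
  return w;
}
```

For example, left_word(76, 0) is 0x20203637 (the bytes '7', '6', ' ', ' ') and left_word(76, 4) is 0x20202020 (four spaces).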