musl-cross-make

NATIVE inside chroot

Open firasuke opened this issue 4 years ago • 3 comments

Hey there,

What's the general approach for getting mcm to build a native toolchain capable of running inside a chroot environment (kind of similar to the LFS toolchain approach)?

Would it really be a native build in this case, or a cross-native? Will there also be a need to modify the GCC sources accordingly (to point cross compiled tools to musl's dynamic linker (loader) to be able to use them inside the chroot environment)? Will there be a need for a host /output symlink in this case (again similar to how LFS does it)?

How would C++ support be enabled in this case? Would a single pass GCC suffice in this case (which is highly unlikely)? If it won't suffice then a 2 pass GCC should be a must to accomplish this cross-native toolchain with C++ support, with the first GCC being really basic (static with most stuff disabled)...

I've managed to build a similar toolchain manually, for a project of mine (glaucus), you can find the toolchain over at: https://github.com/glaucuslinux/toolchain

but I'm interested in seeing how mcm approaches this, and if it'll work as a suitable alternative for my toolchain.

The closest scenario I've seen where mcm is being used is toybox's mkroot, but I'd like to hear your thoughts on this.

firasuke avatar May 15 '20 14:05 firasuke

Have you tried? No change should be needed to do this. /lib/ld-musl-$(ARCH)$(SUBARCH).so.1 needs to be present in the chroot. You should not change this pathname; the only plausibly reasonable usage case for changing it is a non-chroot setup where you're testing (binaries that won't be portable) in your homedir without root, or similar.
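As a rough illustration of the loader requirement above, here is a minimal sketch of a check that the loader is present under a chroot directory. The helper names `musl_loader_path` and `chroot_has_loader` are hypothetical and not part of mcm or musl. The check also accepts a dangling symlink, on the assumption that the loader is installed as an absolute symlink to libc.so, which may only resolve once inside the chroot.

```shell
#!/bin/sh
# Hypothetical helper: print the loader path for a given ARCH and optional SUBARCH.
musl_loader_path() {
    # $1 = ARCH (e.g. x86_64), $2 = SUBARCH (optional, e.g. hf)
    printf '/lib/ld-musl-%s%s.so.1\n' "$1" "${2:-}"
}

# Hypothetical helper: succeed if the loader exists under the chroot directory.
# -L is tested as well because an absolute symlink can look dangling from outside.
chroot_has_loader() {
    # $1 = chroot directory, $2 = ARCH, $3 = SUBARCH (optional)
    loader=$(musl_loader_path "$2" "${3:-}")
    [ -e "$1$loader" ] || [ -L "$1$loader" ]
}
```

Usage would look like `chroot_has_loader /path/to/chroot x86_64 && echo "loader present"`.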

Of course C++ and everything else work. I'm not sure what you mean by "2 pass". In order to build a NATIVE toolchain to begin with, you need to already have a cross compiler toolchain for TARGET, so if you're counting building that as a pass then it's "2 pass". But there is no need for the LFS Rube Goldberg machine. Some parts of LFS are useful, but ignore everything about how it builds GCC and glibc multiple times because it's wrong. Not doing that was the whole point of musl-cross-make.

richfelker avatar May 15 '20 18:05 richfelker

I see, so /lib/ld-musl-$(ARCH)$(SUBARCH).so.1 is sacred and shouldn't be touched (except maybe if /usr/lib/ld-musl-$(ARCH)$(SUBARCH).so.1 is desired...). It should also be available at chroot time in the directory where the chroot will be performed, so it either has to be the result of installing packages directly into that directory and then chrooting into it, or it lives in a separate directory (which is how LFS does it with /tools), with lib or $TUPLE/lib then maybe "relatively" symlinked into the directory where the chroot will be performed...

Yes, that's what I meant by 2 pass: a first compilation that creates the cross compiler for the target, where only $TARGET is specified (and usually $BUILD=$HOST), and a second compilation that creates the native compiler to be used inside the chroot environment (where $BUILD!=$HOST and $HOST=$TARGET).
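The relationship between the three tuples can be sketched as a small classifier. This is only an illustration: the function name `classify_toolchain` is made up, and the labels follow the terminology I recall from GCC's configure-terms documentation (native, cross, crossed native, crossback, canadian).

```shell
#!/bin/sh
# Hypothetical helper: name the kind of build from BUILD, HOST and TARGET tuples.
classify_toolchain() {
    # $1 = BUILD, $2 = HOST, $3 = TARGET
    if [ "$1" = "$2" ] && [ "$2" = "$3" ]; then
        echo "native"
    elif [ "$1" = "$2" ]; then
        echo "cross"            # runs on the build machine, emits code for TARGET
    elif [ "$2" = "$3" ]; then
        echo "crossed native"   # built elsewhere, runs on and targets HOST
    elif [ "$1" = "$3" ]; then
        echo "crossback"
    else
        echo "canadian"
    fi
}

# First pass: the cross compiler built on the glibc host.
classify_toolchain x86_64-pc-linux-gnu x86_64-pc-linux-gnu x86_64-linux-musl
# Second pass: the compiler meant to run inside the chroot.
classify_toolchain x86_64-pc-linux-gnu x86_64-linux-musl x86_64-linux-musl
```

Under this naming, the second pass described above is a "crossed native" build rather than a plain native one.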

I also have to agree that the LFS project overcomplicated everything, but they have a working project, so I can't argue with what works (at least it's correct in a certain context). Plus, maximum host isolation is preferred by some users (with the exception of the /tools symlink created on the host system).

So if I understood correctly, to do this we basically need two stages (b can be grouped with c):

a. A cross-compilation stage targeting musl (let's assume that the musl target is x86_64-linux-musl):

[ NOTE ] This stage will most likely happen on a Glibc host system where both $BUILD and $HOST are equal to some gnu tuple (let's say x86_64-pc-linux-gnu which is what most Glibc based distributions are going with these days).

  1. The first thing to be built is a cross binutils that targets x86_64-linux-musl and has BUILD=HOST=x86_64-pc-linux-gnu. This binutils will be usable only on the Glibc host but not inside chroot, so it should be linked against the host's C library. A default template for configuring this cross binutils should be along these lines:
/some_path/src/binutils/configure \

    --build=x86_64-pc-linux-gnu \ ### This is mostly guessed correctly so it can be omitted

    --host=x86_64-pc-linux-gnu \ ### This is mostly guessed correctly so it can be omitted

    --target=x86_64-linux-musl \ ### This is needed for all this magic to happen

    --prefix=/some_path/tools \ ### either an empty prefix with "DESTDIR=/some_path/tools" or use this prefix, without a DESTDIR. I've noticed that you went with empty prefixes and chose to specify a DESTDIR, any reason behind that besides preference?

    --libdir=/some_path/tools/lib \ ### Just in case it decides to act funny on us and tries to use "lib64" instead of "lib". Setting this to "--libdir=/lib" is wrong, because it'll assume it's the host's "/lib" and it won't be automatically appended to our prefix "/some_path/tools"...

    --disable-werror \ <-- Is this even needed at this point?

    --with-lib-path==/some_path/tools/lib \ ### The double "=" is intended to show a correct "SEARCH_DIR" in the form of "=/some_path/tools/lib" instead of "/some_path/tools/lib" when running "x86_64-linux-musl-ld --verbose | grep SEARCH_DIR"... Also, shouldn't this be equal to "=/lib" only, since "x86_64-linux-musl-ld" search paths will be automatically prefixed by "SYSROOT"?

    --with-sysroot=/some_path \ ### the most problematic flag... here it's set to the base directory of "/tools", which is what LFS does, whereas in mcm it is set to $(TARGET), so it becomes $(OUTPUT)/$TARGET upon installation (equivalent in our example to "/some_path/tools/x86_64-linux-musl")... shouldn't this flag eventually point to "/some_path/tools/x86_64-linux-musl"? Also, will "x86_64-linux-musl" prefix it to the value provided to "--with-lib-path" above?

    --disable-multilib <-- for pure 64-bit support
  2. The second thing to be built is a cross/static gcc that only knows how to build musl, and has most of its features removed. This cross gcc will likewise only target x86_64-linux-musl and will only be usable on the host system and not inside the chroot environment, as it'll be linked against the host's C library. A default template for configuring this cross/static gcc should be along these lines:
CFLAGS='-g0 -O0' \ ### Do not optimize cross/static GCC because it only knows how to build musl, and nothing else (shouldn't cross GCC build native GCC as well)? This will bloat the toolchain, unless "install-strip" is specified...

CXXFLAGS='-g0 -O0' \ ### Do not optimize cross/static GCC because it only knows how to build musl, and nothing else (shouldn't cross GCC build native GCC as well)? This will bloat the toolchain, unless "install-strip" is specified...

LDFLAGS=-s \ ### Is this even needed?

/some_path/src/gcc/configure \

    --build=x86_64-pc-linux-gnu \ ### This is mostly guessed correctly so it can be omitted

    --host=x86_64-pc-linux-gnu \ ### This is mostly guessed correctly so it can be omitted

    --target=x86_64-linux-musl \ ### This is needed for all this magic to happen

    --prefix=/some_path/tools \ ### either an empty prefix with "DESTDIR=/some_path/tools" or use this prefix, without a DESTDIR. I've noticed that you went with empty prefixes and chose to specify a DESTDIR, any reason behind that besides preference?

    --libexecdir=/some_path/tools/lib \ ### because we're modern

    --libdir=/some_path/tools/lib \ ### won't have as much effect as patching "t-linux64" for removing "lib64", and both won't work on a 64-bit Glibc host with lib64 on (e.g. Fedora)

    --with-local-prefix=/some_path/tools \ ### LFS specifies this to keep "/usr/local" outside of this cross gcc search path, but is it even needed?

    --with-native-system-header-dir=/tools/include \ ### Another problematic flag, since it's getting prefixed by the even more troublesome flag "--with-sysroot", this would amount to "/some_path/tools/include"... in the case of mcm it looks in "/some_path/tools/x86_64-linux-musl/include" instead due to sysroot being "/x86_64-linux-musl"?

    --disable-shared \ ### since it's a static GCC

    --disable-multilib \ ### since it's a pure 64-bit build

    --disable-cet \ ### combined with latest upstream patch for cet to work with GCC 10.1.0

    --enable-threads=single \ ### "--disable-threads" is an alias for this option, to disable threads support

    --with-arch=x86-64 \ ### further optimize for our example target

    --disable-bootstrap \ ### speed up build time at the expense of ???

    --enable-languages=c,c++ \ ### To be able to build the 2nd pass (native) gcc, because GCC has parts of it written in C++ and C++ support is needed in cross GCC

    --disable-libsanitizer \ ### needed when using musl as the default C library

    --disable-libssp \ ### Disable unneeded feature here

    --disable-libquadmath \ ### Disable unneeded feature here

    --disable-libgomp \ ### Disable unneeded feature here

    --disable-libvtv \ ### Disable unneeded feature here

    --disable-werror \ ### Specify this to prevent some errors?

    --disable-nls \ ### Disable unneeded feature here

    --disable-decimal-float \ ### Disable unneeded feature here

    --disable-plugin \ ### Is plugin support even needed?

    --disable-lto \ ### Is this even needed here, as musl doesn't build with LTO enabled (the following full featured native GCC will have LTO enabled for sure)?

    --with-sysroot=/some_path \ ### Again should this be the base directory for "/tools" or should it be the "/x86_64-linux-musl" inside "/some_path/tools" (which is our prefix)...

    --without-headers \ ### Disable header support

    --with-newlib \ ### Resulting GCC will still get built against Glibc...

    --disable-libatomic \ ### Disable unneeded feature here

    --disable-libstdcxx ### This doesn't need to be enabled for GCC to be built with C++ support, or is it? This would require libstdc++-v3 to be installed later on after musl has been built (which is what LFS does)...

For building cross GCC, the *_FOR_TARGET variables need to be set to the newly created cross binutils tools (are these *_FOR_TARGET variables even needed here?):

  make \
    AR_FOR_TARGET=x86_64-linux-musl-ar \
    AS_FOR_TARGET=x86_64-linux-musl-as \
    LD_FOR_TARGET=x86_64-linux-musl-ld \
    NM_FOR_TARGET=x86_64-linux-musl-nm \
    OBJCOPY_FOR_TARGET=x86_64-linux-musl-objcopy \
    OBJDUMP_FOR_TARGET=x86_64-linux-musl-objdump \
    RANLIB_FOR_TARGET=x86_64-linux-musl-ranlib \
    READELF_FOR_TARGET=x86_64-linux-musl-readelf \
    STRIP_FOR_TARGET=x86_64-linux-musl-strip
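The list above follows a fixed pattern, so it could also be generated from the target tuple. A minimal sketch, assuming the same nine tools as above; the function name `for_target_vars` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print TOOL_FOR_TARGET=<tuple>-tool assignments, one per line.
for_target_vars() {
    # $1 = target tuple, e.g. x86_64-linux-musl
    for tool in ar as ld nm objcopy objdump ranlib readelf strip; do
        # Upper-case the tool name to form the variable prefix.
        upper=$(printf '%s' "$tool" | tr '[:lower:]' '[:upper:]')
        printf '%s_FOR_TARGET=%s-%s\n' "$upper" "$1" "$tool"
    done
}
```

The output could then be passed on the command line, for example `make $(for_target_vars x86_64-linux-musl)`, which relies on none of the values containing spaces.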

Regarding the installation for both cross binutils and cross gcc, no DESTDIR is required if --prefix is specified. In mcm's case --prefix is left empty while DESTDIR is specified.

Also for cross gcc, several sources (e.g. CLFS Embedded x86) add all-gcc all-target-libgcc to make for both building and installation instead of the regular make && make install; is this actually needed?

b. A middle stage where the C library musl is built. --host is chosen over --target (similar to what mcm does), as this is a native build and because --target is automatically guessed? A basic template for configuring musl should be along these lines:

  CROSS_COMPILE=x86_64-linux-musl- \ ### This can be removed if --host is detected correctly as x86_64-linux-musl, otherwise it should be left here...

  ./configure \

    --host=x86_64-linux-musl \ ### should this be `--target=x86_64-linux-musl` instead?

    --prefix=/some_path/tools \ ### again, empty prefix and DESTDIR situation as above

    --syslibdir=/some_path/tools/lib \ ### Specifying --prefix and --syslibdir, will get musl installed (+ libc.so) in --prefix as intended and will get the tricky linker installed in the host's /usr/lib (which is why make install requires sudo)...

    --disable-wrapper \ ### This disables wrappers for both `clang` (as it won't be used) and `gcc` (as the `musl-gcc` isn't suitable as a distribution toolchain + it lacks good C++ support).

    --disable-static ### musl will be dynamically linked

For musl's installation, DESTDIR can be specified with make install even when --prefix is used above (a trailing slash is needed when both DESTDIR and prefix are used together; otherwise it'll install everything into $DESTDIR$PREFIX). This will also symlink the libc.so residing in $DESTDIR$PREFIX to $DESTDIR alone.
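To make the prefix versus DESTDIR question concrete, here is a small sketch of how the two compose under the usual GNU convention: files are copied to DESTDIR followed by the prefix, and only the prefix is what gets configured into the build. The function `staged_path` is hypothetical and only concatenates strings.

```shell
#!/bin/sh
# Hypothetical helper: where a file lands on disk for a given DESTDIR and prefix.
staged_path() {
    # $1 = DESTDIR (may be empty), $2 = prefix (may be empty), $3 = path below prefix
    printf '%s%s/%s\n' "$1" "$2" "$3"
}

# Empty prefix plus DESTDIR, as mcm does it:
staged_path /some_path/tools "" lib/libc.so
# Prefix only, no DESTDIR, as in the templates above:
staged_path "" /some_path/tools lib/libc.so
# Both set: the file lands under DESTDIR followed by the prefix.
staged_path /stage /some_path/tools lib/libc.so
```

The first two calls give the same location on disk; the difference is that in the first case the installed files are configured to live at `/` rather than at `/some_path/tools`.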

Dragora also seems to use this patch:

https://notabug.org/dragora/dragora/src/master/patches/musl/musl-nolibcc_stage1.diff

I also noticed that in mcm you're specifying --with-build-sysroot when building GCC, is it because you're pausing GCC before musl and resuming it after musl is built?

c. The final stage is the native toolchain stage. This is the stage we actually need; is it true that building a native toolchain from the start, without going through the hassle of building our own cross toolchain, will result in a toolchain that won't work inside the chroot environment? Why is the cross toolchain even needed at this point? Why wouldn't a native toolchain with a single GCC build work at this point?

  • This native stage will be built entirely with the cross compilation toolchain, so everything built here will have the following variables (maybe set as env variables):
AR=x86_64-linux-musl-ar
AS=x86_64-linux-musl-as
CC=x86_64-linux-musl-gcc
CPP="x86_64-linux-musl-gcc -E"
CXX=x86_64-linux-musl-g++
LD=x86_64-linux-musl-ld
NM=x86_64-linux-musl-nm
OBJCOPY=x86_64-linux-musl-objcopy
OBJDUMP=x86_64-linux-musl-objdump
RANLIB=x86_64-linux-musl-ranlib
READELF=x86_64-linux-musl-readelf
SIZE=x86_64-linux-musl-size
STRINGS=x86_64-linux-musl-strings
STRIP=x86_64-linux-musl-strip
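One way to set these as environment variables, sketched under the assumption that the cross tools are named `<tuple>-<tool>` and sit on PATH. Both function names are hypothetical; `missing_cross_tools` only reports which of the expected tools cannot be found.

```shell
#!/bin/sh
# Hypothetical helper: export the tool variables listed above for a target tuple.
export_cross_env() {
    t=$1
    export AR="$t-ar" AS="$t-as" CC="$t-gcc" CPP="$t-gcc -E" CXX="$t-g++"
    export LD="$t-ld" NM="$t-nm" OBJCOPY="$t-objcopy" OBJDUMP="$t-objdump"
    export RANLIB="$t-ranlib" READELF="$t-readelf" SIZE="$t-size"
    export STRINGS="$t-strings" STRIP="$t-strip"
}

# Hypothetical helper: print every expected cross tool that is not on PATH.
missing_cross_tools() {
    for tool in ar as gcc g++ ld nm objcopy objdump ranlib readelf size strings strip; do
        command -v "$1-$tool" >/dev/null 2>&1 || echo "$1-$tool"
    done
}
```

A typical use would be `export_cross_env x86_64-linux-musl` followed by `missing_cross_tools x86_64-linux-musl`, which prints nothing when the cross toolchain's bin directory is on PATH.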
  1. Is it required to build libstdc++-v3 at this point with the cross GCC in order to build native GCC (since parts of GCC are written in C++)? We couldn't build libstdc++-v3 earlier because it depends on the C library (musl), and musl wasn't available then. (I understand that libstdc++-v3 is part of GCC and would get built automatically when C++ language support is enabled and --disable-libstdcxx isn't specified.)

  2. For native binutils, the configuration is exactly the same as for cross binutils, except that HOST is now also equal to x86_64-linux-musl. Also, after its installation, is it actually necessary to rebuild only binutils's ld with LIB_PATH changed to accommodate the lib directory inside the chroot environment?

  3. For the final native gcc, the configuration will be similar to cross gcc (with regards to directory layout) with most features turned back on, with the exception of --disable-libsanitizer and --disable-werror since these are needed to get it to link against musl with no issues.

Also, are GCC modifications actually needed to get it to work with the custom /usr/lib (particularly MUSL_DYNAMIC_LINKER64 to /usr/lib/ld-musl-x86_64.so.1)?
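Whatever loader path ends up compiled in, it can be read back from a built binary. A minimal sketch that extracts the requested interpreter from `readelf -l` output; the function name `interp_from_readelf` is hypothetical, and the `[Requesting program interpreter: ...]` line is the format binutils readelf prints for a PT_INTERP segment.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -l` output on stdin, print the interpreter path.
interp_from_readelf() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}
```

For example, `x86_64-linux-musl-readelf -l /path/to/binary | interp_from_readelf` would print the loader the binary asks for, which can be compared against what exists in the chroot. Statically linked binaries have no such line, so nothing is printed for them.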

I apologize for making this really long, but there are multiple sources out there that are mostly outdated, and I'd expect the author of musl to be the highest authority when it comes to making anything relating to musl.

firasuke avatar May 15 '20 23:05 firasuke

To sum things up, one basically has to create a cross compiler targeting x86_64-linux-musl that runs on the host system with mcm using:

make TARGET=x86_64-linux-musl install

which will be installed in output, then use the cross compiler created above to create a native compiler using:

PATH=/path_to_mcm/output/bin:$PATH make TARGET=x86_64-linux-musl NATIVE=1 install

which will be installed in output-x86_64-linux-musl.
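The two steps can be written down as a dry-run sketch that only prints the commands rather than running them. The function name `mcm_native_plan` is hypothetical, and the output directory names are the ones mentioned in this thread, not something the script verifies.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the two mcm invocations for a native toolchain.
mcm_native_plan() {
    # $1 = path to the musl-cross-make checkout, $2 = target tuple
    # Step 1: cross compiler, installed into <checkout>/output.
    printf '(cd %s && make TARGET=%s install)\n' "$1" "$2"
    # Step 2: native compiler built with the cross compiler from step 1 on PATH,
    # installed into <checkout>/output-<tuple>.
    printf '(cd %s && PATH=%s/output/bin:$PATH make TARGET=%s NATIVE=1 install)\n' "$1" "$1" "$2"
}

mcm_native_plan /path_to_mcm x86_64-linux-musl
```

Note that there must be no space after `PATH=`; with a space the shell would clear PATH for the command and then try to execute the bin directory itself.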

I noticed the existence of a CROSS_COMPILE flag in addition to TARGET and NATIVE, does this have any effect on the resulting native toolchain?

Also, I suppose the resulting native toolchain in output-x86_64-linux-musl won't be able to run on the host system, so the cross compiler in output also has to cross compile the basic tools needed to set up a chroot environment (shell, userspace utilities...) and install them in output-x86_64-linux-musl, because this will become our chroot directory?

firasuke avatar May 17 '20 11:05 firasuke