tendra icon indicating copy to clipboard operation
tendra copied to clipboard

Bootstrap-rebuild

Open partouf opened this issue 5 years ago • 23 comments

At least fixes https://github.com/tendra/tendra/issues/7

Although no idea if this is the correct way of course - With these changes there is no reference to errno anymore in the symbol table of the object files (well, the 1 I tested with readelf) and now contains a reference to __errno_location instead.

Either way, the original error doesn't occur anymore, but here's the next one:

==> Linking src/tcc
/opt/compiler-explorer/tendra-source/tendra/obj.ubi2-bootstrap/bin/tcc -Yposix -Xp  -o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/tcc /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/_partial/src/shared/_partial.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/archive.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/compile.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/environ.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/execute.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/filename.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/flags.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/lexer.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/list.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/main.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/options.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/stages.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/startup.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/utility.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/table.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/hash.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/temp.o 
/usr/lib/i386-linux-gnu/libc_nonshared.a(atexit.oS): In function `atexit':
(.text+0x11): undefined reference to `__dso_handle'
/usr/bin/ld: /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/tcc: hidden symbol `__dso_handle' isn't defined
/usr/bin/ld: final link failed: Bad value
*** Error code 1

partouf avatar Aug 14 '20 17:08 partouf

Trying to grasp some of this atexit/dso_handle thing - reading https://stackoverflow.com/questions/34308720/where-is-dso-handle-defined and https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor-runtime-api .

From what I'm getting is that it's perhaps just a matter of defining __dso_handle and a dummy implementation (unless a c++ program), will try that out.

partouf avatar Aug 14 '20 17:08 partouf

I was pondering about __dso_handle some more and realized that maybe we could just do what Tendra handles crt0 crt1 and crtn. But I don't know what the logic should be of when to add them and when not.

But that (https://github.com/tendra/tendra/pull/37/commits/ff820323faf782adfb52451d7d5c3c1765d19db3) was the last step I had to do into making bootstrap-rebuild work.

I do wonder if there's not a better way of handling this. The paths for crtbegin and end are very specific to which gcc and libraries are used. Maybe there should be some sort of a --gcc-toolchain thing like Clang has.

partouf avatar Aug 15 '20 17:08 partouf

btw I'm using Linux version 4.15.0-88-generic (buildd@lgw01-amd64-036) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #88-Ubuntu SMP

And this is my system's gcc atm:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

But this is my compiler-explorer VM, so for testing purposes, I have most of the compilers the live site has too.

partouf avatar Aug 15 '20 17:08 partouf

Hm! I see what you're doing here. But we definitely can't depend on gcc's crt files :) (And yes, the paths will be different for each gcc).

We could provide our own. We already do that for some systems - e.g. https://github.com/tendra/tendra/tree/main/osdep/machines/solaris/sparc/src

If this is the way to go, I think it would be just the same idea. I think crtbegin.s (and therefore __dso_handle) would be for C++, and I'm not sure what we'd need to provide for it.

There also seems to be /usr/lib/i386-linux-gnu/crt1.o

katef avatar Aug 17 '20 01:08 katef

/usr/lib/i386-linux-gnu/crt1.o comes from libc6-dev:i386, I think this is the right thing to depend on.

katef avatar Aug 17 '20 06:08 katef

Hm! I see what you're doing here. But we definitely can't depend on gcc's crt files :) (And yes, the paths will be different for each gcc).

We could provide our own. We already do that for some systems - e.g. https://github.com/tendra/tendra/tree/main/osdep/machines/solaris/sparc/src

If this is the way to go, I think it would be just the same idea. I think crtbegin.s (and therefore __dso_handle) would be for C++, and I'm not sure what we'd need to provide for it.

There also seems to be /usr/lib/i386-linux-gnu/crt1.o

Although it's true that dso_handle is for the C++ ABI, it is required by libc.

Here's a tiny discussing between some LLVM people that needed to start implementing it http://llvm.1065342.n5.nabble.com/llvm-dev-Providing-dso-handle-in-LLVM-td109516.html

The best thing todo would probably be to put it in TCC as a generation step when __dso_handle is not defined by the user code (as I think you're supposed to be able to implement it yourself if you want alternate behaviour), but I think that's a little complicated. So perhaps the best thing would be to have some custom code to be linked alongside like crt1.s in the example of sparc, indeed.

There are some more things aside from __dso_handle though that might be more complicated. Will try it out and include the other symbols that might be needed, and then figure out what they're for.

partouf avatar Aug 17 '20 15:08 partouf

The llvm thread there is great, good find. What they're saying about being part of the C++ ABI makes sense. It sounds like providing our own __dso_handle is the "right" thing to do.

One last question, do you think we should be using /usr/lib/i386-linux-gnu/crt1.o for any other reason?

katef avatar Aug 18 '20 06:08 katef

Took me a while to realize a lot of code is not actually used or built any more in the project, plus the fact that the code needs to be compiled for multiple architectures.. And there doesn't seem to be a good way to determine the right directory from the default files for this either.

partouf avatar Aug 21 '20 00:08 partouf

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

katef avatar Aug 21 '20 03:08 katef

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

I would assume so.

partouf avatar Aug 21 '20 03:08 partouf

Ok, long story short. I have an implementation (that compiles, but segfaults) - but am now realizing the value of __dso_handle needs to be filled with an address. Not sure how I'm going to do that yet, will need to sleep on it.

partouf avatar Aug 21 '20 04:08 partouf

Figured out how to fill the address, but it didn't solve the segfault. So I guess I'll be debugging gcc+glibc sources to find out what's going on :'(

partouf avatar Aug 21 '20 09:08 partouf

Ok, long story short. I have an implementation (that compiles, but segfaults) - but am now realizing the value of __dso_handle needs to be filled with an address. Not sure how I'm going to do that yet, will need to sleep on it.

Now fixed, apparently putting the data in .sbss causes issues in 32bit mode or something, so I now put it in .data instead and that seems to work fine.

partouf avatar Aug 21 '20 10:08 partouf

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

Finding out that it's not. /usr/lib32 seems to be the better choice. But this is a native path for x86/64 systems, so it wouldn't work for cross compilation from other platforms I think?

I think for now it's better to go for /usr/lib32 instead.

Note: Added bootstrap-rebuild to the Github Actions

partouf avatar Aug 21 '20 10:08 partouf

So the last step in this is to re-examine the errno problem and test it on other platforms

partouf avatar Aug 21 '20 10:08 partouf

But this is a native path for x86/64 systems, so it wouldn't work for cross compilation from other platforms I think?

I think this is fine - because we'd only be setting this path in an env file under the x32_64 target. We wouldn't set this path for either the native i386 or amd64 (when it exists) targets.

katef avatar Aug 22 '20 04:08 katef

Yes exactly, agree. I was looking if there might be a better way to query the system about these paths but haven't found any yet

Sent from my iPhone

On 22 Aug 2020, at 06:11, Katherine [email protected] wrote:

 But this is a native path for x86/64 systems, so it wouldn't work for cross compilation from other platforms I think?

I think this is fine - because we'd only be setting this path in an env file under the x32_64 target. We wouldn't set this path for either the native i386 or amd64 (when it exists) targets.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

partouf avatar Aug 22 '20 11:08 partouf

Ok then, maybe fixed?

I found out there really is no need to refer to __errno_location, all I needed to do was +IMPLEMENT "c/c89", "errno.h.ts" ; I saw a lot of other h.ts files do that, but errno.h.ts did not.

Will this cause any issues you think @katef ?

partouf avatar Aug 24 '20 20:08 partouf

I don't actually know Why this causes issues during rebuild, because the handling for an extra extern int errno causes no problems in the situation that I thought was happening. The macro seems to have precedence in all situations. https://godbolt.org/z/oEvdqY (because the preprocessor makes the errno uses go away and replaces them with __errno_location I assume)

So there must be something "intelligent" going on here during rebuild that makes it use extern int errno instead of __errno_location.

(as soon as you set -Yposix it does the wrong thing https://godbolt.org/z/6z6695) (and funnily enough if you do specs in a reversed order (so -Yposix -Yc89) it works ok again https://godbolt.org/z/YbEqvf)

partouf avatar Aug 27 '20 12:08 partouf

Decided to run -E on this

#include <errno.h>

int main() {
    errno = 12;
    return errno;
}

Which results in the following: https://godbolt.org/z/8aoxTE Left is -Yposix as last argument (results in the bad asm), Right is -Yc89 as last (results in correct asm).

Which makes it clear that only the last spec mentioned is inlined, and apparently has some kind of say in what happens next if it sees the errno symbol. And the macros from the /usr/include/errno.h are not applied in either version yet, is there an extra preprocess round after -E in TenDRA ?

GCC 1.27 and other versions simply show this or something similar after -E:

extern int *__errno_location (void)    ;

int main() {
    (*__errno_location ())  = 12;
    return (*__errno_location ()) ;
}

(Even if it would include /usr/include/errno.h after this step, I would have still expected somewhat of a correct behaviour like in this-> https://godbolt.org/z/b53o8s as in, it keeps the extern errno which will cause issues during linking, but the macro still results in it using __errno_location in the right places)

partouf avatar Aug 27 '20 14:08 partouf

In IEEE Std 1003.1-2008 (and later editions as well) this is said https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/basedefs/errno.h.html#tag_13_10 : The <errno.h> header shall provide a declaration or definition for errno. The symbol errno shall expand to a modifiable lvalue of type int. It is unspecified whether errno is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.

Interesting to note: Some of the functionality described on this reference page extends the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2008 defers to the ISO C standard.

I realize was different from IEEE Std 1003.1-1988, but TenDRA only has 1 spec implementation. And the way it's implemented in TenDRA seems to be incompatible with most modern unix's in the case of errno.

And since technically +EXP lvalue int errno; isn't totally incorrect (extern int errno is also an lvalue int), and it works for both old and new posix versions, and that TenDRA requires it in the makefiles for building - how about this: https://github.com/tendra/tendra/pull/37/commits/cdf27f19009f8a7d15f5dc0d8b7f43520af78f78

So that with just posix, you get: tcc -Yposix -S main.c

main:
        call __errno_location
        movl %eax,%edx
        movl $12,(%edx)
        call __errno_location
        movl %eax,%edx
        movl (%edx),%eax
        ret

and with strict_posix you get: tcc -Ystrict_posix -S main.c

main:
        movl $12,errno
        movl errno,%eax
        ret

partouf avatar Aug 27 '20 17:08 partouf

That's a clever idea, with -Ystrict_posix, but I'm afraid I'm not comfortable with that. We already have an idea of strictness, (by way of the -X flags), and I don't like mixing that up with well-defined APIs.

About errno, i propose a few things:

  1. -Yposix is correct, we should leave that as it is. It represents the 1988 spec.
  2. We should provide additional posix revisions. We have a few already, but we'll need the rest anyway.
  3. tdfc2 doesn't actually use errno in the same files which depend on posix - for example in system.c. So let's just have -Yposix for those files specifically.
  4. I would still like to try to remove the dependency where we can, which is what #11 is about.

Especially, the idea of API specs being correct is important for TenDRA; it's what the whole system is based around.

Let's do (3) first (and on a separate branch), I think it makes life easier. I'll try to do that myself this evening, if I can.

katef avatar Aug 28 '20 00:08 katef

I moved out discussion of -Yposix's errno to #57 - because it doesn't help here at all, but it's its own problem regardless. And in #58 I've made the -Yposix flags per file.

Sorry if it seems like a lot of fuss, but I'd like to spend a little more time thinking about this. I'm sure we can find a solution somehow that doesn't depend on implementing a whole new tspec API (yet), and also doesn't involve changing what the -Yposix API claims.

katef avatar Aug 28 '20 06:08 katef