0.7 macos build with openjdk claims test failures (misc/datagrams)

Open swadey opened this issue 12 years ago • 17 comments

I'm getting test errors when I try to build avian v0.7 for macos (10.8.4):

            ------- Java tests -------
               AllFloats: success
             Annotations: success
                  Arrays: success
              BitsetTest: success
                 Buffers: success
               Datagrams: fail
             DefineClass: success
            DivideByZero: success
             EnumSetTest: success
                   Enums: success
              Exceptions: success
              FileOutput: success
                   Files: success
              Finalizers: success
                  Floats: success
                      GC: success
                   Hello: success
            Initializers: success
                Integers: success
                     JNI: success
             LazyLoading: success
                    List: success
                 Logging: success
                   Longs: success
                    Misc: fail
             NullPointer: success
             OutOfMemory: success
               Processes: success
                 Proxies: success
              References: success
              Reflection: success
                  Simple: success
           StackOverflow: success
                 Strings: success
              Subroutine: success
                  Switch: success
                 Threads: success
                   Trace: success
                    Tree: success
              UnsafeTest: success
                 UrlTest: success
                     Zip: success

see log.txt for output
make: *** [test] Error 255

JAVA_HOME is:

JAVA_HOME=/Users/swade/compile/openjdk/build/macosx-x86_64/j2sdk-image

Running make like this:

make openjdk="$(pwd)/../openjdk/build/macosx-x86_64/j2sdk-image" openjdk-src="$(pwd)/../openjdk/jdk/src" test

Is this expected?

swadey avatar Jul 27 '13 16:07 swadey

I haven't previously compiled an openjdk-src build on osx. How are you getting openjdk7 compiled? It looks like the page we link to in the readme for building openjdk (https://wikis.oracle.com/display/OpenJDK/Mac+OS+X+Port) has recently been updated for jdk8, which seems to have a completely different output directory layout. Even going back several versions to the jdk7 instructions (yay for wikis!), I get build errors.

joshuawarner32 avatar Jul 27 '13 20:07 joshuawarner32

I built it following the jdk7 instructions I found on the wiki. I got it to build successfully and avian did too until I hit the tests.

thanks, wade

swadey avatar Jul 28 '13 03:07 swadey

After correcting the jdk build problems (actually, just looking at the most recent version of the jdk7 build instructions), I'm now getting a single test failure (Files, which looks a bit suspicious):

./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Files
java/lang/UnsatisfiedLinkError: java/io/FileInputStream.read()I
  at java/io/FileInputStream.read (native)
  at java/io/FileInputStream.read (native)
  at Files.main (line 64)

I say that it's suspicious, because the openjdk source doesn't declare a read()I native method, but rather a read0()I native method, which the read()I java method defers to. I've confirmed that THAT method was correctly linked into the executable, and as far as I can tell, the FileInputStream class that avian is using to build is indeed the one from openjdk. So... I'm stumped.

@swadey would you mind trying with master rather than 0.7? I'm wondering if aab1b6e087a72a29051b43d438850ec13156c4bf, in particular, would fix your problem.

joshuawarner32 avatar Jul 29 '13 04:07 joshuawarner32

Sorry about the delay. I've been out of commission for a while. I cloned the head and I'm still getting the same failure with head:

linking build/darwin-x86_64-openjdk-src/avian-unittest
            ------- Unit tests -------
               ArgParser: success
          BasicAssembler: success
        ArchitecturePlan: success
        RegisterIterator: success

            ------- Java tests -------
               AllFloats: success
             Annotations: success
                  Arrays: success
              BitsetTest: success
                 Buffers: success
             Collections: success
               Datagrams: fail
             DefineClass: success
            DivideByZero: success
             EnumSetTest: success
                   Enums: success
              Exceptions: success
              FileOutput: success
                   Files: success
              Finalizers: success
                  Floats: success
                      GC: success
                   Hello: success
            Initializers: success
                Integers: success
                     JNI: success
             LazyLoading: success
                    List: success
                 Logging: success
                   Longs: success
                    Misc: fail
             NullPointer: success
             OutOfMemory: success
               Processes: success
                 Proxies: success
              References: success
              Reflection: success
                  Simple: success
           StackOverflow: success
                 Strings: success
              Subroutine: success
                  Switch: success
                 Threads: success
                   Trace: success
                    Tree: success
              UnsafeTest: success
                 UrlTest: success
                     Zip: success

see log.txt for output
make: *** [test] Error 255

swadey avatar Aug 10 '13 23:08 swadey

Darn. All I can think to do at the moment is to offer guidance in debugging the problem, if you don't mind doing some leg work.

The first thing I would do is run those two tests individually to see the output (as log.txt is all run together, with no separators). Something like:

./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test <test_name>

My guess is that it's crashing with either an assertion failure or a segfault. Next, I would make sure that the problem is reproducible in debug mode (adding mode=debug to the make command). You should then be able to run avian under gdb (I don't know how familiar you are with debugging native code), e.g.:

gdb --args <command_to_run_avian>

At the prompt, enter "r" (for "run"). If the problem is a segfault or assertion failure, gdb should catch the signal and pause the program at the problem spot. You can inspect the stack with "bt" (for "backtrace"). If you could copy the stacktrace here (whether it's a java-level one displayed when the test runs, or the native one obtained with gdb) for both failing tests, that'd be great.
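
Since log.txt runs all the output together, the first step above (running each failing test on its own) can be scripted. The following is only a rough sketch: the helper names `failing_tests` and `rerun_commands` are made up for illustration, and the default build directory is simply the one used in this thread. It reads the summary that `make test` prints and builds one avian command line per failing test.

```python
import re

# Matches summary lines such as "   Datagrams: fail" from the test run output.
RESULT_LINE = re.compile(r"^\s*(\w+):\s*(success|fail)\s*$")


def failing_tests(summary):
    """Return the names of tests reported as 'fail', in order of appearance."""
    names = []
    for line in summary.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(2) == "fail":
            names.append(match.group(1))
    return names


def rerun_commands(summary, build_dir="./build/darwin-x86_64-openjdk-src"):
    """Build one avian command line per failing test.

    build_dir defaults to the directory used in this thread (an assumption);
    pass the debug build directory when building with mode=debug.
    """
    return [
        f"{build_dir}/avian -cp {build_dir}/test {name}"
        for name in failing_tests(summary)
    ]


if __name__ == "__main__":
    sample = """\
            ------- Java tests -------
                 Buffers: success
               Datagrams: fail
                    Misc: fail
                     Zip: success
    """
    for command in rerun_commands(sample):
        print(command)
```

Each printed command can then be run directly, or passed to `gdb --args` as described above.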

joshuawarner32 avatar Aug 11 '13 00:08 joshuawarner32

Running the two failing tests yields:


[splunk:~/compile/avian]⇒ ./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
[1]    63564 abort      ./build/darwin-x86_64-openjdk-src/avian -cp  Datagrams
[splunk:~/compile/avian]⇒ ./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Misc
java.lang.RuntimeException
        at Misc.syncStatic(Misc.java:73)
        at Misc.main(Misc.java:171)
Sat Aug 10 21:42:11 EDT 2013
x
true
42
123456789012345
75.62
75.62
hi
java/lang/NoClassDefFoundError: Misc$?Class
  at Misc.main (line 249)

Datagrams causes the abort. So I ran that in GDB. See the attached stack trace:

(gdb) set args -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
(gdb) r
Starting program: /Users/swade/compile/avian/build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
Reading symbols for shared libraries ++++++............................................................ done
Reading symbols for shared libraries ........................................................................................ done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000000000000001a
setIntField (t=0x104808c00, arguments=0x7fff5fbfeb30) at machine.h:3671
3671      if (UNLIKELY(fieldFlags(t, field) & ACC_VOLATILE)) {
(gdb) bt
#0  setIntField (t=0x104808c00, arguments=0x7fff5fbfeb30) at machine.h:3671
#1  0x0000000100091355 in vmRun ()
#2  0x000000010007274a in vm::Thread::Checkpoint::~Checkpoint () at /Users/swade/compile/avian/src/avian/machine.h:1950
#3  0x000000010007274a in vm::Thread::RunCheckpoint::~RunCheckpoint () at /Users/swade/compile/avian/src/avian/machine.h:1455
#4  0x000000010007274a in vm::run () at /Users/swade/compile/avian/src/avian/machine.h:1950
#5  0x000000010007274a in SetIntField (t=0x104808c00, o=0x7fff5fbfeb30, field=1, v=116396736) at jnienv.cpp:1881
#6  0x00000001000b6695 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr (env=0x104808c00, this=0x106f012c0, host=0x7fff5fbfed10) at Inet6AddressImpl.c:350
#7  0x00000001000912e4 in .Lcall () at abort.h:25
#8  0x0000000100002ce2 in vm::dynamicCall (function=0x104808c00, arguments=0x10, argumentTypes=0x7fff5fbff008 "(P?\t\001", argumentCount=116396736, unnamed_arg=1606413112, returnType=1606413040) at x86.h:197
#9  0x0000000100055f4f in invokeNativeSlow (method=0x108111f20, t=0x104808c00, function=0x7fff5fbfef50) at compile.cpp:7778
#10 0x0000000100031bcd in invokeNative (t=0x104808c00) at compile.cpp:7850
#11 0x000000010500007b in ?? ()
#12 0x00000001000552b4 in vm::Thread::Checkpoint::~Checkpoint () at /Users/swade/compile/avian/src/avian/machine.h:8669
#13 0x00000001000552b4 in invoke (thread=0x104808c00, method=0x108112f08, arguments=0x104808c00) at compile.cpp:2271
#14 0x000000010003020c in vm::Thread::Protector::~Protector () at /Users/swade/compile/avian/src/avian/machine.h:9170
#15 0x000000010003020c in vm::Thread::SingleProtector::~SingleProtector () at /Users/swade/compile/avian/src/avian/machine.h:1379
#16 0x000000010003020c in invokeList (method=0x1080c9920, this=0x104808c00, t=0x104808c00, this_=0x0, arguments=0x7fff5fbff4d8) at compile.cpp:9170
#17 0x000000010006b91e in callStaticVoidMethodV (t=0x10, arguments=0x108111f20) at jnienv.cpp:1434
#18 0x0000000100091355 in vmRun () at abort.h:25
#19 0x00000001000741e3 in vm::Thread::Checkpoint::~Checkpoint () at /Users/swade/compile/avian/src/avian/machine.h:1950
#20 0x00000001000741e3 in vm::Thread::RunCheckpoint::~RunCheckpoint () at /Users/swade/compile/avian/src/avian/machine.h:1455
#21 0x00000001000741e3 in vm::run () at /Users/swade/compile/avian/src/avian/machine.h:1950
#22 0x00000001000741e3 in CallStaticVoidMethodV (t=0x104808c00, unnamed_arg=0x108111f20, m=4430010656, a=0x106f012c0) at jnienv.cpp:1444
#23 0x00000001000c61e6 in JNIEnv_::CallStaticVoidMethod (this=0x104808c00, cls=0x108111f20, methodID=0x9) at jni.h:1516
#24 0x00000001000c60a9 in JNIEnv_::ExceptionCheck () at /Users/swade/compile/openjdk/build/macosx-x86_64/j2sdk-image/include/jni.h:282
(gdb) 

swadey avatar Aug 11 '13 01:08 swadey

Thanks, @swadey.

With respect to the first test (Misc), it looks like there's some problem with the encoding of non-ascii characters, either in Avian itself or perhaps on the file system. I'm curious if you have Misc$μClass.class in build/darwin-x86_64-openjdk-src/test. It might also be interesting to turn on DebugFind and DebugStat in finder.cpp.

A few things jump out at me in the Datagrams case. First, the stack trace mentions the IPv6 implementation rather than IPv4. I wonder if it hasn't been exercised before, and that's why I'm not able to reproduce the problem. Perhaps disabling IPv6, either on the system or in the code, could confirm this.

The stack trace is a bit weird; I suspect that optimization might be corrupting it. Building with mode=debug should help with this. If the stack trace can be trusted, it looks like the "field" in SetIntField is probably what's causing the segfault (i.e. it's null). It would be good to confirm this both in setIntField and in Java_java_net_Inet6AddressImpl_lookupAllHostAddr, in the openjdk code.

joshuawarner32 avatar Aug 11 '13 02:08 joshuawarner32

On Sat, 10 Aug 2013, S Wade wrote:

Running the two failing tests yields:

[splunk:~/compile/avian]⇒ ./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
[1]    63564 abort      ./build/darwin-x86_64-openjdk-src/avian -cp  Datagrams
[splunk:~/compile/avian]⇒ ./build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Misc
java.lang.RuntimeException
        at Misc.syncStatic(Misc.java:73)
        at Misc.main(Misc.java:171)
Sat Aug 10 21:42:11 EDT 2013
x
true
42
123456789012345
75.62
75.62
hi
java/lang/NoClassDefFoundError: Misc$?Class
  at Misc.main (line 249)

Looks like an issue with how the VM is passing filenames with Unicode characters to I/O system calls and/or libc calls. We had issues with this on Windows, but this is the first time I've seen it on OS X. What is $LANG set to on your system? If it's not e.g. en_US.UTF-8, could you try setting it to that and running the test again?
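
As a small illustration of how a '?' can end up in a name like `Misc$?Class`: when text is converted to an encoding that cannot represent a character, a lossy conversion typically substitutes a replacement character. The sketch below shows that effect in isolation. It assumes the character in the class name is U+03BC (Greek small mu), and `encode_lossy` is a hypothetical helper; it does not show where, or whether, such a conversion happens inside Avian.

```python
# Class name from the Misc test, assuming the non-ASCII character is U+03BC.
CLASS_NAME = "Misc$\u03bcClass"


def encode_lossy(text, encoding):
    """Encode text, substituting '?' for characters the encoding cannot represent."""
    return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    # An ASCII-only conversion loses the mu and leaves a '?' in its place.
    print(encode_lossy(CLASS_NAME, "ascii"))  # b'Misc$?Class'
    # UTF-8 can represent the character, so nothing is replaced.
    print(encode_lossy(CLASS_NAME, "utf-8"))
```

If the name is mangled this way before it reaches the file system, a lookup for the class file would fail even though the file exists on disk.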

Datagrams causes the abort. So I ran that in GDB. See the attached stack trace:

(gdb) set args -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
(gdb) r
Starting program: /Users/swade/compile/avian/build/darwin-x86_64-openjdk-src/avian -cp ./build/darwin-x86_64-openjdk-src/test Datagrams
Reading symbols for shared libraries ++++++............................................................ done
Reading symbols for shared libraries ........................................................................................ done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000000000000001a
setIntField (t=0x104808c00, arguments=0x7fff5fbfeb30) at machine.h:3671
3671      if (UNLIKELY(fieldFlags(t, field) & ACC_VOLATILE)) {
(gdb) bt
#0  setIntField (t=0x104808c00, arguments=0x7fff5fbfeb30) at machine.h:3671
#1  0x0000000100091355 in vmRun ()
#2  0x000000010007274a in vm::Thread::Checkpoint::~Checkpoint () at /Users/swade/compile/avian/src/avian/machine.h:1950
#3  0x000000010007274a in vm::Thread::RunCheckpoint::~RunCheckpoint () at /Users/swade/compile/avian/src/avian/machine.h:1455
#4  0x000000010007274a in vm::run () at /Users/swade/compile/avian/src/avian/machine.h:1950
#5  0x000000010007274a in SetIntField (t=0x104808c00, o=0x7fff5fbfeb30, field=1, v=116396736) at jnienv.cpp:1881
#6  0x00000001000b6695 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr (env=0x104808c00, this=0x106f012c0, host=0x7fff5fbfed10) at Inet6AddressImpl.c:350

That's helpful, but could you try it again after building Avian with mode=debug (and running ./build/darwin-x86_64-debug-openjdk-src/test instead)?

dicej avatar Aug 11 '13 02:08 dicej

@joshuawarner32 and @dicej

Thanks! I see that \mu is not getting encoded properly. It's coming up as '?'. My LANG is set to 'C'. I still have problems when I set LANG to 'en_US.UTF-8'.

Here's the result with debug on:

Starting program: /Users/swade/compile/avian/build/darwin-x86_64-debug-openjdk-src/avian -cp ./build/darwin-x86_64-debug-openjdk-src/test Datagrams
Reading symbols for shared libraries . done

Program received signal SIGABRT, Aborted.
0x00007fff93b05212 in __pthread_kill ()
(gdb) bt
#0  0x00007fff93b05212 in __pthread_kill ()
#1  0x00007fff953f3b54 in pthread_kill ()
#2  0x00007fff95437dce in abort ()
#3  0x000000010000198d in abort (this=0x104507f90) at posix.cpp:934
#4  0x0000000100036dff in avian::util::abort<vm::Thread*> (t=0x104808e08) at abort.h:24
#5  0x0000000100036e43 in avian::util::expect<vm::Thread*> (t=0x104808e08, v=false) at abort.h:31
#6  0x0000000100036e79 in avian::util::assert<vm::Thread*> (t=0x104808e08, v=false) at abort.h:40
#7  0x00000001000b1fe4 in getField (t=0x104808e08, f=0) at jnienv.cpp:1535
#8  0x00000001000ad405 in setIntField (t=0x104808e08, arguments=0x7fff5fbfe5e8) at jnienv.cpp:1863
#9  0x00000001000e1600 in vmRun ()
#10 0x0000000100037a60 in vm::runRaw (t=0x104808e08, function=0x1000ad3d0 <setIntField>, arguments=0x7fff5fbfe5e8) at machine.h:1950
#11 0x0000000100037ad2 in vm::run (t=0x104808e08, function=0x1000ad3d0 <setIntField>, arguments=0x7fff5fbfe5e8) at machine.h:1957
#12 0x00000001000ad3c6 in SetIntField (t=0x104808e08, o=0x104521a48, field=0, v=1) at jnienv.cpp:1881
#13 0x0000000100122ae8 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr (env=0x104808e08, this=0x7fff5fbfed10, host=0x7fff5fbfed18) at Inet6AddressImpl.c:350
#14 0x00000001000e158f in .Lcall () at finder.h:121
#15 0x0000000100004582 in vm::dynamicCall (function=0x100121e90, arguments=0x7fff5fbfea70, argumentTypes=0x7fff5fbfea60 "\a\a\a\004\001", argumentCount=3, unnamed_arg=24, returnType=7) at x86.h:197
#16 0x0000000100001ea1 in call (this=0x104507f90, function=0x100121e90, arguments=0x7fff5fbfea70, types=0x7fff5fbfea60 "\a\a\a\004\001", count=3, size=24, returnType=7) at posix.cpp:782
#17 0x00000001000879de in invokeNativeSlow (t=0x104808e08, method=0x104c09528, function=0x100121e90) at compile.cpp:7778
#18 0x000000010008853d in invokeNative2 (t=0x104808e08, method=0x104c09528) at compile.cpp:7850
#19 0x000000010006f1ce in invokeNative (t=0x104808e08) at compile.cpp:7882
#20 0x000000010500007b in ?? ()
#21 0x000000010008664f in invoke (thread=0x104808e08, method=0x104bc0728, arguments=0x7fff5fbff218) at compile.cpp:8669
#22 0x000000010006e905 in invokeList (this=0x10450ac28, t=0x104808e08, method=0x104bc0728, this_=0x0, indirectObjects=true, arguments=0x7fff5fbff650) at compile.cpp:9170
#23 0x00000001000aeb17 in callStaticVoidMethodV (t=0x104808e08, arguments=0x7fff5fbff4a0) at jnienv.cpp:1434
#24 0x00000001000e1600 in vmRun () at alloc-vector.h:102
#25 0x0000000100037a60 in vm::runRaw (t=0x104808e08, function=0x1000aea70 <callStaticVoidMethodV>, arguments=0x7fff5fbff4a0) at machine.h:1950
#26 0x0000000100037ad2 in vm::run (t=0x104808e08, function=0x1000aea70 <callStaticVoidMethodV>, arguments=0x7fff5fbff4a0) at machine.h:1957
#27 0x00000001000aea65 in CallStaticVoidMethodV (t=0x104808e08, unnamed_arg=0x104508768, m=9, a=0x7fff5fbff650) at jnienv.cpp:1444
#28 0x000000010013d5d0 in JNIEnv_::CallStaticVoidMethod (this=0x104808e08, cls=0x104508768, methodID=0x9) at jni.h:1516
#29 0x000000010013cd87 in main (ac=4, av=0x7fff5fbff830) at main.cpp:282

swadey avatar Aug 11 '13 03:08 swadey

I'm not sure where to go with the Misc test from here. My hunch is that it wouldn't be reproducible on a fresh OS install. I wonder if the standard Java VM (hotspot) would work in that scenario. You should be able to run Misc directly with the 'java' command.

With respect to the Datagrams test, it'd be good to see how that field id becomes 0 (which is apparently an invalid field id). Looking at the openjdk code, it seems that's only set in Java_java_net_Inet6Address_init. Either that method is never called, or the value Avian is giving for that field is actually 0. Which of these is the case would be telling (gdb to the rescue!).
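
The two hypotheses can be told apart with a toy model of the pattern in question: native code that caches a field id in an init routine and uses it later. Everything in this sketch is hypothetical (`CachedFieldId`, `diagnose`, and the use of 0 as the "unset" value are illustrative stand-ins, not Avian or openjdk code); it only shows that "init never ran" and "lookup returned 0" produce the same invalid id at the point of use, so the cause has to be checked at the init site.

```python
UNSET = 0  # stands in for a null/invalid field id


class CachedFieldId:
    """Toy stand-in for native code that caches a field id in an init routine."""

    def __init__(self):
        self.field_id = UNSET
        self.init_called = False

    def init(self, lookup_field_id):
        # Models the init routine: look the id up once and remember it.
        self.init_called = True
        self.field_id = lookup_field_id()

    def set_int_field(self, target, value):
        # Models the VM-side check that rejects an invalid (zero) field id.
        if self.field_id == UNSET:
            reason = "lookup returned 0" if self.init_called else "init was never called"
            raise RuntimeError(f"invalid field ID: {reason}")
        target[self.field_id] = value


def diagnose(cache):
    """Report which of the two hypotheses applies, or None if the id is valid."""
    try:
        cache.set_int_field({}, 1)
    except RuntimeError as error:
        return str(error)
    return None


if __name__ == "__main__":
    never_initialized = CachedFieldId()
    print(diagnose(never_initialized))

    bad_lookup = CachedFieldId()
    bad_lookup.init(lambda: 0)
    print(diagnose(bad_lookup))
```

In the real VM the equivalent check would be a breakpoint on the init function: if it is never hit, the first hypothesis holds.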

joshuawarner32 avatar Aug 12 '13 23:08 joshuawarner32

@joshuawarner32 I'm having the same failure on the Files test that you note above. The exact same problem arises when I try to build my code with an embedded JVM.

I was wondering whether you had figured out a fix?

Thanks.

nyjle avatar Aug 26 '13 17:08 nyjle

@nyjle, I looked briefly into the problem and didn't find anything. At least now we know that it's not a one-off problem. I'll take another look at it.

joshuawarner32 avatar Aug 26 '13 18:08 joshuawarner32

@joshuawarner32, let me know if I can help. Thanks.

nyjle avatar Aug 26 '13 20:08 nyjle

@nyjle, I found the Files test issue (see #76). Give that patch a try.

joshuawarner32 avatar Aug 27 '13 03:08 joshuawarner32

@joshuawarner32 That works for me. Thanks a lot!

nyjle avatar Aug 27 '13 04:08 nyjle

I've tried putting a breakpoint on Java_java_net_Inet6Address_init - it's never called.

csoren avatar Sep 17 '13 06:09 csoren

This could have been fixed with https://github.com/ReadyTalk/avian/pull/223. @csoren / @nyjle, if you get a chance, could you see if you can reproduce this?

joshuawarner32 avatar Apr 15 '14 21:04 joshuawarner32