
Fix unit tests from live stream #1

Open codereport opened this issue 4 years ago • 16 comments

image

codereport avatar Jan 16 '21 07:01 codereport

Looks like there are four main categories:

  • E.
  • Large matrices
  • Random # ~ y
  • Segfaults

codereport avatar Jan 24 '21 00:01 codereport

This is the updated list: image

Also, recently the following were commented out:

  • gmmf
  • g430
  • g420

codereport avatar Jan 26 '21 05:01 codereport

gstack fails on my computer in 200ms. Causes a SegFault. This is running the test in debug mode.

383: Test command: /mnt/d/dev/jsource/cmake-build-debug/jsrc/jconsole "gstack.ijs"
383: Test timeout computed to be: 1500
383: Hello YouTube Viewers, all 22 of you!
383: see: tsu_notes, tsu_usage, tsu_pacman, and tsu_jd
383: 
383:    RUN  ddall  NB. report scripts that fail
383:    RECHO ddall NB. echo script names as run and final count of failures
383: 
383: j902/j64/linux/beta/GPL3/unknown/2021-01-26T20:43:34/gcc-10-2
Exception: SegFault
Program received signal SIGSEGV, Segmentation fault.
0x00007ffffec9ec10 in jtgaf (jt=0x0, blockx=0) at /mnt/d/dev/jsource/jsrc/m.c:905
905	RESTRICTF A jtgaf(J jt,I blockx){A z;I mfreeb;I n=(I)2<<blockx;  // n=size of allocated block

Sebanisu avatar Jan 27 '21 01:01 Sebanisu

@Sebanisu Are you on osx or linux? Does ctest work?

codereport avatar Jan 27 '21 02:01 codereport

I'm on Windows 10, using WSL with Ubuntu 20.04. I'm running the tests from the CLion IDE. All other enabled tests are passing. I can run the tests from the console if you want; I just don't know the command line for ctest. Also, I have gcc 10.2 and cmake 19.

Sebanisu avatar Jan 27 '21 09:01 Sebanisu

> I'm on Windows 10, using WSL with Ubuntu 20.04. I'm running the tests from the CLion IDE. All other enabled tests are passing. I can run the tests from the console if you want; I just don't know the command line for ctest. Also, I have gcc 10.2 and cmake 19.

ninja -C build test is the command (with -j$(nproc) if you want)

codereport avatar Jan 27 '21 11:01 codereport

Do flaky tests also fall under this issue?

The g001 failure seems to be a timing issue (see https://github.com/codereport/jsource/pull/39#issuecomment-767680956), which gets worse with more load on the system (at least on my machine).

juntuu avatar Jan 27 '21 15:01 juntuu

> Do flaky tests also fall under this issue?
>
> The g001 failure seems to be a timing issue (see #39 (comment)), which gets worse with more load on the system (at least on my machine).

Yeah, g001 is probably the flakiest for me, and g430/1a is also pretty flaky. I ended up commenting out all of the gmmf* tests; maybe we should just do the same for these.

codereport avatar Jan 27 '21 15:01 codereport

I identified a problem with the g420 test:

The failing testcases were testing fold, starting from line 730

NB. Fold  F. F.. F.: F: F:. F:: ------------------------------------------------------------------
NB. monad
10 -: ] F.. + i. 5

The failure comes from foldr.ijs not being found, so all the folds fail. Trying the first line in jconsole gave a "not found" error:

Hello YouTube Viewers, all 22 of you!
   10 -: ] F.. + i. 5
not found: /.../jsource/jlibrary/addons/dev/fold/foldr.ijs
|nonce error
|   10-:]    F..+i.5

So it seems to look for fold in jlibrary/addons/dev/fold/foldr.ijs, while the code is actually in jlibrary/dev/fold/foldr.ijs.

The loading is done in adverbs/ar.c:656

  switch(step){  // try the startup, from the bottom up
  case 1: eval("load'~addons/dev/fold/foldr.ijs'");  // fall through
  case 0: if((foldconj=nameref(nfs(8,"Foldr_j_"),jt->locsyms))&&AT(foldconj)&CONJ)goto found;  // there is always a ref, but it may be to [:
  }

I got the tests to pass by moving the jlibrary/dev folder to jlibrary/addons/dev, but I'm not sure if this is the right solution, or if the root cause is some other misconfiguration.

Maybe someone who knows more about the j system, and how it locates and loads files, could chime in.
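To double-check which layout a given checkout has, something like the following could be used. It is only a sketch: the two relative paths come from the comment above, `locate_foldr` is a hypothetical helper name, and it says nothing about how J itself resolves `~addons`.

```python
from pathlib import Path

# Both relative paths are copied from the comment above.
EXPECTED = Path("jlibrary/addons/dev/fold/foldr.ijs")  # where the load appears to look
ACTUAL = Path("jlibrary/dev/fold/foldr.ijs")           # where the file lives in the repo


def locate_foldr(repo_root):
    """Report which candidate locations of foldr.ijs exist under repo_root.

    Returns a dict mapping each relative path (POSIX style) to True/False.
    """
    root = Path(repo_root)
    return {rel.as_posix(): (root / rel).is_file() for rel in (EXPECTED, ACTUAL)}


if __name__ == "__main__":
    # Assumes the script is run from the root of a jsource checkout.
    for rel, found in locate_foldr(".").items():
        print(("found    " if found else "missing  ") + rel)
```

If only the second path is reported as found, that matches the situation described here, where the fold tests fail until the folder is moved (or otherwise made visible) under jlibrary/addons.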

juntuu avatar Jan 27 '21 22:01 juntuu

> ninja -C build test is the command (with -j$(nproc) if you want)

Oh it's in the CONTRIBUTING.md. 🙃 I never used ninja till now. It SegFault'ed via the console with ninja as well.

383/397 Test #383: gstack ...........................***Exception: SegFault  0.16 sec

Sebanisu avatar Jan 28 '21 01:01 Sebanisu

> gstack fails on my computer in 200ms. Causes a SegFault. This is running the test in debug mode.
>
> I'm on Windows 10, using WSL with Ubuntu 20.04.

At the top of the test file, there is a comment about stack size issues

NB. stack issues (function call limit) ----------------------------------

0 0 $  0 : 0
The recursion limit is constrained by the stack size available to 
the J executable file. Crashes due to stack errors can be overcome 
by increasing the stack size. Under Windows, the stack size can be 
queried and set as follows:
...

AFAIK Windows gives programs less stack by default than Linux/macOS, so it could be a stack overflow. I'm not sure how running in WSL affects the stack size, though.
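One way to see what stack limit a process actually gets inside WSL is to ask the OS directly. A minimal sketch, assuming a Unix-like environment (the `resource` module is not available in native Windows Python); the helper names are made up for illustration:

```python
import resource


def stack_limit_bytes():
    """Return the (soft, hard) stack size limits for this process, in bytes.

    A value equal to resource.RLIM_INFINITY means "unlimited".
    """
    return resource.getrlimit(resource.RLIMIT_STACK)


def describe(limit):
    """Format a limit value as KiB, or 'unlimited'."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


if __name__ == "__main__":
    soft, hard = stack_limit_bytes()
    print("soft:", describe(soft), "hard:", describe(hard))
```

Child processes inherit these limits, so this should roughly reflect what jconsole sees when launched from the same shell. It is the same number `ulimit -s` reports, just in bytes rather than KiB.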

juntuu avatar Jan 28 '21 10:01 juntuu

Found a promising-looking commit https://github.com/jsoftware/jsource/commit/6adb504eb1d4800f1d6d327c1d67c13060589ddf for the g001 "issue".

Then looking for "THRESHOLD" in tsu.ijs I found this

THRESHOLD=: 0 NB. allow timing tests to trigger failure 
THRESHOLD=: 1 NB. force timing tests to pass

and this

tsu_notes=: 0 : 0
many scripts have timing tests
typically comparing timing/result of j vs j model
these tests can be essential for new/changed code
but vary greatly across environments
and can result in false failures

THRESHOLD set to 1 by default - ignores timing test failures
threshold set to 0.2 for loose tests (0.75 for tighter tests)

THRESHOLD should be applied as false failures are discovered

gfft/glapack not in ddall - run separately with: RUN1'gfft'
g18x fails on subsequent runs - no idea why
g401 occasionally fails (random data?) but then runs clean
)

juntuu avatar Jan 28 '21 10:01 juntuu

@juntuu I merged this: https://github.com/codereport/jsource/pull/64, hopefully helps

codereport avatar Jan 28 '21 16:01 codereport

> @juntuu I merged this: #64, hopefully helps

I didn't check where the lowercase threshold was used, but the uppercase THRESHOLD seemed to be by default 1 (i.e. force timing tests to succeed).

If you check the commit I linked from the jsoftware repo, it just added THRESHOLD +. in front of these test cases. I guess THRESHOLD can then be switched to 0 if the timing needs to be tested at some point.

Maybe we could cherry-pick the commit and add it here. I'll open a PR for that.
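For readers who don't read J: `+.` is logical OR on booleans, so `THRESHOLD +. <timing test>` passes whenever THRESHOLD is 1, whatever the timing result. A tiny Python model of that behaviour (the function name is invented for illustration; this is not code from either repo):

```python
def timing_check(threshold, timing_ok):
    """Model of `THRESHOLD +. <timing test>` from the test scripts.

    threshold: 1 forces timing tests to pass, 0 lets them fail.
    timing_ok: result of the actual timing comparison (truthy/falsy).
    """
    return bool(threshold) or bool(timing_ok)


if __name__ == "__main__":
    for threshold in (1, 0):
        for timing_ok in (True, False):
            print(threshold, timing_ok, "->", timing_check(threshold, timing_ok))
```

So with the default THRESHOLD of 1 a slow machine can no longer cause a false failure, and setting it to 0 restores the original strict timing check.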

juntuu avatar Jan 28 '21 16:01 juntuu

This seems correct. I just realized I could run the test with valgrind. And it gave some more details on the stack overflow.

383: ==2072== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
383: ==2072== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
383: ==2072==  If you believe this happened as a result of a stack
383: ==2072==  overflow in your program's main thread (unlikely but
383: ==2072==  possible), you can try to increase the size of the
383: ==2072==  main thread stack using the --main-stacksize= flag.
383: ==2072==  The main thread stack size used in this run was 16777216.

image

Maybe running valgrind on other tests with segfaults could help diagnose issues with them.

Sebanisu avatar Jan 29 '21 03:01 Sebanisu

I uncommented all the tests and only have the following failing.

image

codereport avatar Jan 29 '21 03:01 codereport