snabb icon indicating copy to clipboard operation
snabb copied to clipboard

Flakiness in the tap app selftest

Open takikawa opened this issue 7 years ago • 0 comments

As @wingo referenced in #1316, the tap app selftest is sometimes failing. The test seems to be flaky on both the next and raptorjit branches too. On next, a failing test run has a -jv dump like this one (note fallback to interpreter in tap.lua):

[TRACE   4 snsh.lua:31 stitch C:44fe80]
[TRACE   5 memory.lua:26 return]
[TRACE   6 packet.lua:197 loop]
[TRACE   7 helpers.lua:64 return]
[TRACE   8 (7/0) helpers.lua:65 -> 6]
[TRACE --- c.lua:509 -- leaving loop in root trace at syscalls.lua:186]
[TRACE   9 (8/1) methods.lua:150 stitch string.match]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:231]
[TRACE --- (9/stitch) shm.lua:71 -- NYI: bytecode 51 at shm.lua:93]
[TRACE  10 header.lua:241 return]
[TRACE --- (9/1) methods.lua:150 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) methods.lua:150 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) methods.lua:150 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) methods.lua:150 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:543]
[TRACE  11 (9/1) methods.lua:150 -- fallback to interpreter]
[TRACE --- types.lua:101 -- leaving loop in root trace at syscalls.lua:60]
[TRACE  12 syscalls.lua:53 return]
[TRACE --- shm.lua:105 -- leaving loop in root trace at shm.lua:46]
[TRACE  13 shm.lua:90 return]
[TRACE  14 builtin:remove return]
[TRACE --- (9/5) syscalls.lua:67 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/5) syscalls.lua:67 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/5) syscalls.lua:67 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/5) syscalls.lua:67 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE  15 (9/5) syscalls.lua:67 -- fallback to interpreter]
[TRACE  16 syscalls.lua:59 return]
[TRACE  17 shm.lua:50 stitch string.match]
[TRACE --- (17/stitch) shm.lua:71 -- NYI: bytecode 51 at shm.lua:93]
[TRACE --- c.lua:513 -- leaving loop in root trace at syscalls.lua:80]
[TRACE --- c.lua:230 -- leaving loop in root trace at syscalls.lua:148]
[TRACE --- c.lua:509 -- leaving loop in root trace at syscalls.lua:186]
[TRACE --- c.lua:166 -- leaving loop in root trace at syscalls.lua:182]
[TRACE  18 syscalls.lua:146 return]
[TRACE --- types.lua:101 -- leaving loop in root trace at syscalls.lua:149]
[TRACE  19 app.lua:327 return]
[TRACE  20 histogram.lua:92 loop]
[TRACE --- synth.lua:40 -- leaving loop in root trace at synth.lua:43]
[TRACE  21 packet.lua:84 return]
[TRACE --- synth.lua:40 -- leaving loop in root trace at synth.lua:43]
[TRACE  22 packet.lua:106 return]
[TRACE  23 synth.lua:39 loop]
[TRACE  24 ethernet.lua:86 return]
[TRACE  25 syscalls.lua:107 return]
[TRACE  26 counter.lua:91 return]
[TRACE --- (26/0) counter.lua:91 -- loop unroll limit reached at tap.lua:201]
[TRACE --- (26/0) counter.lua:91 -- loop unroll limit reached at tap.lua:201]
[TRACE --- (26/0) counter.lua:91 -- loop unroll limit reached at tap.lua:201]
[TRACE --- (26/0) counter.lua:91 -- loop unroll limit reached at tap.lua:201]
[TRACE  27 (26/0) counter.lua:91 -- fallback to interpreter]
[TRACE  28 tap.lua:198 loop]
[TRACE --- packet.lua:35 -- leaving loop in root trace at packet.lua:171]
[TRACE  29 link.lua:48 return]
[TRACE  30 packet.lua:169 return]
[TRACE --- lib.lua:201 -- leaving loop in root trace at match.lua:53]
[TRACE  31 match.lua:51 loop]
[TRACE  32 match.lua:44 stitch print]
[TRACE  33 (32/stitch) match.lua:45 -> 32]
[TRACE  34 StackTracePlus.lua:142 stitch io.method.read]
[TRACE  35 (34/stitch) StackTracePlus.lua:143 -> 34]
[TRACE  36 StackTracePlus.lua:95 return]
[TRACE --- StackTracePlus.lua:233 -- NYI: bytecode 72 at StackTracePlus.lua:251]
[TRACE  37 (36/0) StackTracePlus.lua:96 return]

and a successful dump looks like this:

[TRACE   4 lib.lua:187 stitch string.gmatch]
[TRACE   5 packet.lua:197 loop]
[TRACE   6 helpers.lua:64 return]
[TRACE   7 (6/0) helpers.lua:65 stitch string.match]
[TRACE   8 syscalls.lua:59 return]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:168]
[TRACE --- (7/stitch) shm.lua:71 -- NYI: bytecode 51 at shm.lua:93]
[TRACE   9 (8/1) syscalls.lua:61 stitch string.match]
[TRACE  10 packet.lua:56 return]
[TRACE --- (4/stitch) lib.lua:188 -- NYI: bytecode 51 at lib.lua:192]
[TRACE --- ethernet.lua:48 -- leaving loop in root trace at ethernet.lua:56]
[TRACE --- (7/1) helpers.lua:65 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:179]
[TRACE --- (7/1) helpers.lua:65 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (7/1) helpers.lua:65 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) syscalls.lua:61 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) syscalls.lua:61 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (9/1) syscalls.lua:61 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- c.lua:543 -- leaving loop in root trace at syscalls.lua:216]
[TRACE --- (9/1) syscalls.lua:61 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE  11 helpers.lua:57 return]
[TRACE  12 (9/1) syscalls.lua:61 -- fallback to interpreter]
[TRACE --- (7/1) helpers.lua:65 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE  13 shm.lua:90 return]
[TRACE  14 syscalls.lua:53 return]
[TRACE  15 (7/1) helpers.lua:65 -- fallback to interpreter]
[TRACE --- (7/7) methods.lua:147 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (7/7) methods.lua:147 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (7/7) methods.lua:147 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE --- (7/7) methods.lua:147 -- NYI: unsupported variant of FastFunc ffi.typeof at shm.lua:47]
[TRACE  16 (7/7) methods.lua:147 -- fallback to interpreter]
[TRACE --- c.lua:509 -- leaving loop in root trace at syscalls.lua:186]
[TRACE  17 methods.lua:142 return]
[TRACE  18 ethernet.lua:48 stitch string.match]
[TRACE  19 (18/stitch) ethernet.lua:49 return]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:509]
[TRACE --- c.lua:230 -- leaving loop in root trace at syscalls.lua:148]
[TRACE --- app.lua:316 -- inner loop in root trace at link.lua:32]
[TRACE  20 shm.lua:70 stitch string.match]
[TRACE --- (20/stitch) shm.lua:71 -- NYI: bytecode 51 at shm.lua:93]
[TRACE  21 syscalls.lua:486 return]
[TRACE  22 counter.lua:49 -> 20]
[TRACE --- c.lua:184 -- leaving loop in root trace at shm.lua:73]
[TRACE  23 syscalls.lua:146 return]
[TRACE  24 histogram.lua:92 loop]
[TRACE  25 packet.lua:84 return]
[TRACE  26 packet.lua:106 return]
[TRACE  27 counter.lua:91 return]
[TRACE --- (27/0) counter.lua:91 -- loop unroll limit reached at synth.lua:41]
[TRACE --- (27/0) counter.lua:91 -- loop unroll limit reached at synth.lua:41]
[TRACE --- synth.lua:40 -- leaving loop in root trace at synth.lua:43]
[TRACE --- (27/0) counter.lua:91 -- loop unroll limit reached at synth.lua:41]
[TRACE --- (27/0) counter.lua:91 -- loop unroll limit reached at synth.lua:41]
[TRACE  28 (27/0) counter.lua:91 -- fallback to interpreter]
[TRACE  29 link.lua:81 return]
[TRACE --- synth.lua:40 -- leaving loop in root trace at synth.lua:43]
[TRACE  30 synth.lua:39 loop]
[TRACE  31 builtin:remove return]
[TRACE  32 link.lua:62 return]
[TRACE  33 packet.lua:169 return]
[TRACE  34 ethernet.lua:86 return]
[TRACE --- c.lua:27 -- leaving loop in root trace at c.lua:515]
[TRACE  35 ethernet.lua:80 return]
[TRACE  36 tap.lua:198 loop]
[TRACE  37 histogram.lua:47 return]
[TRACE --- c.lua:514 -- leaving loop in root trace at syscalls.lua:96]
[TRACE  38 tap.lua:172 loop]
[TRACE  39 link.lua:76 return]
[TRACE  40 match.lua:26 loop]
[TRACE  41 (30/19) synth.lua:37 -> 30]
[TRACE  42 lib.lua:249 return]
[TRACE --- app.lua:520 -- leaving loop in root trace at app.lua:529]
[TRACE  43 (36/31) link.lua:77 -> 40]
[TRACE  44 (38/25) tap.lua:194 -> 36]
[TRACE  45 (40/37) link.lua:77 -> 30]
[TRACE  46 (41/9) app.lua:521 -> 38]
[TRACE  47 (46/9) lib.lua:254 -> 38]
[TRACE  48 counter.lua:85 loop]

I took a look at a failing test run auditlog using Studio too, and saw that the loop unroll limit trace abort in tap.lua happens in there too. So maybe this is an interesting case of some non-deterministic performance for Raptorjit to fix.

In the meantime though, it's possible to make the test less likely to fail by increasing the test duration (0.01 to 0.05 timeout works fine) so that even if it falls back to the interpreter, it can keep up.

takikawa avatar Apr 07 '18 21:04 takikawa