
Random end-to-end test failures

Open ericallam opened this issue 13 years ago • 3 comments

On multiple runs of the end-to-end tests, I am getting random errors. Here is a sample of some of the test runs:

$ make end-to-end-test
`npm bin`/jasmine-node spec/end-to-end/
......
timers.js:96
            if (!process.listeners('uncaughtException').length) throw e;
                                                                      ^
TypeError: Socket is closed
    at Socket._ioevents (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:146:22)
    at Socket._flush (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:277:23)
    at Socket.send (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:255:42)
    at [object Object].<anonymous> (/Users/eric/CodePath/engine.js/lib/engine/cylinder.js:75:37)
    at [object Object].emit (events.js:64:17)
    at Object._onTimeout (/Users/eric/CodePath/engine.js/lib/engine/cylinder/execution_watcher.js:19:18)
    at Timer.ontimeout (timers.js:94:19)
make: *** [end-to-end-test] Error 1
$ make end-to-end-test
`npm bin`/jasmine-node spec/end-to-end/
.........................

Finished in 6.452 seconds
25 tests, 33 assertions, 0 failures

$ make end-to-end-test
`npm bin`/jasmine-node spec/end-to-end/
.........
/Users/eric/CodePath/engine.js/lib/engine/intake.js:27
        if (err) throw err;
                       ^
Error: Address already in use
make: *** [end-to-end-test] Error 1
$ make end-to-end-test
`npm bin`/jasmine-node spec/end-to-end/
.........F...............

Failures:

  1) outputs console messages
   Message:
     timeout: timed out after 5000 msec waiting for something to happen
   Stacktrace:
     undefined

Finished in 11.866 seconds
25 tests, 33 assertions, 1 failure

^Cmake: *** [end-to-end-test] Error 1

Have you seen any of these during test runs? Should I isolate each one and create an issue for each?

Here are my system details:

  • Mac OS X 10.7.3
  • node 0.6.14
  • zeromq 2.1.11

ericallam, Apr 26 '12 23:04

Thanks for reporting. I am looking into this now.

I suspect a timing issue somewhere in the library and/or in the 0mq binding.

rehanift, Apr 27 '12 04:04

I made 2 changes and pushed to a new branch: 22846b10ce2d985d4701e085228073405641e0ef

First, I made socket binding synchronous. Next, I put a sleep between tests. I don't think this is the ultimate solution, but it should help expose race conditions.

I'll add more detailed logging soon to try to nail it down; it shouldn't be too hard.

rehanift, Apr 27 '12 06:04

I did some more research into this error:

timers.js:96
            if (!process.listeners('uncaughtException').length) throw e;
                                                                      ^
TypeError: Socket is closed
    at Socket._ioevents (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:146:22)
    at Socket._flush (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:277:23)
    at Socket.send (/Users/eric/CodePath/engine.js/node_modules/zmq/lib/index.js:255:42)
    at [object Object].<anonymous> (/Users/eric/CodePath/engine.js/lib/engine/cylinder.js:75:37)
    at [object Object].emit (events.js:64:17)
    at Object._onTimeout (/Users/eric/CodePath/engine.js/lib/engine/cylinder/execution_watcher.js:19:18)
    at Timer.ontimeout (timers.js:94:19)
make: *** [end-to-end-test] Error 1

When a Task is run through a Cylinder, an ExecutionWatcher object starts (which calls setTimeout under the hood). When the Task completes, the Cylinder stops the ExecutionWatcher (which calls clearTimeout under the hood).
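
To make the start/stop lifecycle concrete, here is a minimal sketch of a watcher built on setTimeout and clearTimeout. It is not the actual ExecutionWatcher code: the class body, the `kill` event name, the `timeoutMs` parameter, and the `isRunning` helper are all assumptions made for illustration.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for ExecutionWatcher; names are illustrative only.
class ExecutionWatcher extends EventEmitter {
  constructor(timeoutMs) {
    super();
    this.timeoutMs = timeoutMs;
    this.timer = null;
  }

  // Arm the timer when a Task starts running.
  start() {
    this.stop(); // make sure an older timer is never left armed
    this.timer = setTimeout(() => {
      this.timer = null;
      this.emit('kill'); // assumed event name; the listener would kill the Piston
    }, this.timeoutMs);
  }

  // Disarm the timer when the Task completes.
  stop() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }

  // True while a timer is armed.
  isRunning() {
    return this.timer !== null;
  }
}
```

If `stop()` is skipped, or runs against a different timer handle than the one that was armed, the `kill` event still fires later, which would match the stack trace above.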

It seems that clearTimeout sometimes does not take effect, so the Cylinder attempts to kill the Piston after the automated test has already finished (and passed) and all the components (Cylinder, Piston, Exhaust, etc.) have closed and reset for the next automated test.

I did some research into known clearTimeout issues with Node, and it looks like there have been some issues recently. I'll try to devise a test that reproduces this issue and submit it to the Node team. In the meantime, I'm thinking about doing two things on the Engine.js side:

  • Write some more end-to-end tests to verify correct ExecutionWatcher behavior
  • Trap this condition to prevent the error from blowing everything up.
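
As a rough idea of what trapping the condition could look like, here is a sketch of a guarded send. `makeSafeSend`, the `isClosed` callback, and `fakeSocket` are hypothetical names for illustration, not part of Engine.js or the zmq binding; the only detail taken from the trace is the `TypeError: Socket is closed` failure.

```javascript
// Hypothetical guard: `socket` is any object with a send() method, and
// `isClosed` is a caller-supplied check (not part of the zmq API).
function makeSafeSend(socket, isClosed) {
  return function safeSend(message) {
    if (isClosed()) return false; // component already torn down, skip the send
    try {
      socket.send(message);
      return true;
    } catch (err) {
      // Swallow only the late-send failure from the trace; rethrow anything else.
      if (err instanceof TypeError && /Socket is closed/.test(err.message)) {
        return false;
      }
      throw err;
    }
  };
}

// Stand-in socket that mimics the failure seen in the stack trace.
const fakeSocket = {
  closed: false,
  sent: [],
  send(msg) {
    if (this.closed) throw new TypeError('Socket is closed');
    this.sent.push(msg);
  },
};
```

A guard like this only hides the symptom. The stray timer still fires, so the extra end-to-end tests for ExecutionWatcher would still be needed to find the underlying cause.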

rehanift, Apr 28 '12 19:04