
AkkaContainerClient breaks the ConcurrentTests of nodejs runtime

Open style95 opened this issue 2 years ago • 2 comments

The concurrent tests of the Node.js runtime keep failing.

I debugged this in the CI environment and found the following.

The Node.js action used by the test prints a log line every time it is invoked:

global.count = 0;
let requestCount = $requestCount;
let interval = 1000;
function main(args) {
    global.count++;
    console.log("interleave me");    //////////// this log
    return new Promise(function(resolve, reject) {
        setTimeout(function() {
            checkRequests(args, resolve, reject);
        }, interval);
    });
}
function checkRequests(args, resolve, reject, elapsed) {
    let elapsedTime = elapsed||0;
    if (global.count == requestCount) {
        resolve({ args: args});
    } else {
        if (elapsedTime > 30000) {
            reject("did not receive "+requestCount+" activations within 30s");
        } else {
            setTimeout(function() {
                checkRequests(args, resolve, reject, elapsedTime+interval);
            }, interval);
        }
    }
}
...
interleave me
...

So we can tell how many requests reached the container by counting the occurrences of that log line. The test sends 128 concurrent requests, so 128 lines should be printed, but I found only around 60 to 65.
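The count can be reproduced with a few lines of Node.js. This is only a minimal sketch: the helper name `countMarkerLines` and the sample log are made up for illustration and are not part of the test suite.

```javascript
// Hypothetical helper (not from the OpenWhisk code base): count how many
// lines of a captured container log contain the given marker string.
function countMarkerLines(logText, marker) {
    return logText
        .split(/\r?\n/)
        .filter(function (line) { return line.includes(marker); })
        .length;
}

// Made-up sample; a real log would be captured from the action container.
const sampleLog = [
    "interleave me",
    "some other output",
    "interleave me"
].join("\n");

console.log(countMarkerLines(sampleLog, "interleave me")); // prints 2
```

If every request reached the container, the result for the real log would equal the number of requests the test sent.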

I then added a few log statements to AkkaContainerClient and got the following output.

runtime.actionContainers.NodeJs20ConcurrentTests > action-nodejs-v20 should allow running activations concurrently STANDARD_OUT
    running 128 requests
    [2023-10-28T01:34:45.439423] call to /init
    [2023-10-28T01:34:45.588940] call to /run
    [2023-10-28T01:34:45.589216] call to /run
    [2023-10-28T01:34:45.589413] call to /run
    [2023-10-28T01:34:45.589566] call to /run
    [2023-10-28T01:34:45.589669] call to /run
    [2023-10-28T01:34:45.589Z] [WARN] [#tid_sid_unknown] [AkkaContainerClient] The queue has already been completed.
    [2023-10-28T01:34:45.589745] call to /run
.
.
.
    [2023-10-28T01:34:45.644328] call to /run
    [2023-10-28T01:34:45.644504] call to /run
    [2023-10-28T01:35:16.039219] timeout to /run
    [2023-10-28T01:35:16.039403] timeout to /run

It seems the underlying request queue of AkkaContainerClient is completed after only a few requests have been sent.
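To check this against a full log, the debug lines can be tallied per endpoint. The sketch below assumes the line format shown in the excerpt above (`call to <path>`, `timeout to <path>`, and the `The queue has already been completed.` warning). The helper name `summarizeClientLog` is hypothetical, and the sample contains only lines copied from that excerpt.

```javascript
// Hypothetical helper: tally "call to"/"timeout to" lines per endpoint and
// record how many /run calls were logged before the first queue warning.
function summarizeClientLog(logText) {
    const summary = { calls: {}, timeouts: {}, runCallsBeforeWarning: null };
    const eventPattern = /\] (call|timeout) to (\/\w+)/;
    for (const line of logText.split(/\r?\n/)) {
        if (line.includes("The queue has already been completed") &&
            summary.runCallsBeforeWarning === null) {
            // First warning seen: remember the /run call count so far.
            summary.runCallsBeforeWarning = summary.calls["/run"] || 0;
            continue;
        }
        const match = eventPattern.exec(line);
        if (match) {
            const bucket = match[1] === "call" ? summary.calls : summary.timeouts;
            bucket[match[2]] = (bucket[match[2]] || 0) + 1;
        }
    }
    return summary;
}

// Lines copied from the excerpt in this issue (the elided middle is omitted).
const excerpt = [
    "[2023-10-28T01:34:45.439423] call to /init",
    "[2023-10-28T01:34:45.588940] call to /run",
    "[2023-10-28T01:34:45.589216] call to /run",
    "[2023-10-28T01:34:45.589413] call to /run",
    "[2023-10-28T01:34:45.589566] call to /run",
    "[2023-10-28T01:34:45.589669] call to /run",
    "[2023-10-28T01:34:45.589Z] [WARN] [#tid_sid_unknown] [AkkaContainerClient] The queue has already been completed.",
    "[2023-10-28T01:34:45.589745] call to /run",
    "[2023-10-28T01:35:16.039219] timeout to /run",
    "[2023-10-28T01:35:16.039403] timeout to /run"
].join("\n");

console.log(JSON.stringify(summarizeClientLog(excerpt)));
```

On this excerpt the warning appears after five /run calls. Comparing the total number of /run calls with the number of "interleave me" lines in the container log would show how many requests were logged by the client but never reached the container.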

Environment details:

Steps to reproduce the issue:

  1. Run the NodeJS Runtime CI workflow.

style95, Oct 28 '23 01:10

@joni-jones I suspect this is caused by this line. Do you have any idea what is going on?

style95, Oct 28 '23 01:10

onCompletion should be called only after the sink has processed all the messages, but from what I'm seeing in this issue that might not be the case. I will look more closely at those tests to understand whether the changes introduced in #5442 broke something.

YevSent, Oct 28 '23 04:10