nodeprof.js icon indicating copy to clipboard operation
nodeprof.js copied to clipboard

Callbacks for async/await

Open mwaldrich opened this issue 5 years ago • 19 comments

Would it be possible to implement new callbacks to instrument async and await?

For async, perhaps an argument could be added to the function call and function literal callbacks to indicate if it is an async function.

For await, I would suggest:

  • adding an isAwaited argument to the function call callbacks, indicating that the caller is awaiting this function's result.
  • a new await callback. This would allow the analysis to instrument when the await is scheduled.

These three changes together will allow analysis to distinguish between:

  • a regular function being invoked sequentially;
  • an async function being invoked sequentially (the value is not being awaited);
  • an async function being awaited.

Thanks!

mwaldrich avatar Jul 23 '19 15:07 mwaldrich

@mwaldrich Hi, thanks for the suggestion here.

For await, I don't think it is a good idea to bind it with a function call. await can be used on any value or promise, and await on an asynchronous function is actually await the internal promise returned by the async function (an async function always return a promise object, even though the function itself may return some non-promise value). So what I propose is to add two callback pairs like below:

For await awaitPre (iid, val) => where val could be any non-promise value or a promise object. awaitPost(iid, ret) => where ret will be the actually value resolved by the awaited promise or value.

For async, we can provide a special asyncFunctionExit event, which provide an extra argument of the promise object returned by the asynchronous function call: asyncFunctionExit(iid, ret, promise, exceptionWrapper)

With these two callbacks, you should be able to know if an async function has been awaited by tracking if any await is waiting for the same promiseObject returned by the async function.

I had some experimental implementation about the callbacks as above. However, it was based on my own hack on an old version of Graal.js. We will look into how to properly integrate it in Graal.js in future.

Feel free to discuss here what you think about these potential new callbacks.

Haiyang-Sun avatar Jul 24 '19 13:07 Haiyang-Sun

Sounds great! This should be enough to perform general instrumentation on async/await in JS.

How much work will it be to clean up the hacky implementation and get it merged into Graal?

mwaldrich avatar Jul 30 '19 20:07 mwaldrich

Any update on these callbacks?

mwaldrich avatar Sep 12 '19 20:09 mwaldrich

Async support has been merged upstream thanks to @eleinadani, await is still in progress.

alexjordan avatar Sep 13 '19 01:09 alexjordan

Thanks for the implementation! I've started playing around with these callbacks, but they don't seem to be called in the way I would expect.

I actually have two concerns:

  1. if I understand the purpose of the currently-implemented callbacks correctly, they are not being called at the right times
  2. I'm not sure the currently-implemented callbacks actually give us enough information.

Example (timing how long each await is)

Imagine you want to write a simple analysis to time how long each await is. You want to record a timestamp when a value is awaited, and another when the await is resolved.

Here's a simple analysis and sample program.

asyncAnalysis.js (analysis)

//DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {

        // this.promises = new Map();

        this.functionEnter = function (iid, f, base, args) {
            console.log("functionEnter@ " + J$.iidToLocation(iid), f.name, args.length);
        };
        this.functionExit = function(iid, returnVal, wrappedExceptionVal) {
            console.log("functionExit@ " + J$.iidToLocation(iid));
        };
        this.asyncFunctionEnter = function(iid) {
            console.log('async enter', J$.iidToLocation(iid));
        }
        this.asyncFunctionExit = function(iid, returnVal, wrappedException) {
            console.log('async exit', J$.iidToLocation(iid), returnVal, wrappedException);
        }

        this.awaitPre = function(iid, valAwaited) {
            console.log('awaitPre', J$.iidToLocation(iid), valAwaited);
        }
        this.awaitPost = function(iid, result, exceptionWrapper) {
            console.log('awaitPost', J$.iidToLocation(iid), result, exceptionWrapper);
        }

        this.binary = function(iid, op, left, right) {
            console.log('binary', op, left, right);
        }
        this.write = function(iid, name, val) {
            console.log('write', J$.iidToLocation(iid), name, val);
        }
        this._return = function(iid, val) {
            console.log('return', J$.iidToLocation(iid), val);
        }
        this.declare = function(iid, name, type) {
            console.log('declare', J$.iidToLocation(iid), name, type);
        };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

async.js (sample program)


async function countTo6(x) {
    if (x > 5) {
	      return x;
    } else {
        return await countTo6(x + 1);
    }
}

async function main() {
	  return await countTo6(1);
}

main().then(v => console.log("done! answer = " + v));

Analysis output

functionEnter@ (async.js:1:1:16:4)  0
functionExit@ (async.js:1:1:16:4)
declare (async.js:1:2:16:2) exports var
declare (async.js:1:2:16:2) require var
declare (async.js:1:2:16:2) module var
declare (async.js:1:2:16:2) __filename var
declare (async.js:1:2:16:2) __dirname var
declare (async.js:1:2:16:2) countTo6 var
declare (async.js:1:2:16:2) main var
functionEnter@ (async.js:1:2:16:2)  5
write (async.js:2:1:8:2) countTo6 async function countTo6(x) {
    if (x > 5) {
	      return x;
    } else {
        return await countTo6(x + 1);
    }
}
write (async.js:10:1:12:2) main async function main() {
	  return await countTo6(1);
}
async enter (async.js:11:4:11:29)
functionEnter@ (async.js:10:1:12:2) main 0
declare (async.js:2:1:8:2) x var
binary > 1 5
binary + 1 1
declare (async.js:2:1:8:2) x var
binary > 2 5
binary + 2 1
declare (async.js:2:1:8:2) x var
binary > 3 5
binary + 3 1
declare (async.js:2:1:8:2) x var
binary > 4 5
binary + 4 1
declare (async.js:2:1:8:2) x var
binary > 5 5
binary + 5 1
declare (async.js:2:1:8:2) x var
binary > 6 5
return (async.js:4:8:4:17) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:11:11:11:28) Promise { <pending> }
functionExit@ (async.js:10:1:12:2)
async exit (async.js:11:4:11:29) Promise { <pending> } {}
functionExit@ (async.js:1:2:16:2)
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
functionEnter@ (async.js:10:1:12:2) main 0
awaitPre (async.js:11:11:11:28) Promise { 6 }
awaitPost (async.js:11:11:11:28) 6 {}
return (async.js:11:4:11:29) 6
functionExit@ (async.js:10:1:12:2)
declare (async.js:14:13:14:52) v var
functionEnter@ (async.js:14:13:14:52)  1
binary + done! answer =  6
done! answer = 6
return (async.js:14:18:14:52) undefined
functionExit@ (async.js:14:13:14:52)

Issues

There are a few weird things happening in the analysis:

  1. functionEnter events are not being generated for recursive async calls
  2. asyncFunctionExit events are not being generated for recursive async calls
  3. functionExit events are not being generated after returning a real values from recursive async calls

In addition, I don't think it is possible to model the call stack from the analysis' perspective without also having an asyncFunctionEnter callback, which would be used when an async function is "resumed".

mwaldrich avatar Oct 18 '19 15:10 mwaldrich

I will look into the issue, thanks for the feedback!

Haiyang-Sun avatar Oct 18 '19 20:10 Haiyang-Sun

Also, a few more things I noticed:

  • there are too many awaitPre events being generated. ideally, there would be a one-to-one correspondence between awaitPre and awaitPost events, but there are exactly twice as many awaitPres compared to awaitPosts. this is because an additional awaitPre is generated when the await is resolved
  • I don't believe it's possible to tie together the awaitPres and awaitPosts unless:
    1. awaitPost also provides the original promise that awaitPre provided
    2. NodeProf ties a unique ID to each promise, and this ID is given in both awaitPre and awaitPost

mwaldrich avatar Oct 18 '19 21:10 mwaldrich

I have tried using the new asyncawait-fix branch, and it seems to solve the issue of too many awaitPre events being generated.

However, the awaitPost callback still does not provide enough information to perform some basic analyses, such as timing awaits.

To do this, the awaitPost callback would need to provide information so it could be tied to its corresponding awaitPre callback. There are at least two ways I can think to do this:

  1. providing the original promise in the awaitPost callback (e.g. function awaitPost(iid, originalValAwaited, result, exceptionVal))
  2. providing a unique identifier for the corresponding promise in both awaitPre and awaitPost (e.g. function awaitPre(iid, valAwaited, awaitID) and function awaitPost(iid, result, awaitID, exceptionVal))

mwaldrich avatar Nov 20 '19 00:11 mwaldrich

In Graal.js we can already instrument all Promise creations, so I assume that solution #1 would be easy -- @Haiyang-Sun what do you think?

eleinadani avatar Nov 24 '19 20:11 eleinadani

@mwaldrich It makes sense to add the promise object also in the awaitPost statement. I think you can maintain an id for a promise object in your analysis using the builtin Map because it allows you to use an object as a key.

@eleinadani I think it's not related to the builtins. We can provide the promise object from the inputs. I will patch nodeprof to do it.

Haiyang-Sun avatar Nov 28 '19 14:11 Haiyang-Sun

@mwaldrich actually adding the promise object in awaitPost might not be necessary. Since now awaitPre and awaitPost are always in pair, you can maintain an "AwaitStack".

Haiyang-Sun avatar Nov 29 '19 08:11 Haiyang-Sun

@Haiyang-Sun Even though awaitPre/awaitPost appear in pairs, I don't think they can be tracked in a stack.

I think this is because async/await programs, just like ES5 programs with native promises, have dynamic control flow that doesn't fit into a simple call-stack model. The program can be waiting on multiple independent awaits, and the processing order depends on the event loop.

Here's an example of a program whose awaitPre/awaitPosts do not form a stack:

example.js

  async function printHello() {
      console.log("hello!");
  }

  async function async1() {
      await printHello();
  }

  async function async2() {
      await printHello();
  }

  async1();
  async2();

analysis.js

// DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {
            this.awaitPre = function (iid, valAwaited) {
                console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
            };
            this.awaitPost = function (iid, result, exceptionVal) {
                console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
            };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

Output

hello!
awaitPre iid=1 loc=(async4.js:6:5:6:23) valAwaited=[object Promise]
hello!
awaitPre iid=2 loc=(async4.js:10:5:10:23) valAwaited=[object Promise]
awaitPost iid=1 loc=(async4.js:6:5:6:23) result=undefined
awaitPost iid=2 loc=(async4.js:10:5:10:23) result=undefined

If you tried to push all promises seen in awaitPre onto a stack, and popped them in awaitPost to infer to the promise, you would infer the wrong promise in this case. This is because any time an async function is called without an await, a new callstack is created, which would have its own separate await stack. This cannot be modeled with a single stack.

Therefore, I think providing the promise in awaitPost is still necessary.

mwaldrich avatar Dec 02 '19 17:12 mwaldrich

@mwaldrich Umm, I know that providing a promise would be helpful. But it is not the key to the problem here. Assume the following example:

async function printHello() {
      console.log("hello!");
  }
let p = printHello();
  async function async1() {
      await p;
  }

  async function async2() {
      await p;
  }

  async1();
  async2();

You will still have the same promise object in the two awaitPost events. I think the key is to use a queue instead of a stack, assuming that all the awaitPre events will be consumed one by one and maps to the next coming awaitPost.

Haiyang-Sun avatar Dec 03 '19 18:12 Haiyang-Sun

@Haiyang-Sun I see what you're saying, but I think neither queues nor stacks will work.

Functions like those in the fs library may take lots of time to evaluate, and they may not be finished after 1 event loop completion. Therefore, they may be skipped in the next event loop run, while other awaits are processed. This skipping would result in awaits not being processed in a queue.

Here's an example of code that (usually) doesn't process its awaits as a queue:

asyncIO.js

This program awaits 2 different asynchronous operations:

  1. stating the current directory
  2. a trivial identity function

Since stating the current directory will usually take longer than evaluating an identity function, the awaits are not processed as a queue.

const util = require('util');
const fs = require('fs');
const statAsync = util.promisify(fs.stat);

async function id(x) {
    return await x;
}

async function longIOOperation() {
    return await statAsync(".");
}

async function main() {
    longIOOperation().then(_ => console.log("done with long i/o operation"));
    id(1).then(_ => console.log("done evaluating id function"));
}

main();
console.log("first event loop over");

analysis.js

// DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {
            this.awaitPre = function (iid, valAwaited) {
                console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
            };
            this.awaitPost = function (iid, result, exceptionVal) {
                console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
            };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

Analysis output

awaitPre iid=2 loc=(asyncIO.js:10:12:10:32) valAwaited=[object Promise]
first event loop over
awaitPre iid=4 loc=(asyncIO.js:6:12:6:19) valAwaited=1
awaitPost iid=4 loc=(asyncIO.js:6:12:6:19) result=1
done evaluating id function
awaitPost iid=2 loc=(asyncIO.js:10:12:10:32) result=[object Object]
done with long i/o operation

If I assumed all awaits were processed in order, my analysis would incorrectly infer the promise for the first awaitPost.

mwaldrich avatar Dec 03 '19 20:12 mwaldrich

Yeah, you are right. So we might actually need a unique id for the await operation. @eleinadani , if I remember correctly, there is an internal promise object that can be used for this purpose (https://github.com/graalvm/graaljs/blob/master/graal-js/src/com.oracle.truffle.js/src/com/oracle/truffle/js/nodes/control/AwaitNode.java#L169). For example, even for a statement like await 1; there is still some internal promise object created. I will see if I can find a way to expose it as the identifier.

Haiyang-Sun avatar Dec 03 '19 21:12 Haiyang-Sun

Yes, there should be an internal promise and you should be able to instrument that via builtin root nodes instrumentation. Regarding unique ids, I think you can just use System.identityCode() for object instances (e.g. promises), while for code locations you could either use the identity of the source section (should be unique) or implement your own mapping. On a separate note (still realted to async/await) --> your Graaljs bugfix will land probably today

eleinadani avatar Dec 04 '19 16:12 eleinadani

Is there any update on the implementation of the modified awaitPre and awaitPost callbacks?

mwaldrich avatar Dec 18 '19 22:12 mwaldrich

@mwaldrich Sorry for the late reply. I had a look, it is not straightforward to update this feature. I can work on it later this week.

Haiyang-Sun avatar Jan 13 '20 10:01 Haiyang-Sun

@mwaldrich I have a workaround fix (https://github.com/Haiyang-Sun/nodeprof.js/pull/70). In this fix, the promise object will be provided both in awaitPre and in awaitPost. In awaitPost, the exception parameter is replaced with a isRejected flag, and the third argument will be the value resolved or rejected of the awaited promise.

Another special case is to await on a value instead of a promise. You won't get any promise object in the callbacks.

There is still a bug where we can miss the promise object in the awaitPost callback in some corner case, which I will look into.

Haiyang-Sun avatar Jan 21 '20 18:01 Haiyang-Sun