nodeprof.js
nodeprof.js copied to clipboard
Callbacks for async/await
Would it be possible to implement new callbacks to instrument async
and await
?
For async
, perhaps an argument could be added to the function call and function literal callbacks to indicate if it is an async
function.
For await
, I would suggest:
- adding an
isAwaited
argument to the function call callbacks, indicating that the caller is awaiting this function's result. - a new
await
callback. This would allow the analysis to instrument when theawait
is scheduled.
These three changes together will allow analysis to distinguish between:
- a regular function being invoked sequentially;
- an async function being invoked sequentially (the value is not being awaited);
- an async function being awaited.
Thanks!
@mwaldrich Hi, thanks for the suggestion here.
For await
, I don't think it is a good idea to bind it with a function call. await
can be used on any value or promise, and await on an asynchronous function is actually await the internal promise returned by the async function (an async function always return a promise object, even though the function itself may return some non-promise value). So what I propose is to add two callback pairs like below:
For await
awaitPre (iid, val) => where val could be any non-promise value or a promise object.
awaitPost(iid, ret) => where ret will be the actually value resolved by the awaited promise or value.
For async
, we can provide a special asyncFunctionExit event, which provide an extra argument of the promise object returned by the asynchronous function call:
asyncFunctionExit(iid, ret, promise, exceptionWrapper)
With these two callbacks, you should be able to know if an async function has been awaited by tracking if any await is waiting for the same promiseObject returned by the async function.
I had some experimental implementation about the callbacks as above. However, it was based on my own hack on an old version of Graal.js. We will look into how to properly integrate it in Graal.js in future.
Feel free to discuss here what you think about these potential new callbacks.
Sounds great! This should be enough to perform general instrumentation on async/await in JS.
How much work will it be to clean up the hacky implementation and get it merged into Graal?
Any update on these callbacks?
Async support has been merged upstream thanks to @eleinadani, await is still in progress.
Thanks for the implementation! I've started playing around with these callbacks, but they don't seem to be called in the way I would expect.
I actually have two concerns:
- if I understand the purpose of the currently-implemented callbacks correctly, they are not being called at the right times
- I'm not sure the currently-implemented callbacks actually give us enough information.
Example (timing how long each await is)
Imagine you want to write a simple analysis to time how long each await is. You want to record a timestamp when a value is awaited, and another when the await is resolved.
Here's a simple analysis and sample program.
asyncAnalysis.js
(analysis)
//DO NOT INSTRUMENT
(function (sandbox) {
function MyAnalysis() {
// this.promises = new Map();
this.functionEnter = function (iid, f, base, args) {
console.log("functionEnter@ " + J$.iidToLocation(iid), f.name, args.length);
};
this.functionExit = function(iid, returnVal, wrappedExceptionVal) {
console.log("functionExit@ " + J$.iidToLocation(iid));
};
this.asyncFunctionEnter = function(iid) {
console.log('async enter', J$.iidToLocation(iid));
}
this.asyncFunctionExit = function(iid, returnVal, wrappedException) {
console.log('async exit', J$.iidToLocation(iid), returnVal, wrappedException);
}
this.awaitPre = function(iid, valAwaited) {
console.log('awaitPre', J$.iidToLocation(iid), valAwaited);
}
this.awaitPost = function(iid, result, exceptionWrapper) {
console.log('awaitPost', J$.iidToLocation(iid), result, exceptionWrapper);
}
this.binary = function(iid, op, left, right) {
console.log('binary', op, left, right);
}
this.write = function(iid, name, val) {
console.log('write', J$.iidToLocation(iid), name, val);
}
this._return = function(iid, val) {
console.log('return', J$.iidToLocation(iid), val);
}
this.declare = function(iid, name, type) {
console.log('declare', J$.iidToLocation(iid), name, type);
};
}
sandbox.analysis = new MyAnalysis();
})(J$);
async.js
(sample program)
async function countTo6(x) {
if (x > 5) {
return x;
} else {
return await countTo6(x + 1);
}
}
async function main() {
return await countTo6(1);
}
main().then(v => console.log("done! answer = " + v));
Analysis output
functionEnter@ (async.js:1:1:16:4) 0
functionExit@ (async.js:1:1:16:4)
declare (async.js:1:2:16:2) exports var
declare (async.js:1:2:16:2) require var
declare (async.js:1:2:16:2) module var
declare (async.js:1:2:16:2) __filename var
declare (async.js:1:2:16:2) __dirname var
declare (async.js:1:2:16:2) countTo6 var
declare (async.js:1:2:16:2) main var
functionEnter@ (async.js:1:2:16:2) 5
write (async.js:2:1:8:2) countTo6 async function countTo6(x) {
if (x > 5) {
return x;
} else {
return await countTo6(x + 1);
}
}
write (async.js:10:1:12:2) main async function main() {
return await countTo6(1);
}
async enter (async.js:11:4:11:29)
functionEnter@ (async.js:10:1:12:2) main 0
declare (async.js:2:1:8:2) x var
binary > 1 5
binary + 1 1
declare (async.js:2:1:8:2) x var
binary > 2 5
binary + 2 1
declare (async.js:2:1:8:2) x var
binary > 3 5
binary + 3 1
declare (async.js:2:1:8:2) x var
binary > 4 5
binary + 4 1
declare (async.js:2:1:8:2) x var
binary > 5 5
binary + 5 1
declare (async.js:2:1:8:2) x var
binary > 6 5
return (async.js:4:8:4:17) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:11:11:11:28) Promise { <pending> }
functionExit@ (async.js:10:1:12:2)
async exit (async.js:11:4:11:29) Promise { <pending> } {}
functionExit@ (async.js:1:2:16:2)
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
functionEnter@ (async.js:10:1:12:2) main 0
awaitPre (async.js:11:11:11:28) Promise { 6 }
awaitPost (async.js:11:11:11:28) 6 {}
return (async.js:11:4:11:29) 6
functionExit@ (async.js:10:1:12:2)
declare (async.js:14:13:14:52) v var
functionEnter@ (async.js:14:13:14:52) 1
binary + done! answer = 6
done! answer = 6
return (async.js:14:18:14:52) undefined
functionExit@ (async.js:14:13:14:52)
Issues
There are a few weird things happening in the analysis:
-
functionEnter
events are not being generated for recursive async calls -
asyncFunctionExit
events are not being generated for recursive async calls -
functionExit
events are not being generated after returning a real values from recursive async calls
In addition, I don't think it is possible to model the call stack from the analysis' perspective without also having an asyncFunctionEnter
callback, which would be used when an async function is "resumed".
I will look into the issue, thanks for the feedback!
Also, a few more things I noticed:
- there are too many
awaitPre
events being generated. ideally, there would be a one-to-one correspondence betweenawaitPre
andawaitPost
events, but there are exactly twice as manyawaitPre
s compared toawaitPost
s. this is because an additionalawaitPre
is generated when theawait
is resolved - I don't believe it's possible to tie together the
awaitPre
s andawaitPost
s unless:-
awaitPost
also provides the original promise thatawaitPre
provided - NodeProf ties a unique ID to each promise, and this ID is given in both
awaitPre
andawaitPost
-
I have tried using the new asyncawait-fix
branch, and it seems to solve the issue of too many awaitPre
events being generated.
However, the awaitPost
callback still does not provide enough information to perform some basic analyses, such as timing awaits.
To do this, the awaitPost
callback would need to provide information so it could be tied to its corresponding awaitPre
callback. There are at least two ways I can think to do this:
- providing the original promise in the
awaitPost
callback (e.g.function awaitPost(iid, originalValAwaited, result, exceptionVal)
) - providing a unique identifier for the corresponding promise in both
awaitPre
andawaitPost
(e.g.function awaitPre(iid, valAwaited, awaitID)
andfunction awaitPost(iid, result, awaitID, exceptionVal)
)
In Graal.js we can already instrument all Promise
creations, so I assume that solution #1 would be easy -- @Haiyang-Sun what do you think?
@mwaldrich It makes sense to add the promise object also in the awaitPost statement. I think you can maintain an id for a promise object in your analysis using the builtin Map
because it allows you to use an object as a key.
@eleinadani I think it's not related to the builtins. We can provide the promise object from the inputs. I will patch nodeprof to do it.
@mwaldrich actually adding the promise object in awaitPost might not be necessary. Since now awaitPre and awaitPost
are always in pair, you can maintain an "AwaitStack".
@Haiyang-Sun Even though awaitPre/awaitPost appear in pairs, I don't think they can be tracked in a stack.
I think this is because async/await programs, just like ES5 programs with native promises, have dynamic control flow that doesn't fit into a simple call-stack model. The program can be waiting on multiple independent awaits, and the processing order depends on the event loop.
Here's an example of a program whose awaitPre/awaitPosts do not form a stack:
example.js
async function printHello() {
console.log("hello!");
}
async function async1() {
await printHello();
}
async function async2() {
await printHello();
}
async1();
async2();
analysis.js
// DO NOT INSTRUMENT
(function (sandbox) {
function MyAnalysis() {
this.awaitPre = function (iid, valAwaited) {
console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
};
this.awaitPost = function (iid, result, exceptionVal) {
console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
};
}
sandbox.analysis = new MyAnalysis();
})(J$);
Output
hello!
awaitPre iid=1 loc=(async4.js:6:5:6:23) valAwaited=[object Promise]
hello!
awaitPre iid=2 loc=(async4.js:10:5:10:23) valAwaited=[object Promise]
awaitPost iid=1 loc=(async4.js:6:5:6:23) result=undefined
awaitPost iid=2 loc=(async4.js:10:5:10:23) result=undefined
If you tried to push all promises seen in awaitPre onto a stack, and popped them in awaitPost to infer to the promise, you would infer the wrong promise in this case. This is because any time an async function is called without an await, a new callstack is created, which would have its own separate await stack. This cannot be modeled with a single stack.
Therefore, I think providing the promise in awaitPost is still necessary.
@mwaldrich Umm, I know that providing a promise would be helpful. But it is not the key to the problem here. Assume the following example:
async function printHello() {
console.log("hello!");
}
let p = printHello();
async function async1() {
await p;
}
async function async2() {
await p;
}
async1();
async2();
You will still have the same promise object in the two awaitPost events. I think the key is to use a queue instead of a stack, assuming that all the awaitPre events will be consumed one by one and maps to the next coming awaitPost.
@Haiyang-Sun I see what you're saying, but I think neither queues nor stacks will work.
Functions like those in the fs
library may take lots of time to evaluate, and they may not be finished after 1 event loop completion. Therefore, they may be skipped in the next event loop run, while other awaits are processed. This skipping would result in awaits not being processed in a queue.
Here's an example of code that (usually) doesn't process its awaits as a queue:
asyncIO.js
This program awaits 2 different asynchronous operations:
-
stat
ing the current directory - a trivial identity function
Since stat
ing the current directory will usually take longer than evaluating an identity function, the awaits are not processed as a queue.
const util = require('util');
const fs = require('fs');
const statAsync = util.promisify(fs.stat);
async function id(x) {
return await x;
}
async function longIOOperation() {
return await statAsync(".");
}
async function main() {
longIOOperation().then(_ => console.log("done with long i/o operation"));
id(1).then(_ => console.log("done evaluating id function"));
}
main();
console.log("first event loop over");
analysis.js
// DO NOT INSTRUMENT
(function (sandbox) {
function MyAnalysis() {
this.awaitPre = function (iid, valAwaited) {
console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
};
this.awaitPost = function (iid, result, exceptionVal) {
console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
};
}
sandbox.analysis = new MyAnalysis();
})(J$);
Analysis output
awaitPre iid=2 loc=(asyncIO.js:10:12:10:32) valAwaited=[object Promise]
first event loop over
awaitPre iid=4 loc=(asyncIO.js:6:12:6:19) valAwaited=1
awaitPost iid=4 loc=(asyncIO.js:6:12:6:19) result=1
done evaluating id function
awaitPost iid=2 loc=(asyncIO.js:10:12:10:32) result=[object Object]
done with long i/o operation
If I assumed all awaits were processed in order, my analysis would incorrectly infer the promise for the first awaitPost
.
Yeah, you are right. So we might actually need a unique id for the await operation.
@eleinadani , if I remember correctly, there is an internal promise object that can be used for this purpose (https://github.com/graalvm/graaljs/blob/master/graal-js/src/com.oracle.truffle.js/src/com/oracle/truffle/js/nodes/control/AwaitNode.java#L169). For example, even for a statement like await 1;
there is still some internal promise object created.
I will see if I can find a way to expose it as the identifier.
Yes, there should be an internal promise and you should be able to instrument that via builtin root nodes instrumentation. Regarding unique ids, I think you can just use System.identityCode()
for object instances (e.g. promises), while for code locations you could either use the identity of the source section (should be unique) or implement your own mapping. On a separate note (still realted to async/await) --> your Graaljs bugfix will land probably today
Is there any update on the implementation of the modified awaitPre
and awaitPost
callbacks?
@mwaldrich Sorry for the late reply. I had a look, it is not straightforward to update this feature. I can work on it later this week.
@mwaldrich I have a workaround fix (https://github.com/Haiyang-Sun/nodeprof.js/pull/70).
In this fix, the promise object will be provided both in awaitPre
and in awaitPost
.
In awaitPost
, the exception parameter is replaced with a isRejected
flag, and the third argument will be the value resolved or rejected of the awaited promise.
Another special case is to await on a value instead of a promise. You won't get any promise object in the callbacks.
There is still a bug where we can miss the promise object in the awaitPost
callback in some corner case, which I will look into.