graaljs Poor performance when having to "reset" JavaScript bindings scope

Hello,

we're still in the process of migrating our application from Rhino to GraalVM and are now at a place where we can compare our new GraalVM implementation with the old Rhino implementation. Sadly, performance is not as good as we had hoped.

The time to run one "action" has actually increased by a factor of around 2x compared to Rhino, which really isn't acceptable for our use case.

When profiling, the bottleneck quickly becomes apparent: With our current implementation, we're creating a Context once and are then, when we need to reset the bindings scope, setting it with this method:

public static void setJavascriptScope(Map<String, Object> scope) {
    Value bindings = context.getBindings("js");
    for (String member : bindings.getMemberKeys()) {
        if (scope.containsKey(member)) {
            bindings.putMember(member, scope.get(member));
            scope.remove(member);
        } else {
            bindings.removeMember(member);
        }
    }

    for (Map.Entry<String, Object> newMember : scope.entrySet()) {
        bindings.putMember(newMember.getKey(), newMember.getValue());
    }
}

The problem lies with the call to removeMember(), which takes up ~90% of the time within this method. In fact, simply setting the scope takes up ~70% of the time compared to ~30% of time that is spent actually evaluating our JavaScript code!

Now, the usual recommendation is to create a new Context instead of resetting the bindings scope like this, but we've found that this way is actually slightly faster and definitely easier to implement in our application.

Nonetheless, we've tried creating a new Context every time we need to reset the bindings scope. Sadly, as we had already measured, that isn't any faster. Instead of the removeMember() method, the bottleneck is now the creation of the new Context each time, so performance is still ~2x worse than Rhino.

For background, in our application, we're dealing with a lot (hundreds) of scope objects that need to be changed and set thousands (in some cases a few orders of magnitude more) of times during one "action". The actual JavaScript that is executed on the other hand is relatively basic and simple.

We fear that such a use case of GraalJS has (perhaps?) not yet been on your radar and that therefore, performance in this use case could be improved by optimizing the process of (re)setting the bindings scope.

Feb 21 '22 15:02 erstr2ggl

Hi,

as you already mention, the suggested way to do this is to create a new Context object every time. With a shared Engine and reused, cached Source objects, that case should be fast, and faster than manually modifying the Context by removing entries.

https://www.graalvm.org/22.0/reference-manual/embed-languages/#code-caching-across-multiple-contexts
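The "cached Source objects" part of this advice can be sketched without touching the polyglot API: keep one prepared object per distinct snippet text and reuse it. The class below is a minimal sketch, not GraalVM code; `SnippetCache` and its `factory` parameter are hypothetical names, and the type parameter `S` only stands in for whatever the engine hands back (for GraalVM, presumably `org.graalvm.polyglot.Source`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper: caches one prepared object per distinct snippet text. */
final class SnippetCache<S> {
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> factory;

    /** @param factory builds the prepared object for a snippet (assumption: cheap to reuse, costly to build) */
    SnippetCache(Function<String, S> factory) {
        this.factory = factory;
    }

    /** Returns the cached object for this snippet, building it on first use only. */
    S get(String snippet) {
        return cache.computeIfAbsent(snippet, factory);
    }

    /** Number of distinct snippets seen so far. */
    int size() {
        return cache.size();
    }
}
```

With GraalVM on the classpath, the factory could be something like `code -> Source.create("js", code)`, with the cached object passed to `context.eval(source)` on a Context built from the shared Engine. A cache like this only pays off when the same snippet text recurs; snippets whose text differs on every call never hit it.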

If you already follow all the advice there, can you maybe share a reproducer for your problem, exhibiting the bad performance?

Best, Christian

Feb 21 '22 16:02 wirthi

Hello,

creating a simple class that reproduces the problem to its full extent is hard due to the complexity of our application, but here's a short one that I hope doesn't optimize anything crucial away:

public class Example {

    private long counter = 0;

    public void runGraal() {
        Engine engine = Engine.newBuilder("js").option("js.ecmascript-version", "6")
                .option("js.foreign-object-prototype", "true").build();
        for (int i = 0; i < 5000; ++i) {
            Context context = Context.newBuilder("js").engine(engine).build();
            Value bindings = context.getBindings("js");
            for (int j = 0; j < 250; ++j) {
                bindings.putMember("val" + counter, new SecureRandom().nextInt());
                ++counter;
            }
            System.out.println(context.eval("js", "val" + (counter - new SecureRandom().nextInt(250) - 1) + " + 0").toString());
        }
    }

    public void runRhino() {
        for (int i = 0; i < 5000; ++i) {
            org.mozilla.javascript.Context context = org.mozilla.javascript.Context.enter();
            Scriptable scope = context.initStandardObjects();
            for (int j = 0; j < 250; ++j) {
                scope.put("val" + counter, scope, new SecureRandom().nextInt());
                ++counter;
            }
            System.out.println(context.evaluateString(scope, "val" + (counter - new SecureRandom().nextInt(250) - 1) + " + 0",
                    "js", 0, null).toString());
        }
    }

    public static void main(String[] args) {
        Example ex = new Example();

        // Graal (with warmup)
        ex.runGraal();
        long time = System.currentTimeMillis();
        ex.runGraal();
        double graal = (System.currentTimeMillis() - time) / 1000.;

        // Rhino
        time = System.currentTimeMillis();
        ex.runRhino();
        double rhino = (System.currentTimeMillis() - time) / 1000.;

        System.out.println("Total time Graal: " + graal + "s");
        System.out.println("Total time Rhino: " + rhino + "s");
    }

}

After a few runs, I get on average: Graal: ~27.5s Rhino: ~21.5s

Note that even in medium-sized cases, we'd have to recreate the context about 200,000 times, not just 5,000, which obviously exacerbates the issue considerably. Also note that this comparison isn't really fair to Rhino, since Rhino spends by far the most time compiling and evaluating the JavaScript code rather than creating the context. In our application, this isn't really an issue since we've implemented caching for that. That was actually one of the plus points we found with Graal: we could remove our Rhino caching logic, since caching is already built in.

Feb 22 '22 08:02 erstr2ggl

Hi @erstr2ggl

thanks for your reproducer. I can roughly confirm your numbers. I'd add one more round of warmup for Graal, as that seems to have some influence, but in general, the numbers seem correct.
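Adding warmup rounds is easier when the round count is a parameter rather than a copy-pasted call. The helper below is a minimal sketch of that idea; `WarmupTimer` and `meanSeconds` are hypothetical names that do not appear in the reproducer, and it uses `System.nanoTime()` in place of the reproducer's `System.currentTimeMillis()`.

```java
/** Hypothetical helper: times a task after a configurable number of untimed warmup rounds. */
final class WarmupTimer {
    private WarmupTimer() {
    }

    /**
     * Runs the task warmupRounds times without timing, then measuredRounds times with timing.
     *
     * @return mean wall-clock seconds per measured round
     */
    static double meanSeconds(Runnable task, int warmupRounds, int measuredRounds) {
        if (measuredRounds <= 0) {
            throw new IllegalArgumentException("measuredRounds must be positive");
        }
        // Untimed rounds give the JIT a chance to compile the hot paths.
        for (int i = 0; i < warmupRounds; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e9 / measuredRounds;
    }
}
```

In the reproducer's `main`, this could be used as `WarmupTimer.meanSeconds(ex::runGraal, 2, 3)` and `WarmupTimer.meanSeconds(ex::runRhino, 2, 3)`, so that both engines get the same treatment. It is still only a rough wall-clock measurement; a dedicated harness such as JMH would be the more rigorous choice.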

The performance of putting members is especially slow when doing that in the global scope. I see significantly better numbers - actually in the ballpark of Rhino - when doing the same not on many variables in the global scope, but as members of a variable:

public void runGraalNonGlobal() {
        Engine engine = Engine.create();
        int check=0;
        for (int i = 0; i < 5000; ++i) {
            Context context = Context.newBuilder("js").engine(engine).option("js.ecmascript-version", "6")
                .option("js.foreign-object-prototype", "true").build();
            Value globalBindings = context.getBindings("js");
            context.eval("js","var inner={}");
            Value innerBindings = globalBindings.getMember("inner");
            for (int j = 0; j < 250; ++j) {
                innerBindings.putMember("val" + counter, new SecureRandom().nextInt());
                ++counter;
            }
            String result = context.eval("js", "inner.val" + (counter - new SecureRandom().nextInt(250) - 1) + " + 0").toString();
            check += result.length();
        }
        System.out.println("Check: "+check);
    }

Another, even faster option is to only share one entry from Java to JavaScript, and modify everything else on the Java side. Very similar to that approach above, but having inner as a Java class that has the respective members. I am sharing an int[250] here (note this requires allowHostAccess(HostAccess.ALL)), but if you need more flexibility, you could look into org.graalvm.polyglot.proxy.ProxyArray or ...ProxyObject.

    public void runGraalHostAccess() {
        Engine engine = Engine.create();
        int result=0;
        for (int i = 0; i < N; ++i) {
            Context context = Context.newBuilder("js").engine(engine).option("js.ecmascript-version", "6").option("js.foreign-object-prototype", "true").allowHostAccess(HostAccess.ALL).build();
            Value bindings = context.getBindings("js");
            int[] arr = new int[VARIABLE_COUNT];
            bindings.putMember("array", arr);
            for (int j = 0; j < VARIABLE_COUNT; ++j) {
                arr[j] = new SecureRandom().nextInt(250);
            }
            String code = "array[" + (new SecureRandom().nextInt(250)) + "]";
            int x = context.eval("js", code).asInt();
            result += x;
        }
        System.out.println(result);
    }

On my machine, this variant is by far the fastest, roughly 40% faster than your initial Rhino example (I know Rhino too little, you might be able to do something similar there as well).

Best, Christian

Feb 22 '22 09:02 wirthi

Hello,

thanks for your quick reply and for looking for a solution.

Sadly, creating a single entry in which we put our values isn't really a feasible solution for us, since that would mean that we'd have to adapt all of our JavaScript snippets to access the values via inner.xyz. We're talking about millions of small snippets spread across many different systems.

Is there perhaps a way to improve the performance of putting members on the global scope, so that it's similarly fast?

Feb 22 '22 12:02 erstr2ggl

Hello,

we've actually found a solution to access the values of the inner object by surrounding our JS snippets with with (inner) { ... }.
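The wrapping step itself is plain string handling and can be sketched on the Java side as follows. `ScopeWrapper` and `wrapWithScope` are hypothetical names, and this is only an illustration of the approach described above, not our actual implementation.

```java
/** Hypothetical helper: wraps an unmodified JS snippet in a with-block over a scope object. */
final class ScopeWrapper {
    private ScopeWrapper() {
    }

    /**
     * @param scopeName name of the global variable that holds the scope object, e.g. "inner"
     * @param snippet   the original JavaScript snippet, left untouched
     * @return the snippet wrapped as: with (scopeName) { snippet }
     */
    static String wrapWithScope(String scopeName, String snippet) {
        // The newline before the closing brace keeps a trailing "//" comment
        // in the snippet from commenting out the brace.
        return "with (" + scopeName + ") {\n" + snippet + "\n}";
    }
}
```

For example, `ScopeWrapper.wrapWithScope("inner", "Math.PI * val1")` yields a snippet that resolves `val1` against `inner` first. Note that `with` is not allowed in strict-mode code, so this only works for snippets evaluated in sloppy mode.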

While performance is definitely much better, it is still not quite as fast as our old Rhino implementation.

We still believe that it's a problem GraalJS has with many elements in the scope, as per this example:

public void runGraal() {
    Map<String, Object> scope = new HashMap<>();
    for (int i = 0; i < 1000; ++i) {
        scope.put("val" + i, i);
    }
    Scope s = new Scope(scope);

    try (Context ctx = Context.newBuilder("js").build()) {
        ctx.getBindings("js").putMember("inner", s);
        long time = System.currentTimeMillis();
        for (int i = 0; i < 5000; ++i) {
            System.out.println("out: " + ctx.eval("js", "with (inner) { Math.PI * val1 }").toString());
        }
        System.out.println("Time taken a: " + (System.currentTimeMillis() - time) / 1000. + "s");
    }
}

public void runRhino() {
    org.mozilla.javascript.Context ctx = org.mozilla.javascript.Context.enter();
    Scriptable scope = ctx.initStandardObjects();
    for (int i = 0; i < 1000; ++i) {
        scope.put("val" + i, scope, i);
    }

    long time = System.currentTimeMillis();
    Script exec = ctx.compileString("Math.PI * val1", "js", 0, null);
    for (int j = 0; j < 5000; ++j) {
        System.out.println("out: " + exec.exec(ctx, scope).toString());
    }
    System.out.println("Time taken: " + (System.currentTimeMillis() - time) / 1000. + "s");
    ctx.close();
}

Scope in this case is a simple class that implements ProxyObject and simply returns the values from the Map.

Results with two warmup rounds for GraalJS: Graal: ~0.35s, Rhino: ~0.16s

Feb 23 '22 14:02 erstr2ggl

Hi,

I can confirm we are around 2x slower on that example. I don't think there is a whole lot we can do about that.

The strength of our compiler emerges when the application executes a realistic amount of JavaScript code, including arithmetic, function calls, etc. Your benchmark just sets values on the Java side and reads them on the JavaScript side. There is little we can optimize there, because our API between Java and JavaScript is a bit more complex than Rhino's. Context supports many different programming languages (JavaScript, Ruby, Python, R, LLVM, Java itself, ...), while Rhino's API supports just one: JavaScript.
Our assumption is that, while on this specific benchmark we are slower than Rhino, for any realistic application executing a larger amount of JavaScript code, we are at least on par, if not faster.
If that assumption is wrong, i.e. your benchmark really represents the gist of your application, then we might not be able to optimize that specific case as well as Rhino, true.

> For background, in our application, we're dealing with a lot (hundreds) of scope objects that need to be changed and set thousands (in some cases a few orders of magnitude more) of times during one "action". The actual JavaScript that is executed on the other hand is relatively basic and simple.

Right, this might just not be a good use case for our approach. We try to deliver good enough performance, and a 2x drop on that specific microbenchmark is obviously relevant for you, but also far from catastrophic for something we consider a corner case.

As stated above, we support more languages and thus incur a certain overhead at the language boundary. The GraalVM compiler is able to mitigate most of that in the real-world applications we have seen so far. An application or benchmark that almost exclusively exercises that boundary might see some impact compared to other approaches that don't need any logic at the boundary.

> We fear that such a use case of GraalJS has (perhaps?) not yet been on your radar and that therefore, performance in this use case could be improved by optimizing the process of (re)setting the bindings scope.

Our way of doing this is to create a new Context. We are not convinced that providing a feature to reset the bindings scope on a Context is reasonable (given we consider this a corner case), or even technically feasible in general without severe implications for many of the assumptions and invariants we currently have about the Context and Value lifecycles.

Best, Christian

Mar 02 '22 09:03 wirthi
