vespa icon indicating copy to clipboard operation
vespa copied to clipboard

Only evaluate ONNX models once in stateless model eval

Open lesters opened this issue 2 years ago • 12 comments

@bratseth Draft PR. For discussion. Ref https://github.com/vespa-engine/vespa/issues/19901.

To avoid breaking existing API, this adds a new MultiFunctionEvaluator which ensures that ONNX models are only evaluated once for all outputs. Example use:

        MultiFunctionEvaluator eval = modelsEvaluator.multiEvaluatorOf("mul");  // contains two outputs
        Map<String, Tensor> out = eval.bind("input1", input1).bind("input2", input2).evaluate();
        assertEquals(6.0, out.get("output1").sum().asDouble(), 1e-9);
        assertEquals(5.0, out.get("output2").sum().asDouble(), 1e-9);

Especially the name MultiFunctionEvaluator is up for discussion.

lesters avatar Nov 22 '21 14:11 lesters

Whether it is "multi" is a property of the function so it seems wrong to make the caller choose?

Since a FunctionEvaluator is a single-use object with a life-cycle how about storing all outputs on it instead, so you could do

FunctionEvaluator eval = modelsEvaluator.evaluatorOf("mul");  // contains two outputs
eval.bind("input1", input1).bind("input2", input2).evaluate();
assertEquals(6.0, eval.result("output1").sum().asDouble(), 1e-9);
assertEquals(5.0, eval.result("output2").sum().asDouble(), 1e-9);

We could just return null from evaluate if there is no default output, so that the return from evaluate is just "returns the default output, named the same as the function, if any".

bratseth avatar Nov 23 '21 07:11 bratseth

So, now, when calling:

FunctionEvaluator eval = modelsEvaluator.evaluatorOf("mul");  // contains two outputs

The FunctionEvaluator can contain multiple functions. When this is evaluated, it will evaluate all functions and add to the results so they can be retrieved independently, like your code suggestion above. Most (non-ONNX) models will contain only a single function, this will be returned as the result of evaluate. We don't currently add the output to the contexts, as more work would be required to solve which function to evaluate first etc for this to be useable.

For ONNX models that have multiple outputs, the FunctionEvaluater will contain a function for each. However, before they are evaluated, we scan through the context(s) and evaluate each ONNX model once, in evaluateOnnxModels. Those outputs are added to the contexts, ensuring any model is only evaluated once.

lesters avatar Nov 30 '21 10:11 lesters

Yes, so you have a goal of doing the same also for regular functions?

bratseth avatar Nov 30 '21 10:11 bratseth

I hadn't planned on this for this iteration, but this might be something we could consider when/if the need arises?

lesters avatar Nov 30 '21 10:11 lesters

But why support multiple regular functions within one evaluator then?

bratseth avatar Nov 30 '21 10:11 bratseth

Because each output of a ONNX model output has their own ExpressionFunction, like this: onnx(model_name).output_name. When this is evaluated, this value is already inserted into the context from the ONNX model evaluation.

lesters avatar Nov 30 '21 11:11 lesters

Got it. But should we capture this in the model instead of creating an evaluator having (and always evaluating) every function of the model?

We could add a List of outputs to ExpressionFunction?

bratseth avatar Nov 30 '21 15:11 bratseth

By model here you mean the ONNX model (model is such an overloaded term atm)? I don't think the overhead of always evaluating the entire model is practically that large: we do that in Proton today. However, we could do something like:

FunctionEvaluator eval = modelsEvaluator.evaluatorOf("mul", List.of("output1", "output2));

To specify exactly which outputs one would like. If no outputs are passed, all outputs are calculated.

This might make the evaluatorOf function slightly confusing as it has a String array as argument today, and passing "mul", "output1", "output2" is rewritten to "mul", "output1.output2" due to old support for signatures.

lesters avatar Dec 02 '21 21:12 lesters

By model I mean the class owning these functions, ai.vespa.models.evaluation.Model. At line 185 in that you pass the entire list of functions to the FunctionEvaluator constructor if no name is given, and therefore all the functions in the model are always evaluated when that evaluator is used.

However, this is just to achieve the effect of getting all the outputs of a OnnxModel, because each one is represented as a separate ExpressionFunction, as a workaround for ExpressionFunction not supporting multiple outputs. So, I suggest it might be better to add that capability to ExpressionFunction such that we could have an isomorphic mapping between the concepts of Onnx models and our function representation (and for that matter, be able to represent rank features accurately as functions).

bratseth avatar Dec 03 '21 08:12 bratseth

Rank features having multiple outputs is very close to an anti-feature. There might be use cases that might make sense, but I think is better to start off with simple ones. Most rank features use in first phase producing multiple outputs has gotten individual features computing only what is necessary.

baldersheim avatar Dec 03 '21 09:12 baldersheim

Notwithstanding different opinions on that, it's a feature we have and will continue to have. (The performance consideration cuts both ways, some times you solve computational problem once and that produces multiple values.)

bratseth avatar Dec 03 '21 09:12 bratseth

@lesters @bratseth Integrate or close ?

aressem avatar Oct 31 '23 11:10 aressem