
feature: ONNX service with multiple inputs

Open wyhanz opened this issue 3 years ago • 5 comments

Feature request

I exported an ONNX model that accepts multiple inputs ("input_ids", "input_mask", "input_seg").

The BentoML docs for ONNX only give a simple single-input example, runner.run.run(test_input). However, an error occurred after I wrapped these inputs in a dict:
TypeError: run of ONNXRunnable only takes numpy.ndarray or pd.DataFrame, tf.Tensor, or torch.Tensor as input parameters

I wonder if I am using it the wrong way?
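For reference, a minimal sketch of the situation described above (the input names come from the issue; the shapes and the runner variable onnx_runner are illustrative assumptions). The dict-wrapped call fails the runnable's type check, while one array per model input, passed positionally, matches the signature in the error message:

    import numpy as np

    # Illustrative batch for a model with three int64 inputs.
    inputs = {
        "input_ids": np.zeros((1, 128), dtype=np.int64),
        "input_mask": np.ones((1, 128), dtype=np.int64),
        "input_seg": np.zeros((1, 128), dtype=np.int64),
    }

    # A single dict argument is rejected:
    #   onnx_runner.run.run(inputs)
    #   # TypeError: run of ONNXRunnable only takes numpy.ndarray ...
    # whereas positional arrays match the runnable's signature:
    #   outputs = onnx_runner.run.run(
    #       inputs["input_ids"], inputs["input_mask"], inputs["input_seg"]
    #   )
    print(all(v.dtype == np.int64 for v in inputs.values()))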

Motivation

I think ONNX is useful for AI inference, so could you improve the docs on serving ONNX models with BentoML?

Other

No response

wyhanz avatar Oct 10 '22 11:10 wyhanz

After reading the source code, I found that BentoML may convert my input to np.float32, but my ONNX model takes int64 as its input. Could this be a bug?

wyhanz avatar Oct 12 '22 02:10 wyhanz

import json

import numpy as np

def inf(inputs):
    inputs = json.loads(inputs)
    # Build the three model inputs directly as int64 arrays.
    inputs['input_ids'] = np.array(inputs['input_ids'], dtype=np.int64)
    inputs['input_mask'] = np.array(inputs['input_mask'], dtype=np.int64)
    inputs['input_seg'] = np.array(inputs['input_seg'], dtype=np.int64)

    print(np.shape(inputs['input_ids']), np.shape(inputs['input_mask']), np.shape(inputs['input_seg']))

    # Each array is already int64 here, yet ONNX Runtime still
    # receives float tensors (see the error below).
    outputs = onnx_runner.run.run(
        inputs['input_ids'],
        inputs['input_mask'],
        inputs['input_seg'],
    )

    print('---------------------------------')
    print(outputs)
    return outputs

This code raises the error: onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type. Actual: (tensor(float)) , expected: (tensor(int64)). But I explicitly converted the arrays to int64......
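A quick check supports the report: the caller-side cast itself does work, so the float tensor reaching ONNX Runtime has to come from a re-cast inside the runner, not from user code. A trivial NumPy demonstration:

    import numpy as np

    # astype(np.int64) returns a genuine int64 array; the original
    # float32 array is untouched.
    x = np.array([[0.0, 1.0, 2.0]], dtype=np.float32)
    y = x.astype(np.int64)
    print(y.dtype)  # int64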

wyhanz avatar Oct 12 '22 02:10 wyhanz

@wyzhangyuhan This is a limitation of the current onnx implementation (we convert all inputs to float32). We will improve this by inferring the input type from the ONNX model.

larme avatar Oct 12 '22 02:10 larme
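For context on what "inferring the input type from the ONNX model" could look like: the declared dtype of each graph input is stored in the model itself. A minimal sketch using the onnx package (the Identity graph, tensor names, and shapes below are made up for illustration):

    import onnx
    from onnx import TensorProto, helper

    # Build a tiny stand-in model whose single input is declared int64,
    # mirroring the input_ids tensor of an exported transformer.
    inp = helper.make_tensor_value_info("input_ids", TensorProto.INT64, [1, 4])
    out = helper.make_tensor_value_info("output", TensorProto.INT64, [1, 4])
    graph = helper.make_graph(
        [helper.make_node("Identity", ["input_ids"], ["output"])],
        "demo", [inp], [out],
    )
    model = helper.make_model(graph)

    # The declared element type of every graph input can be read back,
    # so a runner could cast to the right dtype instead of assuming float32.
    expected = {
        i.name: TensorProto.DataType.Name(i.type.tensor_type.elem_type)
        for i in model.graph.input
    }
    print(expected)  # {'input_ids': 'INT64'}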

Hi @larme +1 for the resolution of this problem, as it currently blocks the use of ONNX Transformers with BentoML.

Matthieu-Tinycoaching avatar Oct 19 '22 09:10 Matthieu-Tinycoaching

We encourage using the new BentoML service API; can you try the latest BentoML version and see if it works for you?

frostming avatar Jul 11 '24 01:07 frostming