[Bug] LongT5 Error: failed to call OrtRun(). error code = 6 with short inputs
I get this error when using a LongT5 model with inputs of fewer than ~30 tokens.
[INFO:CONSOLE(34)] "D:/a/_work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:26 onnxruntime::ReshapeHelper::ReshapeHelper(const TensorShape &, TensorShapeVector &, bool) i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.", source: https://cdn.jsdelivr.net/npm/@xenova/transformers@latest (34)
[INFO:CONSOLE(70)] "An error occurred during model execution: "Error: failed to call OrtRun(). error code = 6.".", source: https://cdn.jsdelivr.net/npm/@xenova/transformers@latest (70)
[INFO:CONSOLE(70)] "Inputs given to model: [object Object]", source: https://cdn.jsdelivr.net/npm/@xenova/transformers@latest (70)
[INFO:CONSOLE(34)] "Uncaught (in promise) Error: failed to call OrtRun(). error code = 6.", source: https://cdn.jsdelivr.net/npm/@xenova/transformers@latest (34)
Hi there. Can you please provide information about your system as well as which model you are trying to run?
I am trying to run google/long-t5-local-base, but this happens with all LongT5 variants I have tested. I am running this in a WebView inside an Android app.
Error code 6 usually indicates an out-of-memory issue, and the fact that you are running in an Android browser makes that quite likely. The model has ~225M parameters, which, although on the small side, might still be too large to run on a phone. Do you have the same problem if you run it from a laptop/PC?
The model works with larger input sizes on the phone; the error only occurs if the input is less than about 30 tokens. I can do 1000 tokens and it works perfectly, but for some reason a small number of tokens gives the error. I tried manually padding the input tokens but I get the same error. Any idea as to why?
Hey @xenova! I'm facing the same issue as @naveengovind. Short inputs are failing for me and I'm not sure why. Would love any help you could provide.
> The model works with larger input sizes on the phone. Only if the input is less than about 30 tokens. I can do 1000 tokens and it works perfectly but for some reason, a small number of tokens gives the error.
Oh wait - I think I completely misread your first message, my apologies! I thought it was breaking for >30 tokens, not the other way around.
Based on the description of the problem and the error messages, I would assume that there is some config value set to 32 which specifies a minimum length (most likely for some sliding-window or local-attention operation; maybe this).
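To illustrate the suspicion above: if LongT5's local/transient-global attention splits the sequence into fixed-size blocks, an input shorter than one block would produce zero blocks, which would explain the zero-dimension Reshape failure. This is a minimal sketch, not the actual model code, and the block size of 32 is an assumption inferred from the ~30-token threshold:

```javascript
// Hypothetical illustration: number of attention blocks for a given
// sequence length, assuming the sequence is chopped into fixed-size
// blocks (blockSize = 32 is an assumption, not a confirmed config value).
function numBlocks(seqLen, blockSize = 32) {
  // A short input yields 0 blocks, so any reshape to
  // [batch, numBlocks, blockSize, ...] would contain a zero dimension.
  return Math.floor(seqLen / blockSize);
}
```

Under this assumption, `numBlocks(30)` returns `0` (the failing case) while `numBlocks(1000)` returns `31` (the working case).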
> I tried manually padding the input tokens but I get the same error any idea as to why?
Manually padding should fix this, though, together with setting the attention mask to 0 for the padded tokens. Let me do some tests.
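As a sketch of the workaround described above, padding the token ids up to a minimum length while masking out the added positions might look like the following. The helper name, the minimum length of 32, and the pad token id of 0 are all assumptions for illustration, not part of transformers.js:

```javascript
// Hypothetical helper (not a transformers.js API): pad token ids up to a
// minimum length so short inputs clear the suspected 32-token block size.
// padTokenId = 0 is an assumption; use your tokenizer's actual pad token id.
function padToMinLength(inputIds, { minLength = 32, padTokenId = 0 } = {}) {
  const paddedIds = inputIds.slice();
  const attentionMask = inputIds.map(() => 1); // real tokens attended to
  while (paddedIds.length < minLength) {
    paddedIds.push(padTokenId); // append pad tokens...
    attentionMask.push(0);      // ...and mask them out of attention
  }
  return { input_ids: paddedIds, attention_mask: attentionMask };
}
```

For example, `padToMinLength([101, 2023, 102])` returns 32-element `input_ids` and `attention_mask` arrays, with the mask set to 1 only for the three real tokens.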
Sounds good! Thank you very much. On the computer/GPU it works fine, possibly because of padding?
Hey @naveengovind, did you manage to find a fix yet? I am facing the exact same issue right now for my use case. @xenova, do you have any further input on the root cause?
@xenova I get a very similar issue when running a long-t5 model with Node.js on my laptop as well.
resolve(__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").run(feeds, fetches, options));
^
Error: Non-zero status code returned while running Reshape node. Name:'/block.0/layer.0/TransientGlobalSelfAttention/Reshape_25' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:26 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape &, onnxruntime::TensorShapeVector &, bool) i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.
Related issue in transformers: https://github.com/huggingface/transformers/issues/18243. cc @fxmarty as I think this is due to an issue with the Optimum exporting process.