Lime text fails in case of multiple sentences separated by punctuation

Open laurasootes opened this issue 1 year ago • 3 comments

Lime text fails when the input contains multiple text parts separated by punctuation, for example on:

"review with???! review with???!"

(It also fails on a sentence like "The movie started great, but the ending is boring and unoriginal", because of the comma in the middle.)

A sentence that ends in punctuation symbol(s) does work; this was covered by the previous fix.

It returns an error: Could not create tensor from given input list

laurasootes avatar Mar 21 '23 12:03 laurasootes

So far I have traced it back to https://github.com/dianna-ai/dianna/blob/main/dianna/utils/onnx_runner.py, where it fails on pred_onnx = sess.run([output_name], onnx_input)[0]

laurasootes avatar Mar 21 '23 14:03 laurasootes

It seems to me that it fails on all punctuation whenever there is no space between the punctuation and the adjacent word.

For example, this fails: The movie started great, but the ending is boring and unoriginal.

This works: The movie started great , but the ending is boring and unoriginal .
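
Until the underlying cause is fixed, the observation above suggests a possible workaround: put spaces around punctuation before the text reaches the explainer. The snippet below is only a sketch under that assumption. The helper name pad_punctuation is hypothetical and not part of dianna, and it has not been checked against the tokenizer or model used in this issue.

```python
import re
import string

# Matches any single ASCII punctuation character (hypothetical helper, not part of dianna).
_PUNCT = re.compile(f"([{re.escape(string.punctuation)}])")


def pad_punctuation(text: str) -> str:
    """Surround every punctuation character with spaces, then collapse repeated whitespace."""
    padded = _PUNCT.sub(r" \1 ", text)
    return " ".join(padded.split())


print(pad_punctuation("The movie started great, but the ending is boring and unoriginal."))
```

Note that this also splits contractions and hyphenated words (for example "don't" becomes "don ' t"), which may change the model's predictions, so it is at best a stopgap rather than a fix.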

stefsmeets avatar Apr 06 '23 09:04 stefsmeets

Behaviour like this was already found by @loostrum in #437, and a fix was implemented in #462. However, maybe that fix did not cover the whole problem?

I noticed earlier that the previously implemented tests did not cover all the cases we have now encountered, so I have already added some additional tests in this branch.

laurasootes avatar Apr 06 '23 09:04 laurasootes