bug/partition_text removes the minus sign

Open IvanLuLyf opened this issue 1 year ago • 4 comments

Describe the bug

The function partition_text removes the minus sign.

To Reproduce

from unstructured.partition.text import partition_text

text = '''
net amount
-4,391,082,054.12

rate is -10%
'''

print(text)

data = partition_text(text=text)

for d in data:
    print(d.text)
    print('-' * 10)

Expected behavior

net amount
-4,391,082,054.12

rate is -10%

net amount
----------
-4,391,082,054.12
----------
rate is
----------
-10%
----------

Screenshots image

IvanLuLyf, Mar 21 '24 08:03

It seems to be because the symbol is recognized as a bullet for an unordered list.
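
A minimal sketch of how this kind of bullet detection could strip a leading minus sign, using only the standard library. The patterns `BULLET_RE` and `STRICT_BULLET_RE` and the helper `strip_bullet` are hypothetical names made up for illustration; they are not taken from unstructured, whose actual pattern may differ.

```python
import re

# Hypothetical bullet pattern (an assumption, not unstructured's own):
# a leading "-", "*" or "•" is treated as a list bullet and removed.
BULLET_RE = re.compile(r"^\s*[-*\u2022]\s*")

# Stricter hypothetical variant: a dash only counts as a bullet
# when at least one whitespace character follows it.
STRICT_BULLET_RE = re.compile(r"^\s*(?:[*\u2022]\s*|-\s+)")


def strip_bullet(line: str, pattern: re.Pattern = BULLET_RE) -> str:
    """Remove one leading bullet marker, as defined by `pattern`."""
    return pattern.sub("", line, count=1)


for line in ["-4,391,082,054.12", "-10%", "- list item"]:
    # Left: permissive pattern, right: stricter pattern.
    print(repr(strip_bullet(line)), repr(strip_bullet(line, STRICT_BULLET_RE)))
```

With the permissive pattern the negative numbers lose their sign, while the stricter variant leaves them intact and still strips a genuine "- " bullet.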

ProgramSalamander, Mar 21 '24 09:03

We'll get this fixed as soon as we're able.

MthwRobinson, May 24 '24 14:05

Is anyone working on fixing this?

longzusuper, Jul 26 '24 22:07