TypeError: ElementMetadata.__init__() got an unexpected keyword argument 'key3'
When I try to load a JSON file (structured as below) with a call to elements = partition(filename=f), I get the error message in the title.
[
{
"key1": "val1",
"key2":
[
"val2"
],
"metadata":
{
"key3": "val3"
}
}
]
After digging a bit into the Unstructured code, I found that the JSON file itself loads fine, but converting the loaded dict to elements fails at the line below. The code treats the 'metadata' key in the input file as metadata about the document, when in fact it holds my use-case-specific metadata, which I'd like to keep as part of the document text. So this looks like a naming conflict. Is there a way to avoid this in the unstructured library?
https://github.com/Unstructured-IO/unstructured/blob/5defe79bf24d503b8ad6ed6de1a69f20c7cec47b/unstructured/staging/base.py#L134
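One possible workaround, until the library handles this case, is to rename the conflicting key before the file reaches partition. The sketch below is only an illustration using the standard library: the helper rename_reserved_key and the replacement name user_metadata are hypothetical and not part of unstructured, and it assumes the conflict is limited to a top-level 'metadata' key in each record.

```python
import json


def rename_reserved_key(records, reserved="metadata", replacement="user_metadata"):
    """Return a copy of records with the top-level `reserved` key renamed.

    Hypothetical helper, not part of unstructured. Nested keys are left alone.
    """
    renamed = []
    for record in records:
        renamed.append(
            {
                (replacement if key == reserved else key): value
                for key, value in record.items()
            }
        )
    return renamed


# Same shape as the JSON file in the report above.
raw = '[{"key1": "val1", "key2": ["val2"], "metadata": {"key3": "val3"}}]'

cleaned = rename_reserved_key(json.loads(raw))
print(json.dumps(cleaned, indent=2))
```

The cleaned records could then be written back to a file and passed to partition as before. Whether the renamed key ends up in the element text depends on how the library handles arbitrary JSON, which this sketch does not address.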
@shiralkarprashant - Could you send the code block to reproduce the error? I gave partition a try with the example dictionary and got [] as the output instead of the error. Either way, [] is probably not what we want here, and we can fall back to processing as text until we have a more sophisticated JSON parser.
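To make the "process as text" idea concrete, here is a rough sketch of what such a fallback could look like on the caller's side. It is not the library's implementation: flatten_to_lines is a hypothetical helper, and the "path: value" line format is just one arbitrary choice. The resulting string could then be handed to a plain-text partitioner (for example partition_text(text=...), assuming that entry point is available in your installed version).

```python
import json


def flatten_to_lines(value, prefix=""):
    """Flatten nested JSON into 'path: value' lines (hypothetical helper)."""
    lines = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            lines.extend(flatten_to_lines(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            lines.extend(flatten_to_lines(child, f"{prefix}[{index}]"))
    else:
        # Scalars (str, int, float, bool, None) become one line each.
        lines.append(f"{prefix}: {value}")
    return lines


# Same shape as the JSON file in the report above.
raw = '[{"key1": "val1", "key2": ["val2"], "metadata": {"key3": "val3"}}]'

text = "\n".join(flatten_to_lines(json.loads(raw)))
print(text)
```

Because the user's 'metadata' block is flattened like any other key, its values stay in the document text instead of being interpreted as element metadata.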