unstructured
Tables to dataframe
How can I convert tables into a dataframe?
This does not seem to be a feature of unstructured itself. However, since unstructured also uses the Table Transformer project, you can refer to Microsoft's Table Transformer project, or to this article: https://medium.com/@lidores98/image-table-to-dataframe-using-python-ocr-773c8afb713d Hope this helps.
If you are working with unstructured output, a Table element has a metadata.text_as_html field which you could read into a pandas dataframe (google "html to pandas dataframe").
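As a rough illustration of the "html to pandas dataframe" step, here is a minimal sketch using pandas.read_html. The sample HTML string and the helper name table_html_to_dataframe are made up for this example; in practice the string would come from a Table element's metadata.text_as_html. Note that pandas.read_html needs an HTML parser installed (lxml, or beautifulsoup4 with html5lib).

```python
from io import StringIO

import pandas as pd

# Hypothetical HTML standing in for element.metadata.text_as_html
text_as_html = """<table>
<thead><tr><th>Item</th><th>Qty</th></tr></thead>
<tbody>
<tr><td>apple</td><td>3</td></tr>
<tr><td>pear</td><td>5</td></tr>
</tbody>
</table>"""


def table_html_to_dataframe(html):
    """Return the first <table> in `html` as a DataFrame, or None if there is no HTML."""
    if not html:
        # text_as_html can be missing (None) on some elements
        return None
    # read_html returns a list of DataFrames, one per <table> found
    tables = pd.read_html(StringIO(html))
    return tables[0]


df = table_html_to_dataframe(text_as_html)
print(df)
```

Wrapping the string in StringIO avoids the deprecation warning that recent pandas versions emit when literal HTML is passed directly.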
I am receiving a KeyError while trying to access text_as_html. Can you please provide the code, or any help with this?
@shriharshan please open a new issue for your problem as it is not strictly related to the original post.
Provide a snippet of how you call the partitioning function and the full stack trace you receive when you get the KeyError. Also, make sure you're using the latest version of unstructured.
An ElementMetadata object will never raise KeyError on accessing element.metadata.text_as_html. However, accessing the dict form of an element could. You'll need to use element["metadata"].get("text_as_html") in that case and properly handle the possible None case.
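To make the dict-form advice concrete, here is a small sketch of the defensive access pattern. The helper name get_table_html and the sample element dicts are hypothetical, written only to mirror the shape described above; they are not taken from the unstructured codebase.

```python
def get_table_html(element):
    """Return an element's table HTML, or None when it has none.

    Handles both the dict form and the object form of an element.
    """
    if isinstance(element, dict):
        # .get() avoids the KeyError that element["metadata"]["text_as_html"] can raise
        return element.get("metadata", {}).get("text_as_html")
    metadata = getattr(element, "metadata", None)
    return getattr(metadata, "text_as_html", None)


# Hypothetical dict-form elements, for illustration only
elements = [
    {"type": "Table", "text": "a", "metadata": {"text_as_html": "<table><tr><td>a</td></tr></table>"}},
    {"type": "NarrativeText", "text": "hello", "metadata": {}},
]

# Keep only elements that actually carry table HTML, skipping the None case
html_tables = [html for html in map(get_table_html, elements) if html is not None]
print(html_tables)
```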
Closing original post as resolved.