feat: add loader for open office odt files

Open MthwRobinson opened this issue 2 years ago • 2 comments

ODF File Loader

Adds a data loader for handling Open Office ODT files. Requires unstructured>=0.6.3.
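
For anyone trying this out locally, the version requirement above can be checked before running the loader. This is a minimal sketch using only the standard library; `parse_release` and `unstructured_is_new_enough` are hypothetical helper names, not part of langchain or unstructured, and the parsing only looks at the numeric release segments.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version stated in this PR description.
MIN_UNSTRUCTURED = (0, 6, 3)


def parse_release(raw: str) -> tuple:
    """Turn '0.6.3' into (0, 6, 3), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in raw.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '0.6' to three segments.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def unstructured_is_new_enough() -> bool:
    """Return True only if unstructured is installed and meets the minimum."""
    try:
        installed = version("unstructured")
    except PackageNotFoundError:
        return False
    return parse_release(installed) >= MIN_UNSTRUCTURED
```

Calling `unstructured_is_new_enough()` before constructing the loader gives a clearer failure than an import error from deep inside the partitioning code.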

Testing

The following should work using the fake.odt example doc from the unstructured repo.

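from langchain.document_loaders import UnstructuredODTLoader

# "elements" mode: one Document per element detected in the file
loader = UnstructuredODTLoader(file_path="fake.odt", mode="elements")
docs = loader.load()

# "single" mode: the whole file returned as a single Document
loader = UnstructuredODTLoader(file_path="fake.odt", mode="single")
docs = loader.load()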
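
To make the difference between the two modes concrete without needing fake.odt, here is a pure-Python sketch of the expected shape of the results. It is an illustration only, not the loader's code: `combine_elements` and the sample strings are hypothetical, and joining elements with a blank line in "single" mode is an assumption about how the unstructured-based loaders behave.

```python
from typing import List


def combine_elements(element_texts: List[str], mode: str) -> List[str]:
    """Mimic how many documents each mode is expected to produce."""
    if mode == "elements":
        # One output per detected element (title, paragraph, ...).
        return list(element_texts)
    if mode == "single":
        # Everything merged into one output; the separator is an assumption.
        return ["\n\n".join(element_texts)]
    raise ValueError(f"unsupported mode: {mode!r}")


# Hypothetical element texts, standing in for a partitioned document.
sample = ["A Title", "First paragraph.", "Second paragraph."]
per_element = combine_elements(sample, "elements")
merged = combine_elements(sample, "single")
```

With the real loader, the same check would be that "elements" returns several documents and "single" returns exactly one.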

MthwRobinson, May 09 '23 15:05

Is it possible to add tests in tests/integration_tests/document_loaders/test_odt.py and an example in the Jupyter notebook?

leo-gan, May 09 '23 16:05

@leo-gan - Sure thing! Just added.

MthwRobinson, May 09 '23 17:05