Container image is missing 'unstructured' pip package
This results in the following errors:
Setting up quick upload event
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
User "admin" already exists
Setting up quick upload event
User-id: None, can see public conversations: False
User-id: 1, can see public conversations: True
len(results)=0, len(file_list)=1
len(results)=0, len(file_list)=1
Overriding with default loaders
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x7f8d1a655e10>
use_quick_index_mode True
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x7f8d1a655e10>
No module named 'unstructured'
Traceback (most recent call last):
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 724, in stream
    file_id, docs = yield from pipeline.stream(
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 586, in stream
    docs = self.loader.load_data(file_path, extra_info=extra_info)
  File "/app/libs/kotaemon/kotaemon/loaders/unstructured_loader.py", line 70, in load_data
    from unstructured.partition.auto import partition
ModuleNotFoundError: No module named 'unstructured'
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning:
The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: None or set allow_custom_value=True.
Thanks for the quick feedback. The default Docker image is missing many of the packages required by the file loaders, including the one you mentioned above. We will provide a more feature-rich Docker image with these dependencies pre-installed in the future (though the image size will increase dramatically).
Same issue for me
After downloading the 0.4.1 release, the same error occurs for me with the "packaged" version. I'm just trying to index a .txt file.
len(results)=1, len(file_list)=1
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x000001C8375C7AC0>
No module named 'unstructured'
Traceback (most recent call last):
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 724, in stream
    file_id, docs = yield from pipeline.stream(
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 586, in stream
    docs = self.loader.load_data(file_path, extra_info=extra_info)
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\kotaemon\loaders\unstructured_loader.py", line 70, in load_data
    from unstructured.partition.auto import partition
ModuleNotFoundError: No module named 'unstructured'
len(results)=1, len(file_list)=1
pip install unstructured
solved it for me. This package dependency should be listed in pyproject.toml.
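To confirm whether the package is visible to the interpreter that runs the app (the Docker container or the packaged env), a quick check like the one below may help. It is only a sketch: `missing_modules` is a hypothetical helper, not part of kotaemon, and it only handles top-level module names such as `unstructured`, the module named in the traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "unstructured" is the module the loader failed to import.
    absent = missing_modules(["unstructured"])
    if absent:
        print("Missing packages, try: pip install " + " ".join(absent))
    else:
        print("All checked packages are importable.")
```

Run it with the same Python that launches the app. Installing into a different environment will not make the error go away.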
Resolved in the new version. Please check the latest full Docker image.