Container image is missing 'unstructured' pip package

Open sammcj opened this issue 1 year ago • 4 comments

Resulting in these errors:

Setting up quick upload event
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
User "admin" already exists
Setting up quick upload event
User-id: None, can see public conversations: False
User-id: 1, can see public conversations: True
len(results)=0, len(file_list)=1
len(results)=0, len(file_list)=1
Overriding with default loaders
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x7f8d1a655e10>
use_quick_index_mode True
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x7f8d1a655e10>
No module named 'unstructured'
Traceback (most recent call last):
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 724, in stream
    file_id, docs = yield from pipeline.stream(
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 586, in stream
    docs = self.loader.load_data(file_path, extra_info=extra_info)
  File "/app/libs/kotaemon/kotaemon/loaders/unstructured_loader.py", line 70, in load_data
    from unstructured.partition.auto import partition
ModuleNotFoundError: No module named 'unstructured'
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning:

The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: None or set allow_custom_value=True.

sammcj avatar Aug 27 '24 04:08 sammcj

Thanks for the quick feedback. The default Docker image is missing many of the packages that the file loaders require, such as the one you mentioned above. We will provide a more feature-rich Docker image in the future with these dependencies pre-installed (though the image size will increase dramatically).
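
The traceback above shows the import happening lazily inside `load_data`, so the failure only appears when a file is indexed. As a rough sketch of how such an optional import could be wrapped to produce a more actionable message, something like the following might work. The helper name `require_module` is hypothetical and is not part of kotaemon; the install hint assumes the pip package name matches the module name unless one is passed explicitly.

```python
import importlib


def require_module(module_name, pip_name=None):
    """Import an optional module, or raise ImportError with an install hint.

    Hypothetical helper for illustration only; not kotaemon's actual code.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Assumption: the pip package name equals the module name
        # unless the caller provides a different one.
        hint = pip_name or module_name
        raise ImportError(
            f"Optional dependency '{module_name}' is not installed; "
            f"try `pip install {hint}`."
        ) from exc
```

A loader could then call `require_module("unstructured.partition.auto")` at the point where it currently imports `partition`, passing `pip_name="unstructured"` so the hint names the right package.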

taprosoft avatar Aug 27 '24 04:08 taprosoft

Same issue for me

drdsgvo avatar Aug 28 '24 16:08 drdsgvo

After downloading the 0.4.1 release, the same error happens for me using the "packaged" version. I'm just trying to index a .txt file.

len(results)=1, len(file_list)=1
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.unstructured_loader.UnstructuredReader object at 0x000001C8375C7AC0>
No module named 'unstructured'
Traceback (most recent call last):
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 724, in stream
    file_id, docs = yield from pipeline.stream(
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\ktem\index\file\pipelines.py", line 586, in stream
    docs = self.loader.load_data(file_path, extra_info=extra_info)
  File "D:\software\kotaemon-app\install_dir\env\lib\site-packages\kotaemon\loaders\unstructured_loader.py", line 70, in load_data
    from unstructured.partition.auto import partition
ModuleNotFoundError: No module named 'unstructured'
len(results)=1, len(file_list)=1

sliterok avatar Aug 31 '24 07:08 sliterok

pip install unstructured

solved it for me. This package dependency should be declared in the pyproject.toml.
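
For anyone hitting the same traceback, one way to confirm that the package is missing from the environment that actually runs kotaemon (the container, or the packaged install's own `env`) is to probe for it without importing it. This is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration and is not part of kotaemon.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'unstructured' is the module named in the traceback above.
    print(missing_packages(["unstructured"]))
```

Run it with the same Python interpreter that launches the app. An empty list means the module is visible to that interpreter; if `unstructured` is listed, running `pip install unstructured` in that same environment is the fix described above.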

alew3 avatar Aug 31 '24 21:08 alew3

Resolved in the new version. Please try the latest full Docker image.

cin-niko avatar Sep 13 '24 08:09 cin-niko