[BUG] bug pylance reappears again in new docker container

Open bennoloeffler opened this issue 8 months ago • 7 comments

Description

Uploading a file to FileIndex or LightRAGIndex produces this error:

❌ | FAQ Costumer Service.pdf: The lance library is required to use this function. Please install with pip install pylance.

It shows up in: Upload result

Reproduction steps

Deleted all Docker containers.
Pulled the new container image.
Ran it.

docker rmi ghcr.io/cinnamon/kotaemon:main-ollama
docker system prune -a
docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-e USE_LIGHTRAG=true \
-e USE_MS_GRAPHRAG=false \
-e USE_NANO_GRAPHRAG=false \
-e OPENAI_API_KEY=sk-proj... \
-e LOCAL_MODEL=qwen2.5:7b \
-e KH_OLLAMA_URL=http://host.docker.internal:11434/v1/ \
-e COHERE_API_KEY=O... \
-v ./ktem_app_data:/app/ktem_app_data \
-p 7860:7860 -it --rm \
ghcr.io/cinnamon/kotaemon:main-ollama

Logs

❯ docker rmi ghcr.io/cinnamon/kotaemon:main-ollama

Untagged: ghcr.io/cinnamon/kotaemon:main-ollama
Untagged: ghcr.io/cinnamon/kotaemon@sha256:4690912fcbce977865cd251c5857b2eed84f0e3a9d973235e4712c534b2de84e
Deleted: sha256:961da42df38794f15cc0af31d724c5be7ea9a87904cb606bdfdb2902123a2ad8
Deleted: sha256:cae0cf7c585e4ae408d0b1580fbaaf77c3bb71940d4f44d7bd7f577a61fbeb84
Deleted: sha256:b96da9dfd1e1e1b1dd2ffd189a64e375a8a7e6dfac231abb9c0a6a02388fecbf
Deleted: sha256:f60f4003ee16a3b77bdfe60f24a029113d44b9ba8863d229f040f436a7fb20e7
Deleted: sha256:90d99af3057ba9de70ee0c637e6a3f19eddb22e5a17f5037025cf39b6a9d9e85
Deleted: sha256:735782a66dcee56296471438f9bba8d93d96a11730bd09a12ec11518748473ad
Deleted: sha256:427e5b048deac30283c481f9561a956f897e6988e9e4c5abd770b6089e5f7d4c
Deleted: sha256:9a0eaef5624836d4e27cde727b7b6ce8dbbb2524af0e09a88f2dfd5da8069826
Deleted: sha256:3cb223d738c0f5320dde685dfc0675ddcc3e694cb217ad59abfc7a8346099b96
Deleted: sha256:32b62e521151610fec394a5452483e763da082f703a43b55f80829e90bfe747b
Deleted: sha256:6a69ea0923a09202c3fb2e13170b6c541de57a72bd0dbd62a5b05ef7197a3f12


❯ docker system prune -a

WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all images without at least one container associated to them
  - all build cache

Are you sure you want to continue? [y/N] y
Total reclaimed space: 0B


❯ docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-e USE_LIGHTRAG=true \
-e USE_MS_GRAPHRAG=false \
-e USE_NANO_GRAPHRAG=false \
-e OPENAI_API_KEY=sk-proj... \
-e LOCAL_MODEL=qwen2.5:7b \
-e KH_OLLAMA_URL=http://host.docker.internal:11434/v1/ \
-e COHERE_API_KEY=O... \
-v ./ktem_app_data:/app/ktem_app_data \
-p 7860:7860 -it --rm \
ghcr.io/cinnamon/kotaemon:main-ollama
Unable to find image 'ghcr.io/cinnamon/kotaemon:main-ollama' locally
main-ollama: Pulling from cinnamon/kotaemon
d9b636547744: Already exists 
3f529d1f5c64: Already exists 
b18538cbce6f: Already exists 
e3ddea9b7a6f: Already exists 
418dc5b8efe0: Already exists 
911e878595ae: Already exists 
3b623a020b62: Already exists 
126e90fc9556: Already exists 
32437e8bff1b: Already exists 
b37b46190fa9: Already exists 
4f4fb700ef54: Already exists 
c1f160e5f049: Already exists 
8c84ec1c9b2c: Already exists 
c01c3429fa34: Pull complete 
2f083a945858: Pull complete 
a9b4bc91a8a6: Pull complete 
be5d862fb8bb: Pull complete 
9abccc300999: Pull complete 
03f5d8e8d14b: Pull complete 
ee2f2af99441: Pull complete 
235a13e6e02f: Pull complete 
095ffa56f0c0: Pull complete 
b876f1cc4ea5: Pull complete 
Digest: sha256:4690912fcbce977865cd251c5857b2eed84f0e3a9d973235e4712c534b2de84e
Status: Downloaded newer image for ghcr.io/cinnamon/kotaemon:main-ollama
2025/04/03 03:52:45 routes.go:1230: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-04-03T03:52:45.421Z level=INFO source=images.go:432 msg="total blobs: 4"
time=2025-04-03T03:52:45.421Z level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-04-03T03:52:45.422Z level=INFO source=routes.go:1297 msg="Listening on 127.0.0.1:11434 (version 0.6.3)"
time=2025-04-03T03:52:45.422Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-03T03:52:45.430Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-04-03T03:52:45.430Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="15.6 GiB" available="14.5 GiB"
[nltk_data] Downloading package punkt_tab to
[nltk_data]     /usr/local/lib/python3.10/site-
[nltk_data]     packages/llama_index/core/_static/nltk_cache...
[nltk_data]   Package punkt_tab is already up-to-date!
GraphRAG dependencies not installed. Try `pip install graphrag future` to install. GraphRAG retriever pipeline will not work properly.
Nano-GraphRAG dependencies not installed. Try `pip install nano-graphrag` to install. Nano-GraphRAG retriever pipeline will not work properly.
INFO:chromadb.telemetry.product.posthog:Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.
INFO:kotaemon:posthog.capture called with args: ('c74792bc-76de-4bf4-ac1a-18af0eb0bb1b', 'ClientStartEvent', {'batch_size': 1, 'in_colab': False, '$process_person_profile': False, 'chroma_version': '0.5.16', 'server_context': 'None', 'hosted': False, 'chroma_api_impl': 'chromadb.api.segment.SegmentAPI', 'is_persistent': False, 'chroma_server_ssl_enabled': False, 'chroma_server_api_default_path': <APIVersion.V2: '/api/v2'>}), kwargs: {}
INFO:kotaemon:posthog.capture called with args: ('c74792bc-76de-4bf4-ac1a-18af0eb0bb1b', 'ClientCreateCollectionEvent', {'batch_size': 1, 'collection_uuid': 'a583d5ff-7f90-4c15-bc18-5d89b7e6ff50', '$process_person_profile': False, 'chroma_version': '0.5.16', 'server_context': 'None', 'hosted': False, 'chroma_api_impl': 'chromadb.api.segment.SegmentAPI', 'is_persistent': False, 'chroma_server_ssl_enabled': False, 'chroma_server_api_default_path': <APIVersion.V2: '/api/v2'>}), kwargs: {}
INFO:kotaemon:posthog.capture called with args: ('c74792bc-76de-4bf4-ac1a-18af0eb0bb1b', 'ClientStartEvent', {'batch_size': 1, 'in_colab': False, '$process_person_profile': False, 'chroma_version': '0.5.16', 'server_context': 'None', 'hosted': False, 'chroma_api_impl': 'chromadb.api.segment.SegmentAPI', 'is_persistent': False, 'chroma_server_ssl_enabled': False, 'chroma_server_api_default_path': <APIVersion.V2: '/api/v2'>}), kwargs: {}
INFO:kotaemon:posthog.capture called with args: ('c74792bc-76de-4bf4-ac1a-18af0eb0bb1b', 'ClientCreateCollectionEvent', {'batch_size': 1, 'collection_uuid': 'a30b48a5-5215-4eab-92a9-446e94c7226a', '$process_person_profile': False, 'chroma_version': '0.5.16', 'server_context': 'None', 'hosted': False, 'chroma_api_impl': 'chromadb.api.segment.SegmentAPI', 'is_persistent': False, 'chroma_server_ssl_enabled': False, 'chroma_server_api_default_path': <APIVersion.V2: '/api/v2'>}), kwargs: {}
INFO:matplotlib.font_manager:generated new fontManager
User "admin" already exists
Setting up quick upload event
Running on local URL:  http://0.0.0.0:7860
INFO:httpx:HTTP Request: GET http://localhost:7860/startup-events "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"

To create a public link, set `share=True` in `launch()`.
User-id: None, can see public conversations: False
User-id: c7a096f435be44d2881fddb1319dba8f, can see public conversations: True
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1923, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1508, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 818, in wrapper
    response = f(*args, **kwargs)
  File "/app/libs/ktem/ktem/index/file/ui.py", line 475, in delete_event
    self._index._docstore.delete(ds_ids)
  File "/app/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 133, in delete
    document_collection.delete(query_filter)
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 2233, in delete
    LOOP.run(self._table.delete(where))
  File "/usr/local/lib/python3.10/site-packages/lancedb/background_loop.py", line 25, in run
    return asyncio.run_coroutine_threadsafe(future, self.loop).result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 3448, in delete
    return await self._inner.delete(where)
RuntimeError: lance error: LanceError(IO): sql parser error: Expected: an expression, found: ), /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/lance-datafusion-0.25.0/src/sql.rs:100:8
use_quick_index_mode False
reader_mode default
Chunk size: None, chunk overlap: None
Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0xfffed99fc100>
/usr/local/lib/python3.10/site-packages/pypdf/_crypt_providers/_cryptography.py:32: CryptographyDeprecationWarning:

ARC4 has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.ARC4 and will be removed from cryptography.hazmat.primitives.ciphers.algorithms in 48.0.0.

Page numbers: 4
Got 4 page thumbnails
Adding documents to doc store
ERROR:ktem.index.file.pipelines:The lance library is required to use this function. Please install with `pip install pylance`.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1450, in to_lance
    import lance
ModuleNotFoundError: No module named 'lance'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 809, in stream
    file_id, docs = yield from pipeline.stream(
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 649, in stream
    yield from self.handle_docs(docs, file_id, file_name)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 391, in handle_docs
    self.handle_chunks_docstore(chunks, file_id)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 427, in handle_chunks_docstore
    self.vector_indexing.add_to_docstore(chunks)
  File "/app/libs/kotaemon/kotaemon/indices/vectorindex.py", line 82, in add_to_docstore
    self.doc_store.add(docs)
  File "/app/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 56, in add
    document_collection.create_fts_index(
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1819, in create_fts_index
    populate_index(
  File "/usr/local/lib/python3.10/site-packages/lancedb/fts.py", line 109, in populate_index
    dataset = table.to_lance()
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1452, in to_lance
    raise ImportError(
ImportError: The lance library is required to use this function. Please install with `pip install pylance`.
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
GraphRAG embedding dim 3072
Indexing GraphRAG with LLM ChatOpenAI(api_key=sk-proj-d8vsz23..., base_url=https://api.ope..., frequency_penalty=None, logit_bias=None, logprobs=None, max_retries=None, max_retries_=2, max_tokens=None, model=gpt-4o-mini, n=1, organization=None, presence_penalty=None, stop=None, temperature=0, timeout=20, tool_choice=None, tools=None, top_logprobs=None, top_p=None) and Embedding OpenAIEmbeddings(api_key=sk-proj-d8vsz23..., base_url=https://api.ope..., context_length=8191, dimensions=None, max_retries=None, max_retries_=2, model=text-embedding-..., organization=None, timeout=10)...
INFO:lightrag:Logger initialized for working directory: /app/ktem_app_data/user_data/files/lightrag/35729695-9035-41e9-9108-f57129d39311/input
DEBUG:lightrag:LightRAG init with param:
  working_dir = /app/ktem_app_data/user_data/files/lightrag/35729695-9035-41e9-9108-f57129d39311/input,
  chunk_token_size = 1200,
  chunk_overlap_token_size = 100,
  tiktoken_model_name = gpt-4o-mini,
  entity_extract_max_gleaning = 1,
  entity_summary_to_max_tokens = 500,
  node_embedding_algorithm = node2vec,
  node2vec_params = {'dimensions': 1536, 'num_walks': 10, 'walk_length': 40, 'window_size': 2, 'iterations': 3, 'random_seed': 3},
  embedding_func = {'embedding_dim': 3072, 'max_token_size': 8192, 'func': <function get_embedding_func.<locals>.embedding_func at 0xffff5a07c430>},
  embedding_batch_num = 32,
  embedding_func_max_async = 16,
  llm_model_func = <function get_llm_func.<locals>.llm_func at 0xffff5a07c4c0>,
  llm_model_name = meta-llama/Llama-3.2-1B-Instruct,
  llm_model_max_token_size = 32768,
  llm_model_max_async = 16,
  key_string_value_json_storage_cls = <class 'lightrag.storage.JsonKVStorage'>,
  vector_db_storage_cls = <class 'lightrag.storage.NanoVectorDBStorage'>,
  vector_db_storage_cls_kwargs = {},
  graph_storage_cls = <class 'lightrag.storage.NetworkXStorage'>,
  enable_llm_cache = True,
  addon_params = {},
  convert_response_to_json_func = <function convert_response_to_json at 0xffff04598790>

INFO:lightrag:Load KV full_docs with 0 data
INFO:lightrag:Load KV text_chunks with 0 data
INFO:lightrag:Load KV llm_response_cache with 0 data
INFO:nano-vectordb:Init {'embedding_dim': 3072, 'metric': 'cosine', 'storage_file': '/app/ktem_app_data/user_data/files/lightrag/35729695-9035-41e9-9108-f57129d39311/input/vdb_entities.json'} 0 data
INFO:nano-vectordb:Init {'embedding_dim': 3072, 'metric': 'cosine', 'storage_file': '/app/ktem_app_data/user_data/files/lightrag/35729695-9035-41e9-9108-f57129d39311/input/vdb_relationships.json'} 0 data
INFO:nano-vectordb:Init {'embedding_dim': 3072, 'metric': 'cosine', 'storage_file': '/app/ktem_app_data/user_data/files/lightrag/35729695-9035-41e9-9108-f57129d39311/input/vdb_chunks.json'} 0 data

Browsers

Chrome

OS

No response

Additional information

Maybe I am doing something wrong, because this part of the output confuses me:

d9b636547744: Already exists
3f529d1f5c64: Already exists
b18538cbce6f: Already exists
e3ddea9b7a6f: Already exists
418dc5b8efe0: Already exists
911e878595ae: Already exists
3b623a020b62: Already exists
126e90fc9556: Already exists
32437e8bff1b: Already exists
b37b46190fa9: Already exists
4f4fb700ef54: Already exists
c1f160e5f049: Already exists
8c84ec1c9b2c: Already exists
c01c3429fa34: Pull complete
2f083a945858: Pull complete
a9b4bc91a8a6: Pull complete
be5d862fb8bb: Pull complete
9abccc300999: Pull complete
03f5d8e8d14b: Pull complete
ee2f2af99441: Pull complete
235a13e6e02f: Pull complete
095ffa56f0c0: Pull complete
b876f1cc4ea5: Pull complete
Digest: sha256:4690912fcbce977865cd251c5857b2eed84f0e3a9d973235e4712c534b2de84e
Status: Downloaded newer image for ghcr.io/cinnamon/kotaemon:main-ollama
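In `docker pull` output, "Already exists" marks a layer that is already present in the local layer store, while "Pull complete" marks a layer that was downloaded. A minimal sketch for counting both kinds in captured output follows; `summarize_pull` is a hypothetical helper written for this thread, not a Docker or kotaemon API.

```python
def summarize_pull(output):
    """Count layer lines in `docker pull` output by their status text."""
    counts = {"Already exists": 0, "Pull complete": 0}
    for line in output.splitlines():
        # Layer lines look like "<layer id>: <status>"; other lines are ignored.
        _layer_id, sep, status = line.partition(": ")
        status = status.strip()
        if sep and status in counts:
            counts[status] += 1
    return counts


# Short sample in the same shape as the pull output quoted above.
sample = (
    "d9b636547744: Already exists \n"
    "c01c3429fa34: Pull complete \n"
    "Digest: sha256:4690912fcbce977865cd251c5857b2eed84f0e3a9d973235e4712c534b2de84e\n"
)
print(summarize_pull(sample))
```

Applied to the full pull output from this report, the count would be 13 reused layers and 10 downloaded ones, which suggests that some other local image still referenced those 13 layers.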

bennoloeffler avatar Apr 03 '25 03:04 bennoloeffler

@bennoloeffler You likely have to run `docker rmi ghcr.io/cinnamon/kotaemon:main-full` as well, since Docker is reusing cached layers from the full image when it pulls the ollama image.

taprosoft avatar Apr 03 '25 05:04 taprosoft

> @bennoloeffler likely you have to docker rmi ghcr.io/cinnamon/kotaemon:main-full as well since it is using cached layers from full image to pull ollama image.

I did:

docker rm -f $(docker ps -aq)       # Remove all containers
docker rmi -f $(docker images -aq)  # Remove all images

I checked with:

docker ps -aq
docker images

Then docker run ...

This time there were no messages like "d9b636547744: Already exists", so all cached layers were gone before the new run.

BUT the error appeared again :-( Any further hints on what to investigate or try?

bennoloeffler avatar Apr 03 '25 20:04 bennoloeffler

https://cinnamon.github.io/kotaemon/online_install/ You can try this HF Space method, which also uses the latest Docker image for its build, and confirm whether it works there.

taprosoft avatar Apr 03 '25 23:04 taprosoft

I tried on Linux, x86: WORKS.
I tried your suggestion, https://huggingface.co/spaces/bennoloeffler/kotaemon_template: WORKS.
On my Mac, Apple silicon M1: FAILS, even after deleting everything in Docker.

So I assume the problem is platform-dependent, because I had a friend watch me purge everything so that I would not forget or miss anything.
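To check the platform hypothesis directly, a small script could be run inside both a working and a failing container. This is only a sketch: `lance_report` is a hypothetical helper, not part of kotaemon or LanceDB, and it assumes the import name `lance` shown in the traceback.

```python
import importlib.util
import platform


def lance_report(modules=("lance", "lancedb")):
    """Return the CPU architecture and, per module, whether it is importable."""
    report = {
        "machine": platform.machine(),
        "python": platform.python_version(),
    }
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in lance_report().items():
        print(f"{key}: {value}")
```

On an x86 Linux host `machine` would be expected to read `x86_64`, and inside a container on an M1 Mac an ARM value such as `aarch64`. If `lance` is `False` only in the ARM case, that would support the platform-dependent explanation.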

bennoloeffler avatar Apr 05 '25 12:04 bennoloeffler

This happened the first time I installed the container on my system (macOS with M1).

So it seems platform-dependent to me as well.

Here's my cmd output:

use_quick_index_mode False
reader_mode default
Chunk size: 10000, chunk overlap: 100
Using reader TxtReader()
Got 0 page thumbnails
Adding documents to doc store
ERROR:ktem.index.file.pipelines:The lance library is required to use this function. Please install with `pip install pylance`.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1450, in to_lance
    import lance
ModuleNotFoundError: No module named 'lance'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 809, in stream
    file_id, docs = yield from pipeline.stream(
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 649, in stream
    yield from self.handle_docs(docs, file_id, file_name)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 391, in handle_docs
    self.handle_chunks_docstore(chunks, file_id)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 427, in handle_chunks_docstore
    self.vector_indexing.add_to_docstore(chunks)
  File "/app/libs/kotaemon/kotaemon/indices/vectorindex.py", line 82, in add_to_docstore
    self.doc_store.add(docs)
  File "/app/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 56, in add
    document_collection.create_fts_index(
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1819, in create_fts_index
    populate_index(
  File "/usr/local/lib/python3.10/site-packages/lancedb/fts.py", line 109, in populate_index
    dataset = table.to_lance()
  File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1452, in to_lance
    raise ImportError(
ImportError: The lance library is required to use this function. Please install with `pip install pylance`.

philipsometimescodes avatar Apr 07 '25 09:04 philipsometimescodes

Same for me as for @philipsometimescodes:

2025-04-10 20:12:11 rag-1  | Got 0 page thumbnails
2025-04-10 20:12:11 rag-1  | Adding documents to doc store
2025-04-10 20:12:11 rag-1  | [2025-04-10T18:12:11Z WARN  lance::dataset::write::insert] No existing dataset at /app/ktem_app_data/user_data/docstore/index_1.lance, it will be created
2025-04-10 20:12:11 rag-1  | ERROR:ktem.index.file.pipelines:The lance library is required to use this function. Please install with `pip install pylance`.
2025-04-10 20:12:11 rag-1  | Traceback (most recent call last):
2025-04-10 20:12:11 rag-1  |   File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1450, in to_lance
2025-04-10 20:12:11 rag-1  |     import lance
2025-04-10 20:12:11 rag-1  | ModuleNotFoundError: No module named 'lance'
2025-04-10 20:12:11 rag-1  | 
2025-04-10 20:12:11 rag-1  | During handling of the above exception, another exception occurred:
2025-04-10 20:12:11 rag-1  | 
2025-04-10 20:12:11 rag-1  | Traceback (most recent call last):
2025-04-10 20:12:11 rag-1  |   File "/app/libs/ktem/ktem/index/file/pipelines.py", line 809, in stream
2025-04-10 20:12:11 rag-1  |     file_id, docs = yield from pipeline.stream(
2025-04-10 20:12:11 rag-1  |   File "/app/libs/ktem/ktem/index/file/pipelines.py", line 649, in stream
2025-04-10 20:12:11 rag-1  |     yield from self.handle_docs(docs, file_id, file_name)
2025-04-10 20:12:11 rag-1  |   File "/app/libs/ktem/ktem/index/file/pipelines.py", line 391, in handle_docs
2025-04-10 20:12:11 rag-1  |     self.handle_chunks_docstore(chunks, file_id)
2025-04-10 20:12:11 rag-1  |   File "/app/libs/ktem/ktem/index/file/pipelines.py", line 427, in handle_chunks_docstore
2025-04-10 20:12:11 rag-1  |     self.vector_indexing.add_to_docstore(chunks)
2025-04-10 20:12:11 rag-1  |   File "/app/libs/kotaemon/kotaemon/indices/vectorindex.py", line 82, in add_to_docstore
2025-04-10 20:12:11 rag-1  |     self.doc_store.add(docs)
2025-04-10 20:12:11 rag-1  |   File "/app/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 56, in add
2025-04-10 20:12:11 rag-1  |     document_collection.create_fts_index(
2025-04-10 20:12:11 rag-1  |   File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1819, in create_fts_index
2025-04-10 20:12:11 rag-1  |     populate_index(
2025-04-10 20:12:11 rag-1  |   File "/usr/local/lib/python3.10/site-packages/lancedb/fts.py", line 109, in populate_index
2025-04-10 20:12:11 rag-1  |     dataset = table.to_lance()
2025-04-10 20:12:11 rag-1  |   File "/usr/local/lib/python3.10/site-packages/lancedb/table.py", line 1452, in to_lance
2025-04-10 20:12:11 rag-1  |     raise ImportError(
2025-04-10 20:12:11 rag-1  | ImportError: The lance library is required to use this function. Please install with `pip install pylance`.

Installing pylance in the Docker container using `pip install pylance` resolves this issue. Is it possible that not all requirements are installed in the image?
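The workaround above can be sketched as a check-then-install step run with the container's own Python interpreter. The helper names `pip_install_command` and `ensure_module` are hypothetical and not part of kotaemon; the only facts taken from the logs are that the import name is `lance` and the package to install is `pylance`.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_module(module, package, run=subprocess.check_call):
    """Install `package` if `module` is not importable.

    Returns True if an install was attempted, False if nothing was needed.
    `run` is injectable so the logic can be exercised without calling pip.
    """
    if importlib.util.find_spec(module) is not None:
        return False
    run(pip_install_command(package))
    # Make a freshly installed package visible to the import system.
    importlib.invalidate_caches()
    return True


# Usage inside the container (assumption: pip and network access are available):
#     ensure_module("lance", "pylance")
```

Because the container in the reproduction steps is started with `--rm`, anything installed this way is gone once the container is removed, so this is a workaround rather than a fix for the image.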

kimamil avatar Apr 10 '25 18:04 kimamil

Same for me on macOS with M1.

rafo avatar Apr 29 '25 10:04 rafo