[BUG] Documents persist in user_data after UI deletion

Open Lee-Ju-Yeong opened this issue 1 year ago • 3 comments

Description

When a document is deleted through the Kotaemon UI, it disappears from the interface but persists on the file system under the user_data directory. Specifically, the document data remains in the LanceDB data directory (/ktem_app_data/user_data/docstore/index_1.lance/data). This creates a discrepancy between the UI state and the actual data storage, which can lead to storage bloat and inconsistent state.

Reproduction steps

Start Kotaemon application

# optional (setup env)
conda create -n kotaemon python=3.10
conda activate kotaemon

pip install -e "libs/kotaemon[all]"
pip install -e "libs/ktem"

python app.py

Upload any document through the UI
Delete the document using the UI delete function
Check the contents of /ktem_app_data/user_data/docstore/index_1.lance/data
Observe that the document data still exists in the directory despite being deleted from the UI
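
The check in the last two steps can be made repeatable with a small script. This is only a sketch using the Python standard library: the helper names `snapshot_dir` and `leftover_files` are made up for illustration and are not part of Kotaemon or LanceDB. The idea is to take one snapshot of the data directory before uploading and another after deleting through the UI, then list the files that appeared in between and are still on disk.

```python
from pathlib import Path

# Path taken from this report; adjust it to your own installation.
DATA_DIR = "/ktem_app_data/user_data/docstore/index_1.lance/data"


def snapshot_dir(root: str) -> dict[str, int]:
    """Map every file under `root` (path relative to `root`) to its size in bytes.

    Returns an empty dict if the directory does not exist yet.
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    return {
        str(p.relative_to(base)): p.stat().st_size
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def leftover_files(baseline: dict[str, int], after_delete: dict[str, int]) -> list[str]:
    """Files that were absent from the baseline but still exist after deletion."""
    return sorted(name for name in after_delete if name not in baseline)


# Hypothetical usage, run in an interactive session:
#   baseline = snapshot_dir(DATA_DIR)       # before uploading the document
#   ... upload the document, then delete it through the UI ...
#   after_delete = snapshot_dir(DATA_DIR)   # after the UI deletion
#   print(leftover_files(baseline, after_delete))
```

If the bug reproduces, the printed list should be non-empty. An empty list would mean the files written during upload were removed from disk.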

Logs

No logs and no error messages.

Browsers

Chrome

OS

macOS

Additional information

When a document is deleted through the UI, it should be completely removed from both the UI and the underlying storage system. The /ktem_app_data/user_data/docstore/index_1.lance/data directory should be properly cleaned up.

Lee-Ju-Yeong, Nov 15 '24 05:11

I have the same bug! Has it already been fixed?

maruyamayasuaki, Mar 29 '25 08:03

Related: https://github.com/Cinnamon/kotaemon/issues/493. A fix is planned for the next release.

taprosoft, Mar 31 '25 08:03

Does this affect Qdrant/Milvus too?

gilbrotheraway, Mar 31 '25 19:03