onyx
General issue: deleted pages remain indexed (tested with the Confluence integration). Pages deleted in Confluence are still indexed, even after a full re-index
Hi,
Behavior:
- If a page is deleted in Confluence, the related indexed document is still available after the automatic re-index (10 minutes).
- If a page is deleted in Confluence, the related indexed document is still available after triggering a manual re-index (update only).
- If a page is deleted in Confluence, the related indexed document is still available after triggering a manual re-index (full index).
Expected behavior:
If a page is deleted in Confluence, any update of the connector (automatic, manual update, or manual full index) should delete the corresponding indexed document.
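To make the expected behavior concrete, here is a minimal sketch of the pruning step I would expect a re-index to perform. It is not Danswer's actual code: `find_stale_ids`, `prune_index`, and the in-memory `index` dict are hypothetical stand-ins for the real connector and document store.

```python
from __future__ import annotations


def find_stale_ids(indexed_ids: set[str], source_ids: set[str]) -> set[str]:
    """Return IDs that are indexed but no longer exist in the source."""
    return indexed_ids - source_ids


def prune_index(index: dict[str, str], source_ids: set[str]) -> list[str]:
    """Delete stale documents from a hypothetical in-memory index.

    `index` maps document ID to content; `source_ids` is the set of page IDs
    the source (e.g. Confluence) currently reports. Returns the removed IDs.
    """
    stale = sorted(find_stale_ids(set(index), source_ids))
    for doc_id in stale:
        del index[doc_id]
    return stale


if __name__ == "__main__":
    index = {"page-1": "intro", "page-2": "old runbook", "page-3": "faq"}
    # "page-2" was deleted in the source, so it should be pruned.
    removed = prune_index(index, source_ids={"page-1", "page-3"})
    print(removed, sorted(index))
```

The point is only that every re-index mode should end with the indexed set being a subset of what the source still contains.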
Additional information: the document is deleted from the Vespa index, but the chat and document explorer responses still show the deleted content. I think this is a general issue, not specific to any integration. I still see entries in Postgres. Maybe an update is needed to clean up Postgres as well? https://github.com/danswer-ai/danswer/issues/938 https://github.com/danswer-ai/danswer/pull/1086
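One way to confirm this kind of mismatch is to compare the document IDs held by the two stores. The sketch below assumes you have already exported the ID lists yourself (for example, from a Vespa query and a Postgres query); `compare_stores` and `StoreDrift` are hypothetical names, and nothing here reflects Danswer's real schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StoreDrift:
    """IDs present in one store but missing from the other."""

    only_in_metadata: list[str]
    only_in_vector_index: list[str]


def compare_stores(
    vector_index_ids: Iterable[str], metadata_ids: Iterable[str]
) -> StoreDrift:
    """Report drift between a vector index and a metadata store.

    Both arguments are plain collections of document IDs exported by the
    caller; this function does not talk to Vespa or Postgres.
    """
    vector_set = set(vector_index_ids)
    metadata_set = set(metadata_ids)
    return StoreDrift(
        only_in_metadata=sorted(metadata_set - vector_set),
        only_in_vector_index=sorted(vector_set - metadata_set),
    )


if __name__ == "__main__":
    # "page-2" was removed from the vector index but its metadata row remains.
    drift = compare_stores(
        vector_index_ids=["page-1", "page-3"],
        metadata_ids=["page-1", "page-2", "page-3"],
    )
    print(drift.only_in_metadata)
```

Any ID listed under `only_in_metadata` would match what I am seeing: gone from the index, but still known to the application.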
Additional information: even after the connector is deleted, the chat, search, and Slack bot can still answer based on documents indexed by the deleted connector. IMO this is a big issue: even when we improve or clean up our source documentation, Danswer doesn't reflect it. @Weves @yuhongsun96
@mboret is there a way for us to reproduce this? That definitely is not intended. Once a connector is deleted, none of its documents should be searchable.