
General issue: deleted pages are still indexed (tested with the Confluence integration, even after a full re-index)

Open vchaindz opened this issue 11 months ago • 3 comments

Hi,

Behavior:

  • if a page gets deleted in Confluence, the related indexed document is still available after the automatic re-index (10 minutes)
  • if a page gets deleted in Confluence, the related indexed document is still available after triggering a manual re-index (update only)
  • if a page gets deleted in Confluence, the related indexed document is still available after triggering a manual re-index (full index)

Expected behavior:

If a page gets deleted in Confluence, any update on the connector (automatic, manual update, or manual full index) should delete the corresponding indexed document.
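To make the expected behavior concrete, here is a minimal sketch of the reconciliation step a re-index would need: compare the IDs currently in the index against the IDs the source still reports, and treat the difference as stale. The function name `find_stale_document_ids` and the page IDs are hypothetical and used only for illustration; this is not Danswer's actual connector code.

```python
from typing import Iterable, Set


def find_stale_document_ids(
    indexed_ids: Iterable[str], source_ids: Iterable[str]
) -> Set[str]:
    """Return IDs that are still indexed but no longer exist in the source."""
    return set(indexed_ids) - set(source_ids)


# Hypothetical page IDs, for illustration only.
indexed = {"page-1", "page-2", "page-3"}
still_in_confluence = {"page-1", "page-3"}

# "page-2" was deleted in the source, so it should be removed from the index.
stale = find_stale_document_ids(indexed, still_in_confluence)
print(sorted(stale))  # ['page-2']
```

Under this assumption, every ID in `stale` would have to be removed from each store that holds a copy of the document, not just from one of them.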

vchaindz avatar Mar 17 '24 08:03 vchaindz

Additional information: the document is deleted from the Vespa index, but the chat and document explorer responses still show the deleted content. I think this is a general issue, not related to any specific integration. I still see entries in Postgres. Maybe this needs an update to clean up Postgres as well? See https://github.com/danswer-ai/danswer/issues/938 and https://github.com/danswer-ai/danswer/pull/1086

vchaindz avatar Mar 18 '24 13:03 vchaindz

Additional information: even after the connector is deleted, the chat, search, and Slack bot can still answer based on documents indexed by the deleted connector. IMO it's a big issue: even when we improve or clean up our source documentation, Danswer doesn't reflect it. @Weves @yuhongsun96

mboret avatar Apr 19 '24 14:04 mboret

@mboret is there a way for us to reproduce this? That is definitely not intended. Once a connector is deleted, none of its documents should be searchable.

Weves avatar Apr 22 '24 05:04 Weves