kernel-memory

[Bug] Wrong status of document in poisoned queue

Open KurtP20 opened this issue 11 months ago • 1 comments

Context / Scenario

When ingesting an invalid URL, e.g. ImportWebPageAsync("http://malformed_url"), KM moves the document to the poisoned queue after several attempts:

Microsoft.KernelMemory.Pipeline.Queue.DevTools.SimpleQueues[0] Message '20250124.114916.8130921.4d6c0b1c4b4d41ff84a0cb26ac27abe8' processing failed with exception, max attempts reached, moving to poison queue.

However, the status reported by GetDocumentStatusAsync is unchanged from before the failure. My own log message still shows:

Document 416A1AABBD2B38AE93197949C710199DC83695E497F514EFA5097173535AE492 null?:False completed:False empty:False remaining steps:extract, partition, gen_embeddings, save_records ready:False

It would be nice to have an additional field, failed, in DataPipelineStatus, perhaps together with a message field explaining why the ingestion failed. Since one most likely wants to delete a failed document, it would also be nice to add an optional flag, deleteUponFailure, to ImportWebPageAsync (and the other Import* methods).
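Until something like this exists, a caller could approximate failure detection with a client-side deadline. The following is only a rough sketch of that workaround, not part of KernelMemory: StatusSnapshot, IngestionOutcome and IngestionWatcher are hypothetical names, and the getStatus delegate is assumed to wrap GetDocumentStatusAsync and map its result to the snapshot.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, minimal view of a document status (NOT the library's DataPipelineStatus).
public record StatusSnapshot(bool Completed, int RemainingSteps);

// Hypothetical outcome type for the polling helper below.
public enum IngestionOutcome { Completed, NotFound, TimedOut }

public static class IngestionWatcher
{
    // Polls getStatus until the document completes, disappears, or the timeout elapses.
    // A document stuck in the poisoned queue never completes, so it ends up as TimedOut.
    public static async Task<IngestionOutcome> WaitAsync(
        Func<CancellationToken, Task<StatusSnapshot?>> getStatus,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken ct = default)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            StatusSnapshot? status = await getStatus(ct);
            if (status is null) return IngestionOutcome.NotFound;
            if (status.Completed) return IngestionOutcome.Completed;
            if (elapsed.Elapsed >= timeout) return IngestionOutcome.TimedOut;
            await Task.Delay(pollInterval, ct);
        }
    }
}
```

On TimedOut the caller can treat the document as failed and delete it. The obvious drawback is that a timeout cannot distinguish a slow ingestion from a poisoned one, which is why a real failed field would be preferable.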

What happened?

The status reports that the URL is still being ingested, although the document is already in the poisoned queue.

Importance

a fix would make my life easier

Platform, Language, Versions

KernelMemory 0.95, kernelmemory/service (created 2025-01-20T15:41:17.539712455Z), C# / .NET 9

Relevant log output


KurtP20, Jan 24 '25 12:01