[Bug]: Inconsistent State After Cancelling Document Parsing via API
After calling async_cancel_parse_documents on a batch of documents, a few documents may end up with a run value of "CANCEL" while progress is greater than 0. This subsequently causes an error in the async_parse_documents interface.
https://github.com/infiniflow/ragflow/blob/ce816edb5ff0a2b5a2f3ca7912ba9dc881c7ca16/api/apps/sdk/doc.py#L733
Meanwhile, in the web UI, since the document's progress is updated first, it results in a situation where the API fails but the interface appears successful.
https://github.com/infiniflow/ragflow/blob/ce816edb5ff0a2b5a2f3ca7912ba9dc881c7ca16/api/apps/document_app.py#L356
The inconsistent state you’re seeing, where documents have a run value of CANCEL but a progress greater than 0 after calling async_cancel_parse_documents, stems from how state updates are handled between the API and the web UI.
Here’s what’s happening:
- The async_cancel_parse_documents function is designed to stop parsing for specified documents by sending a DELETE request to the backend. On success, it expects the document’s run status to be set to CANCEL and progress to be reset to 0.0. This is also what the test suite validates as the correct post-cancellation state (source).
- However, the document_app.py file (which powers the web UI) updates the document’s progress and run status independently. When a cancellation is triggered from the UI, it sets the run status to CANCEL but may not always reset the progress to 0, especially if the progress update happens before the backend cancellation is fully processed (source).
- The update_doc function in api/apps/sdk/doc.py enforces that the progress value cannot be arbitrarily changed. If the requested progress does not match the current document progress, it returns an error (source). This can cause the API to fail if the UI and backend get out of sync.
This mismatch leads to the situation where the UI appears to have successfully cancelled the operation (because it updates the run status), but the backend still sees a nonzero progress, causing subsequent API calls (like async_parse_documents) to fail due to the inconsistent state.
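To make the failure mode concrete, here is a minimal sketch of how such documents could be detected. It does not use RAGFlow's code: the `Doc` record, `find_inconsistent_cancelled`, and `can_start_parse` are hypothetical names, and the guard (rejecting a parse request while progress is strictly between 0 and 1) is an assumption about the kind of check that would reject these documents, not a copy of the real one.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document record; not RAGFlow's model."""
    id: str
    run: str         # e.g. "RUNNING", "CANCEL", "DONE"
    progress: float  # 0.0 .. 1.0


def find_inconsistent_cancelled(docs):
    """Return docs marked CANCEL that still carry nonzero progress."""
    return [d for d in docs if d.run == "CANCEL" and d.progress > 0.0]


def can_start_parse(doc):
    """Assumed guard: refuse to parse while progress looks mid-flight."""
    return not (0.0 < doc.progress < 1.0)


docs = [
    Doc("a", "CANCEL", 0.0),   # cleanly cancelled
    Doc("b", "CANCEL", 0.37),  # cancelled, but progress was never reset
    Doc("c", "DONE", 1.0),     # finished normally
]
stuck = find_inconsistent_cancelled(docs)
```

Under these assumptions, document "b" is the only one flagged, and it is also the only one the guard would refuse to re-parse.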
I did not find any existing tickets or discussions tracking this specific issue.
Summary of relevant code behavior:
- async_cancel_parse_documents expects both run = CANCEL and progress = 0.0 after cancellation (source).
- The web UI may update run and progress independently, leading to a misleading state (source).
- The backend enforces strict checks on progress updates, which can cause errors if the UI and backend are not synchronized (source).
This appears to be a race condition or state synchronization issue between the UI and backend. Updating run and progress together, atomically, on cancellation in both the UI and backend code paths should resolve the inconsistency.
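As an illustration of what "atomic" could mean here, the following is a minimal in-memory sketch, not RAGFlow's implementation. `DocStore`, `DocState`, and their methods are hypothetical, and a `threading.Lock` stands in for whatever a real fix would use (most likely a single database update that writes both fields at once).

```python
import threading
from dataclasses import dataclass


@dataclass
class DocState:
    run: str = "RUNNING"
    progress: float = 0.0


class DocStore:
    """Hypothetical store that keeps run and progress consistent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._docs = {}

    def add(self, doc_id):
        with self._lock:
            self._docs[doc_id] = DocState()

    def report_progress(self, doc_id, progress):
        """Worker-side update; ignored once the doc has been cancelled."""
        with self._lock:
            doc = self._docs[doc_id]
            if doc.run == "CANCEL":
                return False  # a late update must not overwrite the reset
            doc.progress = progress
            return True

    def cancel(self, doc_id):
        """Set run and progress under one lock so readers never see a mix."""
        with self._lock:
            doc = self._docs[doc_id]
            doc.run = "CANCEL"
            doc.progress = 0.0

    def snapshot(self, doc_id):
        with self._lock:
            doc = self._docs[doc_id]
            return doc.run, doc.progress


store = DocStore()
store.add("doc-1")
store.report_progress("doc-1", 0.4)
store.cancel("doc-1")
late = store.report_progress("doc-1", 0.6)  # arrives after the cancel
state = store.snapshot("doc-1")
```

The key design choice in this sketch is that the progress writer checks the run status under the same lock that the cancel path holds, so a progress update racing with a cancellation cannot leave the document at CANCEL with nonzero progress.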