[Bug]: can not use api to delete or update doc but can use api to list docs info

Open alien365 opened this issue 6 months ago • 1 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

6ca1aef

RAGFlow image version

6ca1aef(v0.19.0)

Other environment information


Actual behavior

Connection error details: ConnectionError
Error message: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host.', None, 10054, None))

Expected behavior

No response

Steps to reproduce

url = f"http://{address}/api/v1/datasets/{dataset_id}/documents"

response = requests.delete(
    url,
    headers=headers,
    data=json.dumps(payload),  # use the json parameter to serialize to JSON and set Content-Type automatically
    timeout=300  # set the timeout
)

Additional information

No response

alien365, Jun 18 '25 11:06

Hi @alien365! I'm Dosu and I’m helping the ragflow team.

The ConnectionResetError you’re seeing when deleting or updating documents is likely caused by an uncaught exception or a network/proxy misconfiguration, rather than an application-level bug. The RAGFlow API expects a DELETE request to /api/v1/datasets/{dataset_id}/documents with a JSON body containing the document IDs to delete, and a valid Authorization token. If the request payload or headers don’t match what the API expects, or if the endpoint is incorrect, the backend may hit an unexpected error and forcibly close the connection, resulting in the error you described. This is different from listing documents, which is a simpler operation and less likely to trigger such issues.

A few things to check:

  • Make sure you’re sending the JSON body as the json parameter in requests.delete, not as data=json.dumps(payload). For example:
    response = requests.delete(
        url,
        headers=headers,
        json=payload,  # not data=json.dumps(payload)
        timeout=300
    )
    
  • Confirm the endpoint and HTTP method match the API contract: DELETE to /api/v1/datasets/{dataset_id}/documents with a JSON body like {"ids": ["doc_id1", "doc_id2"]} and the correct Authorization header.
  • Check your network setup, including any proxies, firewalls, or Nginx configs, for anything that might close connections unexpectedly. Connection resets can also happen if the backend process crashes or is killed during the request, or if there’s a timeout or resource limit at the proxy layer (like Nginx’s worker_connections or keepalive_timeout in nginx.conf).
  • If you’re running in Docker, make sure the port mappings and service configs are aligned between your .env and service_conf.yaml.template files.
  • Try running the RAGFlow server locally (not behind a proxy) to see if the problem persists. If it works locally, the issue is likely with your network or proxy setup.

If you’ve checked all of the above and still see the error, it’s possible that an unhandled exception deep in the document deletion logic is causing the backend to crash or forcibly close the connection. Improving error logging in the backend could help pinpoint the root cause, but from the outside, double-checking the request format and network environment is the best next step.
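
The checks above can be folded into a small client-side helper. The following is only a minimal sketch: the function name delete_documents is hypothetical (it is not part of RAGFlow), and the "Bearer" Authorization format and the JSON response body are assumptions about the server. It refuses to send an empty id list, sends the body through the json parameter, and reports a dropped connection separately from an HTTP-level error.

```python
import requests


def delete_documents(address, dataset_id, doc_ids, api_key, timeout=30):
    """Hypothetical helper: delete specific documents from one dataset."""
    if not doc_ids:
        # Never send a delete request without explicit document ids.
        raise ValueError("doc_ids must contain at least one document id")

    url = f"http://{address}/api/v1/datasets/{dataset_id}/documents"
    headers = {"Authorization": f"Bearer {api_key}"}  # assumed header format

    try:
        response = requests.delete(
            url,
            headers=headers,
            json={"ids": list(doc_ids)},  # requests serializes this and sets Content-Type
            timeout=timeout,
        )
    except requests.exceptions.ConnectionError as exc:
        # The connection was refused or reset before any HTTP response arrived,
        # which points at the network path or a crashed backend rather than
        # at a rejected request.
        raise RuntimeError(f"no HTTP response received: {exc}") from exc

    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response.json()  # assumes the server answers with a JSON body
```

If the RuntimeError branch is hit from one machine but not from another, that suggests the local network environment rather than the request format.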

dosubot[bot], Jun 18 '25 12:06

I changed the code like this:

# set the request body
payload = {"ids": ["4c8c26f44caf11f09d990242ac140006"]}
try:
    # send the DELETE request
    response = requests.delete(
        url,
        headers=headers,
        data=payload,  # use the json parameter to serialize to JSON and set Content-Type automatically
        timeout=300  # set the timeout
    )

but the same error still happened:

Connection error details: ConnectionError
Error message: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host.', None, 10054, None))

alien365, Jun 19 '25 02:06

It should be json=payload, not data=payload or data=json.dumps(payload).

If you use data=payload, all the documents in the dataset will actually be deleted. That is dangerous.

See the source code: https://github.com/infiniflow/ragflow/blob/887651e5fa363422172d592ab7509ebf400e2484/api/apps/sdk/doc.py#L589C5-L589C9
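
To see what actually differs on the wire, the two variants can be compared locally without sending anything. This is a minimal sketch using requests' own Request.prepare(); the host, port, and dataset id in the URL are placeholders, not values from this issue. With data=payload the dict is form-encoded rather than sent as JSON, so a server that only reads a JSON body would presumably find no "ids" in it.

```python
import json

import requests

payload = {"ids": ["4c8c26f44caf11f09d990242ac140006"]}
# Placeholder URL: nothing is sent, the request is only built locally.
url = "http://localhost:9380/api/v1/datasets/example_dataset/documents"


def prepare(**body_kwargs):
    """Build the DELETE request exactly as requests would, without sending it."""
    return requests.Request("DELETE", url, **body_kwargs).prepare()


form = prepare(data=payload)      # dict passed as data: form-encoded body
as_json = prepare(json=payload)   # dict passed as json: JSON body

print(form.headers["Content-Type"], form.body)
print(as_json.headers["Content-Type"], as_json.body)
```

The first line printed shows a form-encoded body with the application/x-www-form-urlencoded content type, while the second shows a JSON document with application/json. How the server reacts to the first form is determined by the linked source, not by this sketch.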

xinzhuang, Jun 20 '25 09:06

@xinzhuang Yes, that was going to be my next question, thanks very much. By the way, the cause of this problem is my local environment, maybe a firewall or encryption software. On a Linux server the delete API works.

alien365, Jun 20 '25 11:06

The cause of this problem is the local environment, maybe a firewall or encryption software.

alien365, Jun 20 '25 11:06