
[Question]: How can I switch the default object storage, minio, to OSS object storage?

Monkeycrop opened this issue 1 month ago • 29 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

Describe your problem

How can I switch the default object storage, minio, to OSS object storage?

I simply deleted the minio configuration file from the YAML file and added OSS connection information. However, it threw an error during startup, saying that the minio connection configuration information was missing. @dosu

Monkeycrop avatar Nov 04 '25 02:11 Monkeycrop

To switch from MinIO to OSS object storage in RAGFlow, you need to do two things:

  1. In your YAML config, add an oss section with your OSS credentials (see below for an example) and remove or comment out the minio section.
  2. Set the environment variable STORAGE_IMPL=OSS before starting RAGFlow. If you skip this step, RAGFlow will still look for MinIO config and throw an error if it's missing.

Example OSS config block for your YAML file:

oss:
  access_key: 'your_access_key'
  secret_key: 'your_secret_key'
  endpoint_url: 'http://oss-cn-hangzhou.aliyuncs.com'
  region: 'cn-hangzhou'
  bucket: 'your_bucket_name'

The error you saw happened because the default is STORAGE_IMPL=MINIO, so the code still tries to load the MinIO config even if you removed it. Setting STORAGE_IMPL=OSS tells RAGFlow to use your OSS config instead.
See config details
See YAML example
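Assuming a shell-based deployment, the switch can be sketched as follows (only the STORAGE_IMPL variable name and the OSS value come from this thread; everything else is illustrative):

```shell
# Tell RAGFlow to load the oss: block instead of minio: (the default is MINIO).
export STORAGE_IMPL=OSS
echo "$STORAGE_IMPL"
```

In a Docker Compose deployment, the same variable would go in the service's environment section instead.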

To reply, just mention @dosu.



dosubot[bot] avatar Nov 04 '25 02:11 dosubot[bot]

The configuration was successful, but uploading files results in an error, and there are no error logs on the backend. @dosu

(screenshot)

Monkeycrop avatar Nov 04 '25 10:11 Monkeycrop

A recent fix addressed a bug in the OSS upload logic: the put() method in the OSS connector was updated to accept an optional tenant_id argument. If your deployment is missing this update, uploads can fail with a 500 error and no backend logs, since the error is handled internally and not always logged at the API level. Make sure your RAGFlow version includes this fix (PR #10712, merged 2025-10-22).
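To illustrate why that signature change matters, here is a hedged sketch; the class name and in-memory store are hypothetical stand-ins, and only the optional tenant_id argument comes from the PR description above:

```python
# Hypothetical sketch of the OSS connector's put() after the fix:
# tenant_id becomes an optional keyword, so older call sites keep working
# while newer code can route objects to a tenant-specific prefix.
class RAGFlowOSS:
    def __init__(self):
        self.store = {}  # stands in for the real OSS client

    def put(self, bucket, fnm, binary, tenant_id=None):
        # Real code would target a tenant-specific bucket/prefix when tenant_id is set.
        key = f"{tenant_id}/{fnm}" if tenant_id else fnm
        self.store[(bucket, key)] = binary
        return key
```

A deployment missing the optional argument would raise a TypeError when callers pass tenant_id, which can surface as a 500 with no useful log.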

Also, a recent PR improved backend logging for errors (PR #10836, merged 2025-10-28). If you still see no logs, double-check your logging configuration and ensure your deployment includes these updates.

If you're already on a recent version, double-check your OSS config for typos or missing fields, and try uploading a small, supported file type to rule out file size or format issues. If the problem persists, enabling debug logging may help surface more details.


dosubot[bot] avatar Nov 04 '25 10:11 dosubot[bot]

I've tried the methods described above, but it still doesn't work.

One thing I need to tell you is that I previously used local MinIO to upload some documents for this project. However, I now need to switch to OSS, and I'm wondering if this is related to my previous MinIO setup. @dosu

Monkeycrop avatar Nov 04 '25 13:11 Monkeycrop

Yes, your previous MinIO usage is likely related. RAGFlow does not automatically migrate or bridge files between storage backends: when you switch from MinIO to OSS, only new uploads go to OSS, while files uploaded under MinIO stay in MinIO and are not reachable through OSS unless you migrate them manually.

The file metadata in the database (fields like location and source_type) does not record which backend a file lives in, so references to files that exist only in MinIO will not resolve after the switch, causing errors or missing files on upload or access. RAGFlow supports neither mixed backends nor automatic migration; a seamless transition requires manually exporting/importing or copying objects from MinIO to OSS. For directory-structure differences, set the OSS prefix_path to dynamic when migrating data so the layout matches MinIO's (source).

If you are seeing 500 errors on upload after switching, double-check your OSS config, ensure your deployment includes recent fixes, and confirm that your new uploads are not referencing legacy MinIO data. If you need to access old files, you will need to migrate them from MinIO to OSS manually.


dosubot[bot] avatar Nov 04 '25 13:11 dosubot[bot]

So how should I handle this? Do I need to delete all file data related to minio in the MySQL database, or just delete the file data in Elasticsearch? Please tell me specifically how to do this.

My requirement is that switching to the OSS data source allows for normal file uploading, downloading, and parsing. The previous minio data can be deleted. @dosu

Monkeycrop avatar Nov 05 '25 03:11 Monkeycrop

To fully remove MinIO-related file data after switching to OSS, you need to clean up both MySQL and Elasticsearch. RAGFlow stores file metadata and references in MySQL tables (like file, document, and related tables with a tenant_id field) and in Elasticsearch indices (usually named with your tenant or knowledge base ID as a suffix). Deleting only from one system will leave stale references and can cause errors.

Here’s how to do it:

  1. MySQL: Delete all records in tables such as file, document, and any others with a tenant_id field that reference your old MinIO data. This removes metadata and prevents the system from looking for files that only exist in MinIO. You can do this with SQL delete statements filtered by tenant or knowledge base, or use the RAGFlow service layer methods if you want to avoid breaking relationships [source].

  2. Elasticsearch: Remove indices related to your old files or knowledge bases. These indices are usually named with your tenant or knowledge base ID as a suffix. Deleting these indices ensures that search and retrieval won’t reference missing MinIO files [source].

  3. MinIO: (Optional, but recommended) Delete the actual objects/buckets in MinIO if you no longer need them [source].
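The MySQL step can be sketched as below, using sqlite3 as a stand-in for MySQL. The document table and tenant_id column come from the thread above, but the exact schema and filter are assumptions; verify them against your database before deleting anything:

```python
# Illustrative cleanup sketch; sqlite3 stands in for MySQL, and the schema is assumed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT, tenant_id TEXT, location TEXT)")
conn.execute("INSERT INTO document VALUES ('d1', 'tenant-a', 'old-minio-object-key')")

# Remove all metadata rows that still point at objects stored only in MinIO.
conn.execute("DELETE FROM document WHERE tenant_id = ?", ("tenant-a",))

remaining = conn.execute("SELECT COUNT(*) FROM document").fetchone()[0]
print(remaining)  # → 0
```

Run the equivalent DELETE against each table that references the old data (file, document, and so on), ideally inside a transaction so a mistake can be rolled back.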

There’s no built-in one-click cleanup, so you’ll need to handle each part separately. After cleanup, new uploads, downloads, and parsing will use OSS only, and you won’t have legacy MinIO data causing issues.

If you want to automate this, look at the FileService and related service classes in the codebase—they provide methods for deleting files and documents safely, including cleaning up both metadata and storage [source].


dosubot[bot] avatar Nov 05 '25 03:11 dosubot[bot]

It still doesn't seem to work. I redeployed a service using a completely new MySQL, Elasticsearch, OSS, and Redis instance. I didn't use MinIO. However, I'm still getting the same error when uploading files. @dosu (screenshot)

Monkeycrop avatar Nov 05 '25 04:11 Monkeycrop

Since you’re still seeing a 500 error after a clean redeployment with new MySQL, Elasticsearch, OSS, and Redis (and no MinIO), the issue is almost certainly with configuration or environment variables, not legacy data.

Here’s what to check:

  1. Make sure the OSS section in your service_conf.yaml is fully uncommented and includes all required fields: access_key, secret_key, endpoint_url, region, and bucket. Even a missing or misspelled field will cause silent failures or 500 errors [example config].
  2. Confirm that the environment variable STORAGE_IMPL is set to OSS in your deployment environment. If it’s missing or set to MINIO, RAGFlow will still expect MinIO config and fail if it’s not present [config logic].
  3. Remove or comment out the minio block in your YAML to avoid conflicts.
  4. Double-check your OSS credentials and endpoint for typos.
  5. Try uploading a small, supported file type (like a simple PDF or TXT) to rule out file type/size issues.
  6. If you still get a 500 error and no backend logs, enable DEBUG logging in your deployment to surface hidden errors—sometimes exceptions are caught and not logged at INFO level.

If all config and environment variables are correct and you’re on a recent RAGFlow version, uploads to OSS should work. If not, the most common root causes are a missing or incomplete OSS config, or the STORAGE_IMPL variable not being set to OSS [source].


dosubot[bot] avatar Nov 05 '25 04:11 dosubot[bot]

I've checked all the above issues, and they're fine. However, I have one question. Here is my OSS configuration:

oss:
  access_key: '***'
  secret_key: '****'
  endpoint_url: 'http://idc-storage-prod-ceph-rgw.info'
  region: 'eu-west'
  bucket: 'athena-xxx'

My endpoint is clearly http://idc-storage-prod-ceph-rgw.info, so why does the upload error show this URL instead?

项目任务规划.md: Could not connect to the endpoint URL: "http://athena-xxx.idc-storage-prod-ceph-rgw.info/LiteLLM%E9%A1%B9%E7%9B%AE%E4%BB%BB%E5%8A%A1%E8%A7%84%E5%88%92.md"

I still can't figure out where the problem lies. @dosu

Monkeycrop avatar Nov 05 '25 07:11 Monkeycrop

Your upload error URL uses the bucket as a subdomain (e.g., http://athena-xxx.idc-storage-prod-ceph-rgw.info/...) instead of your configured endpoint_url because RAGFlow's OSS client is hardcoded to virtual-hosted-style addressing: the bucket name is always placed as a subdomain in the request URL, regardless of how you set endpoint_url in your config. This is controlled by the line Config(s3={"addressing_style": "virtual"}, signature_version='v4') in the code, and there is currently no configuration option to switch OSS to path-style addressing (where the bucket is part of the path) in RAGFlow [source].

If your OSS backend (like some Ceph RGW deployments) does not support virtual-hosted-style URLs, uploads will fail unless you patch the code to use path-style addressing or set up a DNS entry so that bucket.endpoint resolves correctly. As a workaround, you can try using the S3 backend with STORAGE_IMPL=AWS_S3 and specifying addressing_style: 'path' in the s3 config block, but this is only supported for S3, not OSS, in current RAGFlow versions [discussion].

If you need path-style support for OSS, you’ll need to modify the code in rag/utils/oss_conn.py to set addressing_style to 'path' instead of 'virtual', or request this as a feature. Otherwise, ensure your OSS endpoint and DNS setup can handle bucket-as-subdomain requests.
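To make the difference concrete, here is a small sketch of how the two addressing styles build request URLs; this is a simplification of what the S3 client does internally, and the helper function is hypothetical:

```python
# Sketch of the two S3 addressing styles discussed above; host/bucket are from the thread.
def object_url(endpoint_host, bucket, key, style="virtual"):
    if style == "virtual":
        # Bucket becomes a subdomain, so the backend/DNS must resolve bucket.host.
        return f"http://{bucket}.{endpoint_host}/{key}"
    # Path style keeps the configured host and puts the bucket in the path.
    return f"http://{endpoint_host}/{bucket}/{key}"

print(object_url("idc-storage-prod-ceph-rgw.info", "athena-xxx", "doc.md"))
# → http://athena-xxx.idc-storage-prod-ceph-rgw.info/doc.md
print(object_url("idc-storage-prod-ceph-rgw.info", "athena-xxx", "doc.md", style="path"))
# → http://idc-storage-prod-ceph-rgw.info/athena-xxx/doc.md
```

The first URL matches the failing one in your error, which is why a Ceph RGW backend without wildcard DNS cannot be reached in virtual mode.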


dosubot[bot] avatar Nov 05 '25 07:11 dosubot[bot]

It finally worked! That was the problem; I fixed it as follows: Config(s3={"addressing_style": "path"}) @dosu

Monkeycrop avatar Nov 05 '25 09:11 Monkeycrop

Great to hear it’s working now! The root cause was that RAGFlow’s OSS integration defaults to virtual-hosted-style URLs (bucket as subdomain), but your OSS backend (like Ceph RGW) needed path-style URLs. Setting Config(s3={"addressing_style": "path"}) in the OSS config forces path-style addressing, which solved the connection issue. This tip should help others running into similar OSS compatibility problems.


dosubot[bot] avatar Nov 05 '25 09:11 dosubot[bot]

However, using OSS seems to cause unstable document parsing, with parsing failures occurring intermittently. The following error message appears:

Failed. Started at: Wed, 05 Nov 2025 12:29:29 GMT, processDuration: 28807.30 s
Progress:
20:29:33 Task has been received.
20:29:35 Page(1~100000001): Start to parse.
20:29:35 Page(1~100000001): [ERROR]Internal server error while chunking: [Errno 2] No such file or directory: 服务动态.md
20:29:35 [ERROR][Exception]: [Errno 2] No such file or directory: '服务动态.md'

What could be the cause of this? How can it be resolved? @dosu

(screenshots)

Monkeycrop avatar Nov 05 '25 12:11 Monkeycrop

The intermittent parsing failures with OSS are caused by how RAGFlow handles file fetching and error propagation. When a file fetch from OSS fails—due to a transient network issue, incorrect path, or missing object—the OSS connector logs the error and returns None instead of raising an exception. The downstream parser expects either a local file or in-memory binary data; if it receives None, it sometimes tries to open the file by name as a local path, which leads to [Errno 2] No such file or directory errors referencing the OSS key as a local file path. This is especially likely in recursive parsing calls or when handling embedded files or hyperlinks, and is more common with unstable OSS connectivity or misconfigured addressing styles [source].

To resolve or mitigate this:

  1. Verify file presence in OSS: After uploading, check directly in your OSS backend that the file exists at the expected path/bucket.
  2. Test direct OSS access: Use a tool like aws s3api or your OSS provider's CLI to fetch the file using the same bucket/key as RAGFlow.
  3. Enable debug logging: Set your logging level to DEBUG to surface hidden errors in the backend.
  4. Check for transient network issues: If your OSS is slow or unreliable, increase retries or timeout settings in the OSS connector code.
  5. Patch error handling: Consider modifying the OSS connector so that if a fetch fails, it raises an exception and aborts parsing, rather than returning None and letting the parser fall back to local file access.
  6. Confirm addressing style and encoding: Make sure your OSS config uses the correct addressing style (path for Ceph RGW) and that file names are properly encoded.

If you want to make the code more robust, add a check after fetching from OSS: if the binary is None, abort parsing and log a clear error, rather than proceeding. This will prevent the fallback to local file access and make failures easier to diagnose.
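The suggested fail-fast check might look like the sketch below; StorageFetchError and fetch_or_raise are hypothetical names, not existing RAGFlow code:

```python
# Hedged sketch of the hardening described above: fail fast instead of returning None.
class StorageFetchError(RuntimeError):
    pass

def fetch_or_raise(get_fn, bucket, key):
    """Wrap a storage get() that may return None on failure."""
    binary = get_fn(bucket, key)
    if binary is None:
        # Without this, downstream parsers may treat the key as a local path
        # and fail with "[Errno 2] No such file or directory".
        raise StorageFetchError(f"failed to fetch {bucket}/{key} from object storage")
    return binary
```

With this wrapper, a transient OSS failure surfaces as a clear storage error in the task log instead of a misleading local-file error.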

Let me know if you need help with a specific code patch or want more details on where to add these checks.


dosubot[bot] avatar Nov 05 '25 12:11 dosubot[bot]

Do you mean it might be due to unstable OSS remote connection or network instability?

Then what are some ways to alleviate this? Could you list some specific solutions? @dosu

Monkeycrop avatar Nov 05 '25 12:11 Monkeycrop

You've reached the anonymous usage limit of 10 interactions per month. Your usage will reset on 2025-12-01. If you'd like to use Dosu before then, please register to increase your limit to 100 interactions per month for free.


dosubot[bot] avatar Nov 05 '25 12:11 dosubot[bot]

@dosu Do you mean it might be due to unstable OSS remote connection or network instability?

Then what are some ways to alleviate this? Could you list some specific solutions?

Monkeycrop avatar Nov 05 '25 12:11 Monkeycrop

Yes, intermittent parsing failures can definitely be caused by unstable OSS connections or network issues. Here are some concrete ways to reduce these problems in RAGFlow:

  • Double-check that files are actually present in OSS after upload, and use your OSS provider's CLI or tools to verify you can fetch them directly.
  • Enable DEBUG logging in RAGFlow to surface hidden backend errors and make it easier to spot transient failures.
  • If your OSS backend or network is unreliable, consider increasing the retry count and timeout in the OSS connector code. By default, RAGFlow only retries once with a 1-second delay, which is often not enough for unstable connections. You can patch the code in rag/utils/oss_conn.py to retry more times or wait longer between attempts (source).
  • Make sure your OSS config uses the correct addressing style (path for Ceph RGW) and that file names are properly encoded.
  • Consider patching the OSS connector so that if a fetch fails, it raises an exception and aborts parsing, rather than returning None and letting the parser try to open a non-existent local file. This will make errors clearer and easier to debug.
  • Monitor your network and OSS backend health—packet loss, high latency, or backend throttling can all cause these symptoms.
  • If you have control over your OSS deployment, try to colocate RAGFlow and OSS in the same region or network to reduce latency and packet loss.
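A retry helper along these lines could wrap the OSS get/put calls; the function name, attempt count, and backoff schedule are illustrative, not RAGFlow defaults:

```python
# Minimal retry-with-exponential-backoff sketch for flaky object-storage calls.
import time

def with_retries(fn, attempts=5, base_delay=1.0):
    last_exc = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as exc:  # real code should catch botocore errors specifically
            last_exc = exc
            time.sleep(base_delay * (2 ** i))  # back off: 1s, 2s, 4s, ...
    raise last_exc
```

Exponential backoff tends to ride out short network blips better than the fixed one-retry/one-second scheme mentioned above.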

If you need help with a code snippet to increase retries or improve error handling, let me know!


dosubot[bot] avatar Nov 05 '25 12:11 dosubot[bot]

I modified the upload retries and extended the retry interval in oss_conn.py according to your method. However, it doesn't seem to have much effect; there are still quite a few intermittent upload failures.

How should I analyze this? Is there a problem with my code (e.g., caused by switching OSS object storage), or is there a problem with the OSS service? @dosu

(screenshot)

Monkeycrop avatar Nov 06 '25 04:11 Monkeycrop

Your retry logic changes (5 attempts, 3s sleep) are reasonable, but if you’re still seeing frequent intermittent upload failures, it’s likely the root cause is outside your code—either OSS service instability or network issues between RAGFlow and OSS.

To pinpoint the problem, try these steps:

  1. Enable DEBUG logging in RAGFlow to capture detailed error messages during failed uploads.
  2. Use your OSS provider’s CLI or SDK (from the same host as RAGFlow) to upload/download files directly. If these also fail intermittently, the issue is with the OSS service or network.
  3. If CLI uploads are stable but RAGFlow uploads fail, look for concurrency/resource issues in your deployment or unhandled exceptions in the OSS connector.
  4. Check for patterns in failures (e.g., time of day, file size, network load) and monitor network health (latency, packet loss).

If the OSS service or network is unstable, no amount of retry logic will fully solve the problem. If direct OSS uploads are stable, then focus on RAGFlow’s integration—look for resource exhaustion, concurrency bugs, or subtle misconfigurations. Improving error logging and surfacing the exact exception type will help you narrow it down quickly.


dosubot[bot] avatar Nov 06 '25 04:11 dosubot[bot]

Is the ragflow document upload API (/api/v1/datasets/{kb_id}/documents) executed asynchronously? Does it return the result only after a document upload is successful, or does it return the upload result immediately upon submission of the upload task? @dosu

Monkeycrop avatar Nov 06 '25 07:11 Monkeycrop

The document upload API in RAGFlow processes the file upload and metadata/database insertion synchronously, but if you request document parsing (for example, by including a run parameter), the actual parsing/chunking is handled asynchronously as a background task. The API returns a result immediately after the file is uploaded and the document metadata is inserted—it does not wait for parsing or chunking to finish before responding. If parsing is requested, you’ll need to monitor task progress separately; the upload API won’t block or wait for the background processing to complete.
See implementation details
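A caller therefore submits the upload and tracks parsing progress separately. A hedged sketch of that polling loop follows; the helper name and status values are hypothetical stand-ins, with only the general submit-then-poll pattern taken from the answer above:

```python
# Hypothetical polling helper: the upload API returns immediately, so the client
# must poll document status until background parsing finishes.
import time

def wait_for_parsing(get_status, doc_id, timeout=300, interval=5):
    """Poll a status callback until parsing reaches a terminal state."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = get_status(doc_id)  # e.g. a GET against the documents endpoint
        if status in ("DONE", "FAIL"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"parsing of {doc_id} did not finish in {timeout}s")
```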


dosubot[bot] avatar Nov 06 '25 07:11 dosubot[bot]

Please help me analyze the following error message. I'm using OSS object storage and parsing 10 files simultaneously through the RAGFlow UI, but some files still fail to parse, and the backend shows the following error: @dosu

2025-11-06 08:08:26,394 ERROR 21 The view function for 'document.get_image' did not return a valid response. The function either returned None or ended without a return statement.
Traceback (most recent call last):
  File "/ragflow/api/apps/document_app.py", line 603, in get_image
    response = flask.make_response(STORAGE_IMPL.get(bkt, nm))
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/helpers.py", line 173, in make_response
    return current_app.make_response(args)
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 1174, in make_response
    raise TypeError(
TypeError: The view function for 'document.get_image' did not return a valid response. The function either returned None or ended without a return statement.
2025-11-06 08:08:26,396 INFO 21 127.0.0.1 - - [06/Nov/2025 08:08:26] "GET /v1/document/image/9d71578cb62311f098823eb8fe9dd145-d9ec9334632446ee HTTP/1.1" 200 -
2025-11-06 08:08:26,486 ERROR 21 fail get athena-local-huyarag/ee88c3c9e5b86217
Traceback (most recent call last):
  File "/ragflow/rag/utils/oss_conn.py", line 137, in get
    r = self.conn.get_object(Bucket=bucket, Key=fnm)
  File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: Unknown

(The same NoSuchKey traceback then repeats, roughly every 3 seconds, for keys ee88c3c9e5b86217, 37df23818b878be7, b3c863113bb08b52, and 451df198980f7e66, each run followed by the same 'document.get_image' TypeError and a 200 INFO line; the pasted log is truncated mid-traceback.)
self._make_api_call(operation_name, kwargs) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call raise error_class(parsed_response, operation_name) botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: Unknown 2025-11-06 08:09:18,297 ERROR 21 fail get athena-local-huyarag/451df198980f7e66 Traceback (most recent call last): File "/ragflow/rag/utils/oss_conn.py", line 137, in get r = self.conn.get_object(Bucket=bucket, Key=fnm) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call return self._make_api_call(operation_name, kwargs) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call raise error_class(parsed_response, operation_name) botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: Unknown 2025-11-06 08:09:21,332 ERROR 21 fail get athena-local-huyarag/451df198980f7e66 Traceback (most recent call last): File "/ragflow/rag/utils/oss_conn.py", line 137, in get r = self.conn.get_object(Bucket=bucket, Key=fnm) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call return self._make_api_call(operation_name, kwargs) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call raise error_class(parsed_response, operation_name) botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: Unknown 2025-11-06 08:09:24,369 ERROR 21 fail get athena-local-huyarag/451df198980f7e66 Traceback (most recent call last): File "/ragflow/rag/utils/oss_conn.py", line 137, in get r = self.conn.get_object(Bucket=bucket, Key=fnm) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call return self._make_api_call(operation_name, kwargs) File "/ragflow/.venv/lib/python3.10/site-packages/botocore/client.py", line 
1021, in _make_api_call raise error_class(parsed_response, operation_name) botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: Unknown 2025-11-06 08:09:27,384 ERROR 21 The view function for 'document.get_image' did not return a valid response. The function either returned None or ended without a return statement. Traceback (most recent call last): File "/ragflow/api/apps/document_app.py", line 603, in get_image response = flask.make_response(STORAGE_IMPL.get(bkt, nm)) File "/ragflow/.venv/lib/python3.10/site-packages/flask/helpers.py", line 173, in make_response return current_app.make_response(args) File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 1174, in make_response raise TypeError( TypeError: The view function for 'document.get_image' did not return a valid response. The function either returned None or ended without a return statement.

Monkeycrop avatar Nov 06 '25 08:11 Monkeycrop

The root cause is this: when a file is missing in OSS, get_object raises NoSuchKey; the OSS connector logs the error and returns None, but the Flask view (such as document.get_image) does not check for None and tries to build a response from it, which raises the TypeError you see. In practice this means the file was never uploaded, the upload failed, or the file is being requested before it is available in OSS. This is especially likely under concurrent parsing or network instability.

To fix this and get clearer errors:

  • Patch your Flask view functions to check whether STORAGE_IMPL.get returns None. If so, return a 404 or a user-friendly error instead of passing None to flask.make_response. Example:
    data = STORAGE_IMPL.get(bkt, nm)
    if data is None:
        flask.abort(404, description="File not found in OSS")
    response = flask.make_response(data)
    response.headers.set("Content-Type", "image/jpeg")
    return response
    
  • For reliability, verify that files exist in OSS immediately after upload, especially before starting parsing tasks. Use your OSS CLI or SDK to check.
  • If you still see missing files, investigate possible race conditions: ensure uploads fully complete before parsing starts, and check for OSS eventual consistency issues under load.
  • Improving the OSS connector to raise explicit exceptions on missing files (instead of returning None) will also help make errors more obvious and easier to debug.

This approach will prevent Flask from returning invalid responses and make it clear when files are missing or uploads are failing, rather than masking the issue with a generic server error.
See code details
OSS connector behavior
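
The "raise explicit exceptions instead of returning None" suggestion above can be sketched as a thin wrapper around the connector. Names here (MissingObjectError, get_or_raise, _DictStorage) are hypothetical illustrations, not RAGFlow's actual API:

```python
class MissingObjectError(FileNotFoundError):
    """Raised when a requested object does not exist in storage."""


def get_or_raise(storage, bucket, key):
    """Wrap a connector whose .get() returns None on NoSuchKey,
    turning the silent None into an explicit exception."""
    data = storage.get(bucket, key)
    if data is None:
        raise MissingObjectError(f"{bucket}/{key} not found in object storage")
    return data


class _DictStorage:
    """Stand-in for the real OSS connector, for illustration only."""

    def __init__(self, objects):
        self._objects = objects

    def get(self, bucket, key):
        # Returns None when the key is missing, mimicking oss_conn.get
        return self._objects.get((bucket, key))


store = _DictStorage({("bkt", "present"): b"image-bytes"})
ok = get_or_raise(store, "bkt", "present")
try:
    get_or_raise(store, "bkt", "missing")
    missing_raised = False
except MissingObjectError:
    missing_raised = True
```

In the Flask view, catching MissingObjectError and calling flask.abort(404) would then surface a clean 404 instead of the generic TypeError.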


dosubot[bot] avatar Nov 06 '25 08:11 dosubot[bot]

Does Ragflow's Elasticsearch support multiple nodes?

I've been troubleshooting the intermittent document parsing failures, and I suddenly realized that my Elasticsearch is a multi-node system. Could that be the cause? @dosu

Monkeycrop avatar Nov 12 '25 08:11 Monkeycrop

Yes, RAGFlow fully supports multi-node Elasticsearch clusters. The ES connection logic in RAGFlow is designed to accept a comma-separated list of hosts, and it uses the official Elasticsearch Python client, which handles cluster discovery, failover, and load balancing across nodes. This is a standard and recommended way to connect to ES clusters for scalability and reliability (source).

Intermittent document parsing failures are not caused by using a multi-node ES cluster itself. The most common causes are network/DNS issues, misconfigured ES hostnames, or cluster health problems. For example, if a node hostname is not resolvable or a node is down, you may see timeouts or connection errors during parsing (source). Make sure your ES cluster is healthy, all node hostnames are reachable from your RAGFlow deployment, and your hosts config is correct.

If you continue to see intermittent failures, check your logs for ES connection errors, and consider running cluster health checks or increasing the number of ES replicas for better fault tolerance. The cluster setup itself is not a limitation—RAGFlow is built to work with both single-node and multi-node Elasticsearch deployments (source).
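
As a small illustration of the comma-separated hosts convention mentioned above (the helper name parse_es_hosts is hypothetical; RAGFlow's actual parsing lives in its ES connection code):

```python
def parse_es_hosts(hosts_setting):
    """Split a comma-separated ES hosts string into the list form the
    official Python client accepts, e.g. Elasticsearch(hosts=parse_es_hosts(cfg))."""
    return [h.strip() for h in hosts_setting.split(",") if h.strip()]


hosts = parse_es_hosts("http://es-node1:9200, http://es-node2:9200,http://es-node3:9200")
# The client then discovers the cluster and distributes requests across these nodes;
# if one hostname is unresolvable or a node is down, you will see the kind of
# intermittent timeouts described above.
```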


dosubot[bot] avatar Nov 12 '25 08:11 dosubot[bot]

Did you solve it?

Magicbook1108 avatar Nov 27 '25 02:11 Magicbook1108

Not resolved

Monkeycrop avatar Nov 27 '25 03:11 Monkeycrop

I’m afraid I’m unable to help with this issue, as it is beyond my current capability.

If the issue persists, I suggest opening a new issue and labeling it as bug, so that more people have a chance to see and address it. For now, I will need to close this issue.

Magicbook1108 avatar Dec 17 '25 03:12 Magicbook1108