[BUG] Error: 400 Bad Request: Payload error: JSON payload (95879532 bytes) is larger than allowed (limit: 33554432 bytes).
Describe the bug
Cannot upsert large documents with more than 100k words. If the document is in docx format, I get the error "Error: 400 Bad Request: Payload error: JSON payload (95879532 bytes) is larger than allowed (limit: 33554432 bytes)."
If the same file is converted to a pdf or txt file, I get this error instead: "Error: Undefined error code fetch failed: undefined"
In all these cases the collection is created in the vector DB, but it remains empty.
To Reproduce
It is a locally hosted environment using mxbai_large as the embedding model and tinyllama as the LLM, with Qdrant as the vector DB. Smaller documents are embedded and upserted successfully, whether docx, pdf, or txt, using the same workflow without any error.
Expected behavior
Successful upsert of the document.
Screenshots
docx upsert error:
pdf upsert error:
txt upsert error:
mxbai_embedding logs:
Flow pdf extract vector Chatflow.json
Setup
- Installation: Docker 25.0.3
- Flowise Version: 1.6.0 (linux-x64, node-v18.19.1)
- OS: Windows
- Browser: Chrome
Additional context
Even if the upsert fails, the embedding model still runs through the whole document; the error only appears once the entire document has been embedded. Additionally, this warning appears: [WARNING]: Importing from "langchain/schema" is deprecated. Instead, please import from the appropriate entrypoint in "@langchain/core" or "langchain". This will be mandatory after the next "langchain" minor version bump to 0.2.
Updated Flowise to version 1.6.5 and the same problem persists.
However, after the update to 1.6.5, the pdf and txt files that initially showed the unknown error now give the same JSON payload limit error. By default the Flowise file limit is 50 MB (https://github.com/FlowiseAI/Flowise/issues/96), yet the error reports the limit as 33554432 bytes, i.e. 32 MiB (about 33.55 MB), which does not match that default.
PDF file upsert error:
txt file upsert error:
docx file upsert error:
Followed the instructions from this issue (https://github.com/FlowiseAI/Flowise/issues/96) and increased the Flowise file limit to 300 MB.
The error still persists. Please help.
Duplicate issue with #1824.
@mrabbah: Thanks for the comment. I thought I was the only one facing this issue. Interestingly, I do not get this error with a chunk size of 1000 for the same document; it works fine with that. My preferred chunk size is 250-300, which produces this error, presumably because smaller chunks mean many more vectors per document and therefore a much larger JSON payload in a single upsert request.
The problem has been sorted. The issue was with Qdrant, which has a default limit of 32 MB (33554432 bytes) for "Maximum size of POST data in a single request in megabytes". Increasing this value in Qdrant's config.yaml fixed the problem and upserting now works fine. By the way, the document I was trying to upsert produced more than 4500 vectors (with chunk size 250 and overlap 10). No change in Flowise was required to solve this issue.
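For scale: mxbai-embed-large produces 1024-dimensional vectors (an assumption based on the model card), so 4500 vectors serialized as JSON text comes to roughly 90 MB, which lines up with the 95879532-byte payload in the error. A minimal sketch of the override, using the key name from the Qdrant configuration guide and an assumed 128 MB value:

```yaml
# custom_config.yaml - overrides Qdrant defaults (sketch, not the exact file used here)
service:
  # "Maximum size of POST data in a single request in megabytes" (Qdrant default: 32)
  # 128 is an assumed value; size it to your largest upsert batch
  max_request_size_mb: 128
```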
@mrabbah: Please configure Qdrant with the required payload size for a single upload and the problem goes away.
@prithvi151080 I didn't find any flag or environment variable that I could set to change the max JSON payload size accepted by Qdrant!
This change has to be made in the Qdrant vector DB, not in Flowise. Create a custom_config.yaml file and, under the service section, set "Maximum size of POST data in a single request in megabytes" to the size required for your use case, and the problem goes away. Use this config file as the master config for your Qdrant instance; see the sketch below for one way to wire it up. Qdrant config details: https://qdrant.tech/documentation/guides/configuration/
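Since the setup here runs in Docker, here is a hedged sketch of how such an override could be supplied to the Qdrant container. The mount path and image tag follow the Qdrant configuration guide but are assumptions, not the exact setup from this thread; the same guide also describes environment-variable overrides of the form QDRANT__SERVICE__MAX_REQUEST_SIZE_MB.

```yaml
# docker-compose.yml fragment (sketch)
services:
  qdrant:
    image: qdrant/qdrant:latest            # assumed tag
    ports:
      - "6333:6333"
    volumes:
      # Mount the custom config so Qdrant loads it in place of its defaults
      - ./custom_config.yaml:/qdrant/config/production.yaml
    # Alternative (assumption): override just this one setting via env var
    # environment:
    #   - QDRANT__SERVICE__MAX_REQUEST_SIZE_MB=128
```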
@mrabbah: Hope this helps :)