I found a flaw: "Bot is cooking for too long."

Open yoobaring opened this issue 2 years ago • 21 comments

Hi @n4ze3m, I have encountered an issue: the bot is "cooking" for too long. It happens with documents that have many pages; with documents of only a few pages I haven't encountered it. I hope this can be resolved. Please note that I'm running my tests on Railway.

yoobaring avatar Nov 03 '23 12:11 yoobaring

This is the same issue I've encountered as well. I've opened a ticket to report this problem before. I believe it won't be long before it gets fixed. Let's wait for the next update...

chalitbkb avatar Nov 03 '23 12:11 chalitbkb

Which data source is causing the issue, the docx or the pdf? I know this issue occurs on Railway and works fine locally. I will look into a solution.

n4ze3m avatar Nov 03 '23 13:11 n4ze3m

Both

yoobaring avatar Nov 03 '23 13:11 yoobaring

I have updated the Railway template, which may fix the file processing issues.

n4ze3m avatar Nov 18 '23 05:11 n4ze3m

@n4ze3m I waited 3 hours and still got the same problem. Nothing has changed, so the problem has not been resolved. My file has approximately 500-1000 pages. How many pages did the document you tested have? Please try a document with 500-1000 pages or more and you will encounter this problem.

yoobaring avatar Nov 18 '23 17:11 yoobaring

Is this issue happening on Railway or locally?

For Railway, I think you need to reinstall the Railway template. The old one doesn't have a Docker mount, which may be causing the issue.

n4ze3m avatar Nov 18 '23 18:11 n4ze3m

Railway.

I tried it, and the latest version I get is 1.4.1.

yoobaring avatar Nov 18 '23 18:11 yoobaring

Hello, can you reinstall your Railway app? The latest update mounts an upload folder, which prevents uploaded files from being deleted.

Railway template: https://railway.app/template/TXdjD7

I have tested a 758-page PDF, approximately 17 MB, using Cohere embedding, and it's working without any issue.

PDF I tested: https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf

n4ze3m avatar Nov 19 '23 07:11 n4ze3m

@n4ze3m

Please carefully watch the video. Do not fast forward or skip, as there are explanations that you need to read.

https://streamable.com/afx0tc

I have tested on Railway again, and I am still encountering the same issues. Here are my observations:

  1. When testing with the "Cohere API," there are no issues when using files with the .pdf extension. However, problems arise when working with files in the .docx format.

  2. Testing with the "OpenAI API" reveals problems with files that have multiple pages, including the file you provided me for testing.

yoobaring avatar Nov 19 '23 20:11 yoobaring

When testing with the "Cohere API," there are no issues when using files with the .pdf extension. However, problems arise when working with files in the .docx format.

I will look into it. I think the issue is with the DOCX loader.

Testing with the "OpenAI API" reveals problems with files that have multiple pages, including the file you provided me for testing.

I will test with the OpenAI API. I think the issue may be caused by a rate limit, so I will look into it.

Currently, you cannot delete a data source while it is processing. I will update the error label.

n4ze3m avatar Nov 20 '23 05:11 n4ze3m
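
The rate-limit hypothesis above is commonly handled by retrying the embedding call with exponential backoff. The following is only a minimal TypeScript sketch under assumptions: `withRetry`, `backoffDelayMs`, and `isRateLimitError` are hypothetical names, not functions from dialoqbase, and the error shape (an object with `status === 429`) is assumed rather than taken from any specific client library.

```typescript
// Hypothetical helpers for illustration; not taken from the dialoqbase codebase.

/** Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

/** Assumed shape of a rate-limit error: an object carrying HTTP status 429. */
function isRateLimitError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 429
  );
}

/** Runs `task`, retrying only on rate-limit errors, up to `maxRetries` extra attempts. */
async function withRetry<T>(task: () => Promise<T>, maxRetries: number = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up on non-rate-limit errors or once the retry budget is spent.
      if (!isRateLimitError(err) || attempt >= maxRetries) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Wrapping each embedding batch as `withRetry(() => embedBatch(batch))`, where `embedBatch` stands for whatever call produces the embeddings, would make the job wait and retry on HTTP 429 responses rather than fail on the first one.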

Hello,

I have released a new update which addresses the issue with the docx loader. This update has been tested on a 700+ page docx document on Railway using the text-embedding-ada-002 model.

The processing time for the file is approximately 2-3 minutes.

Test docx link: https://docs.google.com/document/d/18-ETRBO4yRpRl3nF68P8vTbunlBgdy_t/edit?usp=sharing&ouid=108531690400573042017&rtpof=true&sd=true

https://github.com/n4ze3m/dialoqbase/assets/39720973/503f1c6a-08b2-4943-8e27-d723d4e870ed

n4ze3m avatar Nov 23 '23 14:11 n4ze3m

@n4ze3m No more words from now on. I've been waiting for 1-2 hours, and the problem remains the same. I feel so frustrated, haha :)

yoobaring avatar Nov 24 '23 11:11 yoobaring

:| same docs ??

n4ze3m avatar Nov 24 '23 12:11 n4ze3m

Yes. I have tried the OpenAI API, Cohere API, Jina API, and Llama API, but the problem persists.

yoobaring avatar Nov 24 '23 14:11 yoobaring

I'm sorry, I don't fully understand what's happening. If you are using Railway, I highly recommend deleting the existing application and creating a new one from the latest template. I have tested it on a new Railway application.

n4ze3m avatar Nov 24 '23 15:11 n4ze3m

Is it necessary to delete the database on Supabase? I've tried reinstalling the app excluding Supabase and reinstalling it from scratch, but it still doesn't work. Do I need to delete the database to start fresh?

yoobaring avatar Nov 24 '23 15:11 yoobaring

No. Just make sure your database has enough space; embeddings take up a lot of it.

I just tested the application on Railway, and it works perfectly for me. Here is the uncut version:

https://github.com/n4ze3m/dialoqbase/assets/39720973/02fbb67a-13f2-4c76-9526-94ef9f6f945e

n4ze3m avatar Nov 24 '23 15:11 n4ze3m
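
To put a rough number on the database-space point above, the raw vector size can be estimated as chunks × dimensions × bytes per float. This TypeScript sketch assumes 4-byte floats and a guessed 3 chunks per page; the real chunk count depends on the splitter settings, and row and index overhead in the database is not included. The 1536 dimensions are the output size of text-embedding-ada-002, the model mentioned earlier in the thread. `embeddingBytes` is a hypothetical helper name.

```typescript
/** Raw vector storage in bytes: one float per dimension per chunk (no row or index overhead). */
function embeddingBytes(chunks: number, dimensions: number, bytesPerFloat: number = 4): number {
  return chunks * dimensions * bytesPerFloat;
}

// Assumption: about 3 chunks per page for a 758-page PDF.
const pdfChunks = 758 * 3;
// 1536 is the vector size produced by text-embedding-ada-002.
const rawBytes = embeddingBytes(pdfChunks, 1536);

console.log(`${pdfChunks} chunks -> ${(rawBytes / (1024 * 1024)).toFixed(1)} MiB of raw vectors`);
```

The estimate is a lower bound per document: the stored chunk text, metadata, and any vector index add to it, and re-uploading the same file while troubleshooting multiplies it.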

brave_Bf0jqbXYDB.mp4

yoobaring avatar Nov 24 '23 15:11 yoobaring

@n4ze3m Alright, I'm going to try applying with all new accounts this time, and we'll see how it goes.

yoobaring avatar Nov 24 '23 15:11 yoobaring

@n4ze3m I have retested, and unfortunately, I still encounter the same issue. I am frustrated with the persistent problem that has not been fully resolved. I hope it can be addressed soon. I am unsure of the root cause of this issue and feel genuinely discouraged.

yoobaring avatar Nov 24 '23 16:11 yoobaring

@yoobaring: I've run into the same issue multiple times while testing on Railway and similar services. Everything worked fine in my local environment, but large files caused this problem on cloud services. In the end it was simply a matter of scaling: make sure your runtime environment has at least 4 GB of RAM and 4 dedicated CPUs. To fix the issue temporarily, go into the database table that contains the latest file references and remove the one on which your bot is hanging.

oleg-schmidt avatar Nov 29 '23 20:11 oleg-schmidt
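
The sizing advice above can be turned into a quick check. This is a minimal TypeScript sketch, not part of dialoqbase: `Resources`, `MINIMUM`, and `resourceShortfalls` are hypothetical names, and the thresholds are simply the 4 GB of RAM and 4 CPUs suggested in the comment.

```typescript
interface Resources {
  memoryBytes: number;
  cpuCount: number;
}

const GIB = 1024 * 1024 * 1024;

// Minimums taken from the comment above: 4 GB of RAM and 4 CPUs.
const MINIMUM: Resources = { memoryBytes: 4 * GIB, cpuCount: 4 };

/** Returns human-readable shortfalls; an empty list means the minimum is met. */
function resourceShortfalls(actual: Resources, min: Resources = MINIMUM): string[] {
  const problems: string[] = [];
  if (actual.memoryBytes < min.memoryBytes) {
    problems.push(
      `RAM: ${(actual.memoryBytes / GIB).toFixed(1)} GiB available, ` +
        `${(min.memoryBytes / GIB).toFixed(1)} GiB suggested`
    );
  }
  if (actual.cpuCount < min.cpuCount) {
    problems.push(`CPUs: ${actual.cpuCount} available, ${min.cpuCount} suggested`);
  }
  return problems;
}

// Example: a small cloud instance with 512 MiB of RAM and 1 CPU.
console.log(resourceShortfalls({ memoryBytes: 512 * 1024 * 1024, cpuCount: 1 }));
```

In Node.js the inputs could come from `os.totalmem()` and `os.cpus().length`, with the caveat that inside a container these may report the host's values rather than the limits of the plan you are paying for, so the provider's dashboard is the more reliable source.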