
Racking My Brain - Failing To Ingest

Open rancor13 opened this issue 1 year ago • 26 comments

I have read the documentation over and over. I have created new API keys on OpenAI and Pinecone, and I have double- and triple-checked that the namespace, index name, and environment are 100% correct. image


** Visual Studio 2022 Developer PowerShell v17.5.3
** Copyright (c) 2022 Microsoft Corporation


PS C:\Users\Randall Cornett\Source\Repos\mayooear\gpt4-pdf-chatbot-langchain>
url: 'https://api.openai.com/v1/embeddings'

},
request: Request {
  [Symbol(realm)]: [Object],
  [Symbol(state)]: [Object],
  [Symbol(signal)]: [AbortSignal],
  [Symbol(headers)]: [HeadersList]
},
data: { error: [Object] }

}, isAxiosError: true, toJSON: [Function: toJSON] }
c:\Users\Randall Cornett\source\repos\mayooear\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:49
    throw new Error('Failed to ingest your data');
          ^

[Error: Failed to ingest your data]

Anything to help me fix this would be greatly appreciated. I spent about 6 hours on it last night and made almost no progress.

rancor13 avatar Mar 28 '23 18:03 rancor13

are you using gpt-4?

p-toni avatar Mar 29 '23 01:03 p-toni

I have used them both (I have GPT-4), still to no avail. It splits the documents and so on; everything looks fine up until the very end, when creating the vector store, then bam: failed to ingest the data.

I have tried 3 different OpenAI API keys and multiple (paid-for) Pinecone environments, and have watched the video 100 times. I updated every plugin and package that Visual Studio prompted me to. It's getting hung up on the last part. Frustrating.

rancor13 avatar Mar 29 '23 03:03 rancor13

image

rancor13 avatar Mar 29 '23 03:03 rancor13

Ran into the same issue. Fixed it by initializing an index in the Pinecone dashboard and making sure the namespace in config/pinecone.ts matches the one in the dashboard (you can find it under Index Info).

const PINECONE_NAME_SPACE = '<same namespace as shown in the dashboard>';

Edit: Initialize the index and confirm it's 'Ready' before running the ingest script.
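
The "confirm it's 'Ready'" step could be automated with a small polling helper. This is only a sketch: `waitUntilReady` and the `isReady` callback are hypothetical names, and the helper makes no Pinecone calls itself, since how you query the index status depends on the Pinecone client version you have installed.

```typescript
// Hypothetical helper: polls a caller-supplied readiness check until it
// returns true or the attempts run out.
type ReadyCheck = () => Promise<boolean>;

async function waitUntilReady(
  isReady: ReadyCheck,
  attempts = 10,
  delayMs = 3000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await isReady()) return true;
    // Wait before the next poll (skipped after the final attempt).
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

You would pass in whatever status check your client offers and only start the ingest script once the helper resolves to `true`.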

axr6077 avatar Mar 29 '23 03:03 axr6077

Really appreciate the response, but I have no namespace in Pinecone. image

rancor13 avatar Mar 29 '23 05:03 rancor13

ChatGPT says I need to ingest data into it first, but how can I ingest if it always fails? Thank you for your help.

rancor13 avatar Mar 29 '23 05:03 rancor13

The latest error you sent has status code 429, which looks like Pinecone complaining about receiving too many requests. Could you try with a single doc to rule this out as the issue?

pablosr11 avatar Mar 29 '23 05:03 pablosr11

Yep, I'm using the default PDF provided with the clone, just for testing's sake. So only 1 PDF.

rancor13 avatar Mar 29 '23 05:03 rancor13

@axr6077 I managed to get an index set up through the query and edit portion of Pinecone's website. Here is where I'm at now.

image

with that on pinecone as well... image

Still failing to ingest.

rancor13 avatar Mar 29 '23 05:03 rancor13

Line 9 references amcbot2 instead of pinecone_index_name. Is that it?

pablosr11 avatar Mar 29 '23 06:03 pablosr11

Nope, I fixed that while I was rereading my post :( image

rancor13 avatar Mar 29 '23 06:03 rancor13

I faced the same problem yesterday. Try one thing: delete your Pinecone project and create another one (not just the index, the whole project). When creating the new one, choose a different environment and create a new index.

p-toni avatar Mar 29 '23 13:03 p-toni

I hit exactly the same error and am still stuck with the 429 "too many requests" error. I tried with only one PDF doc but still got the error. How did you manage to overcome it, @rancor13?

ttu-nguyen avatar Mar 29 '23 13:03 ttu-nguyen

Hey @rancor13, I just fixed this same error. It was caused by having more than one OPENAI_API_KEY in play: apparently the value in the project's .env file is overridden by a system-level environment variable of the same name. So check your system environment variables and either delete that one or change it to the correct API key. To check, you can use the following commands.

Windows (Command Prompt):

echo %OPENAI_API_KEY%

Windows (PowerShell):

echo $env:OPENAI_API_KEY

Linux:

echo $OPENAI_API_KEY

Mac:

printenv | grep OPENAI_API_KEY
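
To see which key the ingest script would actually pick up, the same check can be sketched in TypeScript. This assumes the project loads .env with dotenv-style semantics, where a variable that already exists in the system environment is kept and the .env value is ignored. `parseEnvFile` and `effectiveValue` are hypothetical helper names, and the parser only handles simple KEY=value lines.

```typescript
// Minimal parser for simple KEY=value lines (no multiline values or escapes).
function parseEnvFile(contents: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    result[key] = value;
  }
  return result;
}

// Reports which value wins when the system environment takes precedence
// (assumed behaviour, see the note above).
function effectiveValue(
  name: string,
  fileEnv: Record<string, string>,
  systemEnv: Record<string, string | undefined>,
): { value: string | undefined; source: "system" | ".env" | "unset"; conflict: boolean } {
  const fromSystem = systemEnv[name];
  const fromFile = fileEnv[name];
  if (fromSystem !== undefined) {
    return {
      value: fromSystem,
      source: "system",
      conflict: fromFile !== undefined && fromFile !== fromSystem,
    };
  }
  if (fromFile !== undefined) {
    return { value: fromFile, source: ".env", conflict: false };
  }
  return { value: undefined, source: "unset", conflict: false };
}
```

Calling `effectiveValue("OPENAI_API_KEY", parseEnvFile(text), process.env)` and getting `conflict: true` would mean a system-level key is shadowing the one in .env.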

hmzakhalid avatar Mar 29 '23 13:03 hmzakhalid

I am also getting the error:

creating vector store...
error [Error: PineconeClient: Error calling upsert: Error: PineconeClient: Error calling upsertRaw: FetchError: The request failed and the interceptors did not return an alternative response]
/Users/lionardo/Documents/Sites/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:44
    throw new Error('Failed to ingest your data');
          ^


[Error: Failed to ingest your data]

I have the Namespace set and I am loading only 1 pdf.

Update: I got it to work, actually. The index was not initialized yet. The namespace can be set if you make a curl request and set the namespace value.

Lionardo avatar Mar 29 '23 14:03 Lionardo

@hmzakhalid I was really excited this would work, but nope. When I run the Windows command in the terminal, it just spits back the correct API key I'm supposed to be using, so that's confusing. Dang, I thought this would work.

I have now created a whole new project with a new index and a new, different environment, as suggested. Still the exact same error: failed to ingest, too many requests :/

rancor13 avatar Mar 29 '23 18:03 rancor13

@ttu-nguyen I haven't. I've tried every suggestion listed in the thread.

rancor13 avatar Mar 29 '23 18:03 rancor13

@rancor13 Then have you checked the error code? Is the "too many requests" coming from OpenAI or from Pinecone? The solution I provided was for an OpenAI 401 error, which means the API key is incorrect. If you can provide the complete output of the error you get, it would help me understand, and I can maybe help you from there.
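
One way to get that complete output: the log above prints the body as `data: { error: [Object] }`, which hides the actual message. Below is a minimal sketch that reads the request URL and status off an Axios-style error and expands the nested body. `classifySource` and `summarizeError` are hypothetical names, the `.pinecone.io` host suffix is an assumption about Pinecone's endpoints, and a non-Axios error (such as the FetchError from the Pinecone client) will simply come back as "unknown".

```typescript
// Structural subset of an Axios-style error; only the fields read below.
interface HttpErrorLike {
  config?: { url?: string };
  response?: { status?: number; data?: unknown };
}

type Source = "openai" | "pinecone" | "unknown";

// Decides which service a request was sent to, based on its hostname.
function classifySource(url: string | undefined): Source {
  if (!url) return "unknown";
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return "unknown"; // not a parseable absolute URL
  }
  if (host === "api.openai.com") return "openai";
  if (host.endsWith(".pinecone.io")) return "pinecone"; // assumed host suffix
  return "unknown";
}

// Builds a one-line summary: which service, which status, and the full body.
function summarizeError(err: HttpErrorLike): string {
  const source = classifySource(err.config?.url);
  const status = err.response?.status ?? "no status";
  // JSON.stringify expands the nested object that console.log shows as [Object].
  const body = JSON.stringify(err.response?.data ?? null);
  return `${source} ${status}: ${body}`;
}
```

Logging `summarizeError(error)` in the catch block of the ingest script, just before the `throw`, would show whether the 429 or 401 belongs to OpenAI and what the service actually said.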

hmzakhalid avatar Mar 29 '23 22:03 hmzakhalid

I'm sure the cause of my error is less common than most, as I'm new to this, but I did not have langchain installed. One thing to try: pip install langchain

focusai avatar Mar 29 '23 23:03 focusai

Same error here. Is it from OpenAI or Pinecone?

I have checked PINECONE_NAME_SPACE and PINECONE_INDEX_NAME in Pinecone. image

linwentao-hexin avatar Mar 31 '23 03:03 linwentao-hexin

Hey @rancor13, are you using a proxy like Clash?

LvisWang avatar Mar 31 '23 08:03 LvisWang

@LvisWang I hit the same error, and I also use Clash as a proxy, but it still does not work correctly.

15392778677 avatar Mar 31 '23 09:03 15392778677

@rancor13, I encountered the same error and discovered that it was caused by my OpenAI account. Using the API requires payment for each request. In order to resolve the issue, I added a credit card to my account and now it works.

anaszil avatar Apr 02 '23 00:04 anaszil

@rancor13 I came here just to say the exact same thing as @anaszil. I kept getting a 429 response no matter what, and after messing around with a separate Python library, I realized that the 429 was coming from OpenAI, and adding a payment method to my account immediately resolved the issue - I'm able to ingest data now and get responses from the chat.
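
For anyone wanting to tell a billing-related 429 apart from an ordinary rate limit, here is a rough sketch. It assumes the OpenAI error body has the shape `{ error: { message, type, code } }` and that quota or billing problems mention "quota" in one of those fields (for example `insufficient_quota`); both are assumptions, so check them against the body you actually receive. `diagnose429` is a hypothetical name.

```typescript
// Assumed shape of the error body; verify against your own response.
interface OpenAIErrorBody {
  error?: { message?: string; type?: string; code?: string | null };
}

type Diagnosis = "quota_or_billing" | "rate_limit";

// Heuristic: any mention of "quota" in type, code, or message is treated as
// a quota/billing problem; everything else as a plain rate limit.
function diagnose429(body: OpenAIErrorBody): Diagnosis {
  const err = body.error;
  const haystack = [err?.type, err?.code, err?.message]
    .filter((part): part is string => typeof part === "string")
    .join(" ")
    .toLowerCase();
  return haystack.includes("quota") ? "quota_or_billing" : "rate_limit";
}
```

A "quota_or_billing" result points at the account (payment method, credits) rather than at how many documents you are ingesting, which would match what was reported above.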

discolando avatar Apr 02 '23 06:04 discolando

If you experience a 429 error, just create a new user for ChatGPT (another Google account, for example; a free account, no need to pay). Generate a new API key and replace your old one. That solved my problem instantly.

creating vector store...
ingestion complete

obaeyaert avatar Apr 02 '23 12:04 obaeyaert

Wow, thanks to all of you for going through this. Following this instantly solved my errors.

TheBossDragon avatar Apr 02 '23 17:04 TheBossDragon

Hi, @rancor13! I'm Dosu, and I'm here to help the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you are experiencing difficulties with data ingestion despite following the documentation and creating new API keys. There have been several suggestions from other users to resolve the issue, such as initializing an index in Pinecone Dashboard, ensuring the namespace in pinecone.ts matches the one in the dashboard, and checking for multiple OPENAI API keys in the environment. Some users have also suggested creating a new project or user for ChatGPT and adding a payment method to the OpenAI account. However, it seems that the issue is still ongoing and has not been resolved.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. Let us know if there's anything else we can assist you with!

dosubot[bot] avatar Sep 24 '23 16:09 dosubot[bot]