gpt4-pdf-chatbot-langchain
Racking My Brain - Failing To Ingest
I have read the documentation over and over. I have created new API keys on OpenAI and Pinecone, and have double- and triple-checked that the namespace, index name, and environment are 100% correct.
** Visual Studio 2022 Developer PowerShell v17.5.3 ** Copyright (c) 2022 Microsoft Corporation
PS C:\Users\Randall Cornett\Source\Repos\mayooear\gpt4-pdf-chatbot-langchain> url: 'https://api.openai.com/v1/embeddings'
}, request: Request { [Symbol(realm)]: [Object], [Symbol(state)]: [Object], [Symbol(signal)]: [AbortSignal], [Symbol(headers)]: [HeadersList] }, data: { error: [Object] }
}, isAxiosError: true, toJSON: [Function: toJSON] } c:\Users\Randall Cornett\source\repos\mayooear\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:49 throw new Error('Failed to ingest your data'); ^
[Error: Failed to ingest your data]
Anything to help me fix this would be greatly appreciated. Spent like 6 hours last night and made like no progress.
are you using gpt-4?
I have used both (I have GPT-4 access), to no avail. It splits the documents and so on. Everything looks fine up until the very end, when creating the vector store, and then bam: Failed to ingest the data.
I have tried 3 different OpenAI API keys and multiple (paid) Pinecone environments, and have watched the video 100 times. I have applied every plugin and code update that Visual Studio suggested. It's getting hung up on the last part. Frustrating.
I ran into the same issue. I fixed it by initializing an index in the Pinecone dashboard and making sure the namespace in config/pinecone.ts matches the one in the dashboard (you can find it under Index Info).
const PINECONE_NAME_SPACE = <make sure it is same as dashboard>
Edit: Initialize the index and confirm it's 'Ready' before running the ingest script.
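For anyone who wants to rule out a plain configuration slip before rerunning the ingest script, here is a minimal sketch of a sanity check. The helper name `findPineconeConfigProblems` and the `PineconeSettings` shape are hypothetical and not part of the repository; it only flags values that are missing, empty, or padded with whitespace, and it cannot tell whether the index is actually 'Ready' in the dashboard.

```typescript
// Hypothetical helper (not part of the repository): checks that the values the
// ingest script depends on are present and not accidentally padded.
type PineconeSettings = {
  indexName: string | undefined; // e.g. the PINECONE_INDEX_NAME value
  namespace: string | undefined; // e.g. PINECONE_NAME_SPACE from config/pinecone.ts
  environment: string | undefined; // the Pinecone environment shown in the dashboard
};

function findPineconeConfigProblems(settings: PineconeSettings): string[] {
  const entries: Array<[string, string | undefined]> = [
    ['indexName', settings.indexName],
    ['namespace', settings.namespace],
    ['environment', settings.environment],
  ];
  const problems: string[] = [];
  for (const [name, value] of entries) {
    if (value === undefined || value.trim() === '') {
      // Nothing was set at all, or only whitespace.
      problems.push(`${name} is missing or empty`);
    } else if (value !== value.trim()) {
      // A stray space copied from the dashboard is easy to miss by eye.
      problems.push(`${name} has leading or trailing whitespace`);
    }
  }
  return problems;
}
```

Calling it with the three values you copied from the dashboard returns an empty array when nothing obvious is wrong.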
Really appreciate the response, but I have no namespace in Pinecone.
ChatGPT says I need to ingest data into it first, but how can I ingest if it always fails? Thank you for your help.
The latest error you sent has status code 429, i.e. Pinecone complaining about receiving too many requests. Could you try with a single doc to rule this out as the issue?
Yep, I'm using the default PDF provided with the clone, just for testing's sake. So only 1 PDF.
I managed to get an index created through the query and edit section of Pinecone's website. Here is where I'm at now.
with that on pinecone as well...
Still failing to ingest.
Line 9 references amcbot2 instead of pinecone_index_name. Is that it?
Nope, I fixed that when I was rereading my post :(
I faced the same problem yesterday. Try one thing: delete your Pinecone project and create another one (not the index, the whole project). When creating the new one, choose another environment and create a new index.
I hit exactly the same error and am still stuck with the 429 "too many requests" error. I tried with only one PDF but still got the error. How did you manage to overcome it, @rancor13?
Hey @rancor13, I just fixed this same error. It was caused by multiple OpenAI API keys in the environment.
Apparently the variable in the project's .env file gets overridden by the system environment variable of the same name.
So check your system environment variables and either delete OPENAI_API_KEY there or change it to the correct API key.
To check, you can use the following commands.
Windows (Command Prompt):
echo %OPENAI_API_KEY%
Linux:
echo $OPENAI_API_KEY
Mac:
printenv | grep OPENAI_API_KEY
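The same check can be done from code. Below is a rough sketch, assuming the project loads its keys from a `.env` file with simple `KEY=value` lines; `parseEnvFile` and `findShadowedKeys` are hypothetical names, not functions from the repository or from any library. It lists keys whose value in the file differs from what the shell already exports, which is the situation where a loader such as dotenv (which by default leaves already-set variables untouched) would end up using the system value.

```typescript
// Hypothetical helpers: compare the text of a .env file against the
// environment the process already has, without depending on any package.
function parseEnvFile(text: string): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.charAt(0) === '#') continue; // skip blanks and comments
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    // Strip one pair of matching surrounding quotes, if present.
    if (value.length >= 2 && (first === '"' || first === "'") && value.charAt(value.length - 1) === first) {
      value = value.slice(1, -1);
    }
    pairs.push([key, value]);
  }
  return pairs;
}

function findShadowedKeys(
  envFileText: string,
  shellEnv: Record<string, string | undefined>,
): string[] {
  const shadowed: string[] = [];
  for (const [key, fileValue] of parseEnvFile(envFileText)) {
    const shellValue = shellEnv[key];
    // Only report keys that the shell defines with a different value.
    if (shellValue !== undefined && shellValue !== fileValue) {
      shadowed.push(key);
    }
  }
  return shadowed;
}
```

In a Node script you would pass the contents of `.env` as the first argument and `process.env` as the second. If `OPENAI_API_KEY` shows up in the result, the key the ingest script sees is probably not the one in your file.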
I am also getting the error:
creating vector store...
error [Error: PineconeClient: Error calling upsert: Error: PineconeClient: Error calling upsertRaw: FetchError: The request failed and the interceptors did not return an alternative response]
/Users/lionardo/Documents/Sites/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:44
throw new Error('Failed to ingest your data');
^
[Error: Failed to ingest your data]
I have the Namespace set and I am loading only 1 pdf.
Update: I got it to work. The index was not initialized yet. The namespace can be set by making a curl request that includes the namespace value.
I was really excited that the environment variable fix would work, but nope. When I run the Windows command in the terminal, it just prints the correct API key I'm supposed to be using, so that's confusing.
I have now created a whole new project with a new index and a new, different environment, as suggested. Still the same exact error: failed to ingest, too many requests :/
I haven't managed to overcome the 429 "too many requests" error. I've tried every suggestion listed in the thread.
Then have you checked the error code for the "too many requests" error? Is it from OpenAI or Pinecone? The solution I provided was for the OpenAI 401 error, which means the API key is incorrect. If you can provide the complete output of the error you get, it would help me understand, and maybe I can help you from there.
The cause of my error is surely less common than most, as I'm new to this, but I did not have langchain installed. One thing to try: pip install langchain
Same error. Is it from OpenAI or Pinecone?
I have checked PINECONE_NAME_SPACE and PINECONE_INDEX_NAME.
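One way to answer the "OpenAI or Pinecone?" question is to pull the request URL, status code, and full response body out of the error object instead of relying on the default console output, which collapses nested fields into `[Object]`. This is only a sketch: `summarizeHttpError` is a hypothetical helper, it assumes an axios-style error with `config.url`, `response.status`, and `response.data`, and the host checks (`api.openai.com`, `pinecone.io`) are my assumption about how the two services can be told apart.

```typescript
// Only the fields this sketch reads from an axios-style error; all optional.
type HttpErrorLike = {
  isAxiosError?: boolean;
  config?: { url?: string };
  response?: { status?: number; data?: unknown };
};

type ErrorSummary = {
  service: 'openai' | 'pinecone' | 'unknown';
  status: number | undefined;
  url: string | undefined;
  body: string; // full response body as JSON text, or '' if there was none
};

// Hypothetical helper: reports which service rejected the request and why.
function summarizeHttpError(err: HttpErrorLike): ErrorSummary {
  const url = err.config?.url;
  let service: ErrorSummary['service'] = 'unknown';
  if (url !== undefined) {
    // Assumed host names; adjust if your requests go through a proxy.
    if (url.indexOf('api.openai.com') !== -1) service = 'openai';
    else if (url.indexOf('pinecone.io') !== -1) service = 'pinecone';
  }
  const data = err.response?.data;
  // JSON.stringify expands the nested objects that console.log abbreviates.
  const body = data === undefined ? '' : JSON.stringify(data, null, 2);
  return { service, status: err.response?.status, url, body };
}
```

Logging the result from the `catch` block in scripts/ingest-data.ts, just before the `throw new Error('Failed to ingest your data')` line, would show the complete error. In the log at the top of this issue the URL is `https://api.openai.com/v1/embeddings`, which this helper would report as `openai`.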
hey @rancor13 , are you using a proxy like clash?
I hit the same error and I also use a proxy (Clash), but it still does not work correctly.
@rancor13, I encountered the same error and discovered that it was caused by my OpenAI account. Using the API requires payment for each request. In order to resolve the issue, I added a credit card to my account and now it works.
@rancor13 I came here just to say the exact same thing as @anaszil. I kept getting a 429 response no matter what, and after messing around with a separate Python library, I realized that the 429 was coming from OpenAI, and adding a payment method to my account immediately resolved the issue - I'm able to ingest data now and get responses from the chat.
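To tell a billing problem apart from genuine rate limiting, it can help to look at the body of the 429 response rather than the status code alone. The sketch below is an assumption-heavy illustration: `classify429` is a hypothetical helper, and it assumes the response body has an `error` object whose `type` or `code` string contains something like `insufficient_quota` or `rate_limit`. Check the body you actually receive before relying on those strings.

```typescript
// Assumed shape of the error body returned with a 429; every field is optional.
type ApiErrorBody = {
  error?: { message?: string; type?: string; code?: string | null };
};

type Cause429 = 'quota' | 'rate_limit' | 'unknown';

// Hypothetical helper: guesses why a 429 was returned from the body's tags.
function classify429(body: ApiErrorBody): Cause429 {
  const tag = `${body.error?.type ?? ''} ${body.error?.code ?? ''}`;
  // Assumed marker for "no credit or payment method on the account".
  if (tag.indexOf('insufficient_quota') !== -1) return 'quota';
  // Assumed marker for "too many requests in a short time".
  if (tag.indexOf('rate_limit') !== -1) return 'rate_limit';
  return 'unknown';
}
```

A `quota` result would point to the account itself (for example, no payment method) rather than to the number of PDFs being ingested, which matches what the two comments above describe.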
If you experience a 429 error, just create a new user for ChatGPT (with another Google account, for example; a free account, no need to pay). Generate a new API key and replace your old one. Solved my problem instantly.
creating vector store... ingestion complete
Wow, thanks to all of you for going through this. Following this instantly solved my errors.
Hi, @rancor13! I'm Dosu, and I'm here to help the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding, you are experiencing difficulties with data ingestion despite following the documentation and creating new API keys. There have been several suggestions from other users to resolve the issue, such as initializing an index in Pinecone Dashboard, ensuring the namespace in pinecone.ts matches the one in the dashboard, and checking for multiple OPENAI API keys in the environment. Some users have also suggested creating a new project or user for ChatGPT and adding a payment method to the OpenAI account. However, it seems that the issue is still ongoing and has not been resolved.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.
Thank you for your understanding and cooperation. Let us know if there's anything else we can assist you with!