chatgpt-retrieval
chatgpt-retrieval copied to clipboard
Use .env files and import pypdf
Added pypdf as requirement to read pdf Created a requirements.txt file Use .env to avoid commit OpenAI keys
- Is pypdf necessary, or does unstructured include that in its dependencies?
- The dependencies in requirements.txt (wrapt, ipython, pythondotenv) don't seem to match the pip install command...?
- typo on "You can also use" to "You can use"
- pypdf is necessary. Without manually using was not reading pdf correctly.
- Sorry, I updated the command. iphython is not necessary. wrapt is necessary to enforce the 1.15. The 1.14 doesn't support the latest python version. pythondotenv is for the .env support
- Typo fixed
I like the idea adding the missing requirements.txt to the project, many of us misses it.
My comments here are similar to https://github.com/techleadhd/chatgpt-retrieval/pull/31.
This might be a matter of preference, but to keep things simple, I'm favoring keeping the api key in the code, since setting up env vars is one more thing to learn for new folks. Not using the env vars also makes the code more compatible with web apps, which may not be running under the user's environment. Let me know if there is a strong reason otherwise.
Should we create another PR for the missing dependencies?
i was able to run this without pythondotenve or wrapt... are you sure those are needed...?
No, for me pypdf
was necessary, but I'd need to re-test it.
No, for me
pypdf
was necessary, but I'd need to re-test it.
✗ python chatgpt.py "what is my cat's name"
Using embedded DuckDB without persistence: data will be transient
I'm sorry, but I don't know what your cat's name is.
This is the result after a fresh cloning from GitHub and using venv with the installation command in the README:
pip install langchain openai chromadb tiktoken unstructured
But my assumption was wrong, installing the pypdf
does not solve the problem. It's related to the DirectoryLoader, which "uses the UnstructuredLoader" under the hood, but directly using this works like a charm. So the underlying implementation has changed without updating the documentation.