Treat empty strings as invalid tokens
Verba's docker_compose.yml file maps session environment variables to the container using a pattern like COHERE_API_KEY=$COHERE_API_KEY.
When os.environ.get("COHERE_API_KEY", None) was called inside the container and the environment variable was not defined, an empty string (instead of None) was returned. Since empty strings weren't handled correctly, the Docker version of Verba failed to run without valid keys.
In CohereEmbedder and CohereGenerator the get_models() function returned None, causing a TypeError: 'NoneType' object is not subscriptable (see #290).
For OpenAIEmbedder and OpenAIGenerator, no error was raised initially, but retrieval of current models was attempted without a valid token and triggered a Failed to fetch OpenAI embedding models: 401 Client Error: Unauthorized for url: https://api.openai.com/v1/models .
This PR introduces a solution for both issues: a new utility function get_token(), which can replace os.environ.get() directly and handles empty strings. Later, it could be extended to a stricter, e.g. Regex-based validation, for valid tokens.
PS: I aimed to follow the contributing guidelines, but the dev branch appears to contain version 1.0, so this pull request is directed to main. Running Black would reformat several files I haven't touched, so I skipped it. Additionally, I couldn't find tests for the Python parts. Apologies for any inconvenience.
To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.
beep boop - the Weaviate bot 👋🤖
PS:
Are you already a member of the Weaviate Slack channel?
Yes, I hereby agree with the Contributor License Agreement of Weaviate.
Great catch! Thanks a lot for the PR