[Bug]: Trying to auto install packages during runtime is not security friendly

Open kmehant opened this issue 2 years ago • 4 comments

Current Behavior

GPTCache checks whether the Python modules it needs exist in the host environment; if they do not, it tries to install them automatically at runtime.

Expected Behavior

GPTCache should use an alternative approach that does not install anything at runtime and is therefore more security friendly, or at least provide an option to turn this behavior off for downstream packages such as guidance and many others.

In production, the environment is typically hardened, for example by keeping the filesystem read-only. Because GPTCache tries to install packages at runtime, it can break such systems, which don't allow these operations.
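
One shape the requested toggle could take is sketched below. This is only an illustration: the environment variable name `GPTCACHE_DISABLE_AUTO_INSTALL` and the helper `ensure_package` are hypothetical and are not part of GPTCache's actual API.

```python
import importlib.util
import os
import subprocess
import sys
from typing import Optional

# Hypothetical variable name, chosen for this sketch only.
DISABLE_ENV_VAR = "GPTCACHE_DISABLE_AUTO_INSTALL"


def auto_install_allowed() -> bool:
    """Return False when the operator has opted out of runtime installs."""
    return os.environ.get(DISABLE_ENV_VAR, "").lower() not in ("1", "true", "yes")


def ensure_package(module_name: str, pip_name: Optional[str] = None) -> None:
    """Make sure `module_name` is importable, installing it only if allowed."""
    if importlib.util.find_spec(module_name) is not None:
        return  # already available, nothing to do
    if not auto_install_allowed():
        # Hardened environments get a clear error instead of a pip call.
        raise ImportError(
            f"'{module_name}' is not installed and runtime installation "
            f"is disabled via {DISABLE_ENV_VAR}."
        )
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
```

With the variable set to `1`, a missing module surfaces as an `ImportError` and no pip process is ever started, which is the behavior a read-only filesystem needs.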

Steps To Reproduce

1. Use any downstream package that depends on GPTCache, such as the [guidance tool](https://github.com/guidance-ai/guidance)
2. Observe in the logs that it tries to install missing packages:

start to install package: redis_om
successfully installed package: redis_om
redis_om installed successfully!


Environment

_No response_

Anything else?

_No response_

kmehant avatar Aug 30 '23 19:08 kmehant

Thanks for the great, useful project; looking forward to a resolution for this.

kmehant avatar Aug 30 '23 19:08 kmehant

I'm also running into problems where gptcache tries to install dependencies at runtime. I'd very much like to avoid this in production. It delays the startup of the application and risks the installation (and thus the application as a whole) failing. We're not using Redis, but it still tries to install the redis package upon importing guidance (which uses gptcache). The installation of redis also fails on some of the development machines.

This is also quite confusing for users who are trying guidance using the python interpreter and running into this issue:

>>> import guidance
start to install package: redis

Note that the installation already happens when importing gptcache.utils, so this isn't just a guidance issue:

$ python
Python 3.11.4 (main, Jun  6 2023, 22:16:46) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gptcache.utils
start to install package: redis
successfully installed package: redis
start to install package: redis_om
successfully installed package: redis_om
>>>

There are a number of issues related to failure of installing the dependencies at runtime:

  • https://github.com/zilliztech/GPTCache/issues/442
  • https://github.com/zilliztech/GPTCache/issues/521

Preferably the optional dependencies would be declared as optional. Poetry has good support for this: https://python-poetry.org/docs/pyproject/#extras

I have no experience doing the same with requirements.txt, but it seems there is a standard for doing so:

https://peps.python.org/pep-0508/#extras

If I interpret that correctly it should be possible to specify:

redis[redis]
redis_om[redis]

So that people should be able to install gptcache with those optional dependencies using pip install gptcache[redis].
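
For what it's worth, extras are declared by the package that offers them rather than in a list of requirements: a line like `redis[redis]` would ask pip for an extra named `redis` of the `redis` package itself. A minimal sketch of what the declaration might look like with setuptools follows. The extra name `redis` and the unpinned requirements are assumptions for illustration, not GPTCache's actual packaging.

```python
# Sketch of the relevant part of a setup.py (names are illustrative).
EXTRAS_REQUIRE = {
    # `pip install gptcache[redis]` would pull in these two packages.
    "redis": ["redis", "redis_om"],
}

SETUP_KWARGS = {
    "name": "gptcache",
    # setuptools reads optional dependency groups from `extras_require`.
    "extras_require": EXTRAS_REQUIRE,
}

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

A plain `pip install gptcache` would then leave the Redis packages out entirely.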

Would that be a good alternative?

bobvanderlinden avatar Sep 14 '23 08:09 bobvanderlinden

I will check it out; it's a bad case.

$ python
Python 3.11.4 (main, Jun  6 2023, 22:16:46) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gptcache.utils
start to install package: redis
successfully installed package: redis
start to install package: redis_om
successfully installed package: redis_om
>>>

SimFG avatar Sep 14 '23 13:09 SimFG

Thirding this issue; it is a nasty surprise. We saw this behavior during a run of unit tests, which is absolutely the wrong place for a pip install under any circumstances. The project should rely on setup.py to advertise its dependencies and let pip install, or its alternatives, do the job. Runtime behavior should be just to bubble up the ImportErrors rather than trying to fix the problem.
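
The fail-fast behavior described above could look roughly like this. The helper name `require_module` and the `gptcache[...]` extra in the hint are made up for this sketch; they are not existing GPTCache functions or extras.

```python
import importlib
from types import ModuleType


def require_module(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # No pip call here: the original error is re-raised with guidance.
        # The `gptcache[extra]` syntax assumes a hypothetical extra exists.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it ahead of time, e.g. `pip install gptcache[{extra}]`."
        ) from exc
```

Called only from the code path that actually needs the backend, a helper like this would also keep `import gptcache.utils` free of side effects for users who never touch Redis.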

aawilson avatar Dec 19 '23 18:12 aawilson