Docker Database connection Issue

Open Raju-git20 opened this issue 1 year ago • 5 comments

When I try to deploy the latest version of LiteLLM with a database, I get the following error, even though the database is hosted on Supabase and linked through the database URL. Could someone please help me with this issue?

Sample Database Url used: postgresql://postgres:[email protected]:5432/postgres

Error:

/usr/local/lib/python3.13/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 722, in mkdir
    os.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/.cache/prisma-python/binaries/5.4.2/ac9d7041ed77bcc8a8dbd2ab6616b39013829574'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 722, in mkdir
    os.mkdir(self, mode)
    ~~~~~~~~^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/.cache/prisma-python/binaries/5.4.2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 722, in mkdir
    os.mkdir(self, mode)
    ~~~~~~~~^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/.cache/prisma-python/binaries'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 722, in mkdir
    os.mkdir(self, mode)
    ~~~~~~~~^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/.cache/prisma-python'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/prisma", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/usr/local/lib/python3.13/site-packages/prisma/cli/cli.py", line 39, in main
    sys.exit(prisma.run(args[1:]))
             ~~~~~~~~~~^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/prisma/cli/prisma.py", line 36, in run
    entrypoint = ensure_cached().entrypoint
                 ~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/prisma/cli/prisma.py", line 78, in ensure_cached
    cache_dir.mkdir(parents=True)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 726, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 726, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 726, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "/usr/local/lib/python3.13/pathlib/_local.py", line 722, in mkdir
    os.mkdir(self, mode)
    ~~~~~~~~^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/.cache'
INFO:     Started server process [1]
INFO:     Waiting for application startup.
ERROR:    Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/starlette/routing.py", line 732, in lifespan
    async with self.lifespan_context(app) as maybe_state:
               ~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/usr/local/lib/python3.13/site-packages/starlette/routing.py", line 608, in __aenter__
    await self._router.startup()
  File "/usr/local/lib/python3.13/site-packages/starlette/routing.py", line 709, in startup
    await handler()
  File "/usr/local/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 3014, in startup_event
    prisma_client = await ProxyStartupEvent._setup_prisma_client(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 2997, in _setup_prisma_client
    await prisma_client.health_check()
  File "/usr/local/lib/python3.13/site-packages/litellm/proxy/utils.py", line 2176, in health_check
    raise e
  File "/usr/local/lib/python3.13/site-packages/litellm/proxy/utils.py", line 2158, in health_check
    response = await self.db.query_raw(sql_query)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/prisma/client.py", line 424, in query_raw
    resp = await self._execute(
           ^^^^^^^^^^^^^^^^^^^^
    ...<6 lines>...
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/prisma/client.py", line 528, in _execute
    return await self._engine.query(builder.build(), tx_id=self._tx_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/prisma/engine/query.py", line 244, in query
    return await self.request(
           ^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/prisma/engine/http.py", line 97, in request
    raise errors.NotConnectedError('Not connected to the query engine')
prisma.engine.errors.NotConnectedError: Not connected to the query engine

08:45:15 - LiteLLM Proxy:ERROR: utils.py:2188 - Error getting LiteLLM_SpendLogs row count: Not connected to the query engine
ERROR:    Application startup failed. Exiting.

Raju-git20 avatar Dec 28 '24 08:12 Raju-git20

I had the same problem. I had to turn the require_secure_transport setting off in Azure PostgreSQL, but that resulted in another error: it seems that the container can't connect to the database.

jruokola avatar Jan 23 '25 09:01 jruokola

I am also getting the same issue. I have deployed it in k8s with Helm. @ishaan-jaff, can you please help us here?

```
    raise errors.NotConnectedError('Not connected to the query engine')
prisma.engine.errors.NotConnectedError: Not connected to the query engine

17:54:47 - LiteLLM Proxy:ERROR: utils.py:2205 - Error getting LiteLLM_SpendLogs row count: Not connected to the query engine
ERROR:    Application startup failed. Exiting.
```


I have disabled require_secure_transport in the Azure PostgreSQL DB, but it still doesn't work.

I have also tried adding ?sslmode=require to the database URL.

chetankapoor avatar Jan 28 '25 18:01 chetankapoor

If you're running Docker, test with the new litellm-database image.

jruokola avatar Feb 06 '25 15:02 jruokola

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar May 08 '25 00:05 github-actions[bot]

This problem still seems unsolved: deploying the LiteLLM Docker image against a local Postgres on my MacBook also failed, with Error getting LiteLLM_SpendLogs row count: All connection attempts failed. The command used is:

docker run -e LITELLM_MASTER_KEY="sk-xxxx" -e LITELLM_SALT_KEY="sk-xxxx" -e DATABASE_URL="postgresql://yyy:[email protected]:5432/litellm" ghcr.io/berriai/litellm-database:main-v1.68.0-stable

The same problem also occurs with litellm:main-v1.68.0 and main-v1.69.0.
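
Editor's note: the host in the URL above is redacted, so the following is only one possible cause, not a diagnosis. When a database runs on the host machine and the URL points at localhost or 127.0.0.1, that address refers to the container itself once the process runs inside Docker, and the connection attempts fail. A minimal Python sketch, using a hypothetical helper name, that flags such URLs before they are passed to the container:

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_loopback(database_url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP."""
    host = urlsplit(database_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is some other DNS name.
        return False


if __name__ == "__main__":
    # Placeholder credentials and hosts, for illustration only.
    for url in (
        "postgresql://user:secret@127.0.0.1:5432/litellm",
        "postgresql://user:secret@db.example.com:5432/litellm",
    ):
        print(url, host_is_loopback(url))
```

If the check returns True for a database that lives on the host, the URL needs a host name that is reachable from inside the container instead.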

soyosaki avatar May 13 '25 03:05 soyosaki

hit this issue as well today

elabbarw avatar May 14 '25 15:05 elabbarw

Upgrading to the latest stable version solved the issue for me...

elabbarw avatar May 14 '25 16:05 elabbarw

I am still seeing this.

main-v1.74.7-stable works. But anything newer hits the DB error:

LiteLLM Proxy:ERROR: utils.py:2188 - Error getting LiteLLM_SpendLogs row count: Not connected to the query engine

icsy7867 avatar Aug 04 '25 15:08 icsy7867

Same here. I'm using Dockerfile.non_root, and it seems it changed to an Alpine-based image after 1.74.7.

ZyairYH avatar Aug 06 '25 06:08 ZyairYH

Same error after upgrading to the latest version. Setup: AWS RDS PostgreSQL, non_root version.

DanielQ-CV avatar Aug 06 '25 10:08 DanielQ-CV

Any update on this issue? We're still getting the error

javiergarciapleo avatar Aug 06 '25 13:08 javiergarciapleo

I actually got this to run:

https://github.com/berriai/litellm/pkgs/container/litellm-non_root/479377004?tag=main-v1.75.0-nightly

Though I had to set an env var pointing to a temp directory on my FIPS-enabled host.

icsy7867 avatar Aug 06 '25 13:08 icsy7867

@icsy7867 can you give me more details about what ENV flag you enable?

javiergarciapleo avatar Aug 06 '25 13:08 javiergarciapleo

Sure! When I deployed 1.75.0-nightly, I received a different error. I don't recall the exact message, but it was something about the Prisma database migrating or updating, together with a permission denied error. I found a GitHub issue with some relevant content.

In my Kubernetes deployment I mounted a directory at /temp and then added this env var:

LITELLM_MIGRATION_DIR:/temp

And everything seemed to fire off correctly.
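
Editor's note: the comment does not say what kind of volume was mounted, so the sketch below assumes an emptyDir volume, and the volume and container names are hypothetical. It builds the relevant fragment of a pod spec in Python and prints it as JSON, which could be merged into an existing Deployment manifest:

```python
import json


def migration_dir_fragment(
    mount_path: str = "/temp",
    volume_name: str = "litellm-migrations",  # hypothetical name
) -> dict:
    """Pod-spec fragment: a writable volume plus the env var that points at it."""
    return {
        # Assumption: an emptyDir volume; the comment only says "a directory".
        "volumes": [{"name": volume_name, "emptyDir": {}}],
        "containers": [
            {
                "name": "litellm",  # hypothetical container name
                "volumeMounts": [
                    {"name": volume_name, "mountPath": mount_path}
                ],
                "env": [
                    {"name": "LITELLM_MIGRATION_DIR", "value": mount_path}
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(migration_dir_fragment(), indent=2))
```

This is a fragment rather than a complete manifest: the image, ports, and remaining settings come from the existing deployment.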

icsy7867 avatar Aug 06 '25 13:08 icsy7867

I tested a bunch of litellm images; only ghcr.io/berriai/litellm-non_root:main-v1.75.0-nightly works on OpenShift.

Elvincth avatar Oct 11 '25 01:10 Elvincth

I ran into this issue as well. We're running in a GCP GKE autopilot cluster using our own helm chart.

Originally, we thought the DATABASE_URL was not getting set properly, but we ruled that out. We also saw that the initial migrations would apply successfully with a message like:

24 migrations found in prisma/migrations


No pending migrations to apply.

2025-06-18 17:58:26,928 - litellm_proxy_extras - INFO - prisma migrate deploy completed

We applied this fix:

Added Prisma cache environment variables pointing to /tmp/prisma-cache:
    - PRISMA_HOME
    - PRISMA_BINARY_CACHE_DIR
    - PRISMA_CLI_CACHE_DIR
    - XDG_CACHE_HOME
Updated NPM cache to /tmp/npm-cache

I think the root issue and final fix was noted in this issue.

Once we switched to the non_root image, the app was able to successfully connect to the DB (Postgres running in our cluster).
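
Editor's note: the fix described above can be sketched as a small Python helper that builds the environment and renders it as docker run flags. The four Prisma variable names are copied from the comment and have not been checked against Prisma's documentation; the comment does not name the npm variable, so NPM_CONFIG_CACHE is an assumption, and both function names are hypothetical:

```python
# Variable names as listed in the comment above.
PRISMA_CACHE_VARS = (
    "PRISMA_HOME",
    "PRISMA_BINARY_CACHE_DIR",
    "PRISMA_CLI_CACHE_DIR",
    "XDG_CACHE_HOME",
)


def cache_env(
    prisma_cache: str = "/tmp/prisma-cache",
    npm_cache: str = "/tmp/npm-cache",
) -> dict[str, str]:
    """Environment that moves the Prisma and npm caches under /tmp."""
    env = {name: prisma_cache for name in PRISMA_CACHE_VARS}
    # Assumption: npm reads its cache location from NPM_CONFIG_CACHE.
    env["NPM_CONFIG_CACHE"] = npm_cache
    return env


def docker_env_flags(env: dict[str, str]) -> list[str]:
    """Render a mapping as a flat list of `-e NAME=value` arguments."""
    flags: list[str] = []
    for name, value in env.items():
        flags += ["-e", f"{name}={value}"]
    return flags


if __name__ == "__main__":
    print(" ".join(docker_env_flags(cache_env())))
```

The same mapping can be written as env entries in a Helm values file or Kubernetes manifest; the point is only that every cache path lands in a directory the non-root user can write to.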

namabile avatar Oct 15 '25 20:10 namabile

Can you access the public model hub page after deploy?

Elvincth avatar Oct 16 '25 00:10 Elvincth

I can't access /ui/model_hub_table (linked on the homepage). But I can access ui/?page=model-hub-table (linked from the /ui/ sidebar). I assume it's because I haven't made the /ui/model_hub_table public.

The web UI is behind an IAP.

Is that what you mean? I'm @nick_90066 in the LiteLLM discord if you want to chat.

namabile avatar Oct 16 '25 01:10 namabile

I ask because we ran into the issue described in https://github.com/BerriAI/litellm/issues/15494#issuecomment-3408720176 after deploying to OpenShift, even after making the models in ui/?page=model-hub-table public. I'm not sure whether it is related to the DB migration or to the non-root image itself.

Elvincth avatar Oct 16 '25 01:10 Elvincth

@Elvincth -- I tried making my model hub page public and I'm not able to access it. I get a 500 error and the logs show errors decrypting DB secrets. I think those errors are misleading, though. I've seen them before, for an unrelated error, and found the root cause to be something else.

namabile avatar Oct 16 '25 15:10 namabile

Did you manage to get the model hub running eventually?

Elvincth avatar Oct 17 '25 01:10 Elvincth

No. It's not part of my requirements. I'll take a look to see if I can find the bug and open an issue if there's not already one open.

namabile avatar Oct 18 '25 01:10 namabile

Thanks a lot for your help.

Elvincth avatar Oct 18 '25 01:10 Elvincth