IO Error After Upgrade
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.7
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I deployed Dify in K8s and mounted `api/storage/privatekeys` via a PVC. I then upgraded the Dify Docker image to 0.6.7.
✔️ Expected Behavior
Dify should work, with all existing workflows, knowledge bases, and model providers remaining as-is.
❌ Actual Behavior
Knowledge bases and model providers couldn't be loaded; the error log shows `libs.rsa.PrivkeyNotFoundError: Private key not found, tenant_id:...`. After attaching to the container console, an IO error pops up when running `ls /app/api/storage/privatekeys`. It seems the "privatekeys" folder is corrupted.
This error actually happens every time I upgrade the Docker image version. One observation that might be relevant to this issue is that the tenant_id was regenerated on every upgrade.
To address the `libs.rsa.PrivkeyNotFoundError` and the issue with the `tenant_id` regenerating after upgrading Dify in your Kubernetes deployment, follow these steps:
- Verify PVC Configuration: Ensure the `api/storage/privatekeys` directory is correctly mounted to a Persistent Volume through a Persistent Volume Claim (see the YAML sketch right after this list). This is crucial for data persistence across pod restarts and upgrades.
- Regenerate Encryption Keys: If the "privatekeys" folder appears corrupted, you might need to regenerate the encryption keys. This can be done by executing a command within your Docker container to reset the encryption key pair. However, ensure you have backups of the original keys if needed for decrypting existing data.
- Handle `tenant_id` Regeneration: The regeneration of `tenant_id` during upgrades is likely due to a specific Alembic migration script. Modifying or removing the part of the script that causes `tenant_id` to change could prevent this issue, but be cautious as it might affect other functionalities.
- Backup and Upgrade: Always back up your data before performing an upgrade. After pulling the latest Dify Docker image and applying Kubernetes deployment changes, ensure that the PVCs are correctly referenced to maintain data persistence.
- Post-Upgrade Checks: After upgrading, verify the application's functionality and check that the `tenant_id` remains unchanged. Monitor the logs for any errors related to encryption keys and ensure they are functioning as expected.
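For the first point, here is a minimal sketch of what the PVC wiring could look like. All names (`dify-storage-pvc`, the `dify-api` Deployment, labels) and the storage size are hypothetical placeholders rather than official Dify manifests; the only grounded detail is the `/app/api/storage` mount path from the error above.

```yaml
# Hypothetical sketch: a PVC plus the volume/volumeMount wiring that keeps
# /app/api/storage (and therefore the privatekeys folder) persistent.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dify-storage-pvc          # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce               # node-local access; relevant to the fix further below
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dify-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dify-api
  template:
    metadata:
      labels:
        app: dify-api
    spec:
      containers:
        - name: dify-api
          image: langgenius/dify-api:0.6.7
          volumeMounts:
            - name: dify-storage
              mountPath: /app/api/storage   # parent of the privatekeys folder
      volumes:
        - name: dify-storage
          persistentVolumeClaim:
            claimName: dify-storage-pvc
```

Note that a ReadWriteOnce volume can only be mounted read-write by pods on a single node, which matters once dify-worker mounts the same claim (see the resolution at the end of the thread).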
For detailed guidance, refer to the Dify documentation. This approach should help mitigate the issues you're facing with upgrades and data persistence.
Sources
- en/getting-started/install-self-hosted/install-faq.md
- api/migrations/versions/2e9819ca5b28_add_tenant_id_in_api_token.py
- en/getting-started/install-self-hosted/environments.md
- docker/docker-compose.middleware.yaml
- api/.env.example
- api/libs/rsa.py
Maybe it's related to K8s itself. Since we test with compose files, no further steps are needed besides `docker compose pull`.
Under what conditions will the tenant_id be regenerated?
Problem solved. The root cause is that dify-api and dify-worker were deployed to different nodes, so one of the deployments couldn't access the path `/app/api/storage`.
For anyone who encounters this error while deploying on K8s, make sure dify-api and dify-worker are deployed to the same node. You can achieve this by setting "nodeSelector" in the deployment YAML file, for example as sketched below.
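For illustration, a rough sketch of that nodeSelector approach; the label key/value `dify-node: "true"` is made up, and the same block would need to be added to both the dify-api and dify-worker Deployments:

```yaml
# Hypothetical sketch: pin the workload to a labelled node so that dify-api
# and dify-worker land on the same machine and can share the storage path.
# Label the node first, e.g.: kubectl label node <node-name> dify-node=true
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dify-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dify-worker
  template:
    metadata:
      labels:
        app: dify-worker
    spec:
      nodeSelector:
        dify-node: "true"        # repeat this nodeSelector in the dify-api Deployment
      containers:
        - name: dify-worker
          image: langgenius/dify-api:0.6.7   # worker reuses the api image in the standard setup
```

Alternatively, backing the PVC with a ReadWriteMany-capable storage class (e.g. NFS) would let the two deployments share `/app/api/storage` even across nodes.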