Incorrect Authorization Header Timestamp Causing Azure Blob Access Failure After 1 Hour

Open zolgear opened this issue 1 year ago • 3 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.9

Cloud or Self Hosted

Self Hosted (Docker), Self Hosted (Source)

Steps to reproduce

Summary

When using STORAGE_TYPE: azure-blob to upload files to Azure Blob Storage, access to the blob fails after one hour from the container start time due to an invalid timestamp in the authorization header.

Steps to Reproduce

  1. Start a container with STORAGE_TYPE: azure-blob configured.
  2. Upload a file to Azure Blob Storage.
  3. Wait for one hour from the container start time.
  4. Attempt to upload another file to Azure Blob Storage.

Observed Behavior

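After one hour from the container start time, the API container fails to access Azure Blob Storage due to an authentication error. The error log shows that the signed expiry time in the authorization header is already in the past when the request is sent: it sits roughly one hour after the container start time rather than one hour after the current time, so Azure rejects the request with a 403 authentication error and the upload endpoint returns 500.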

Error Log

INFO:werkzeug:172.18.0.9 - - [29/May/2024 05:40:27] "GET /console/api/version?current_version=0.6.9 HTTP/1.1" 200 -
ERROR:app:Exception on /console/api/files/upload [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/usr/local/lib/python3.10/site-packages/flask_restful/__init__.py", line 489, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flask/views.py", line 110, in view
    return current_app.ensure_sync(self.dispatch_request)(**kwargs)  # type: ignore[no-any-return]
  File "/usr/local/lib/python3.10/site-packages/flask_restful/__init__.py", line 604, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/app/api/controllers/console/setup.py", line 86, in decorated
    return view(*args, **kwargs)
  File "/app/api/libs/login.py", line 91, in decorated_view
    return current_app.ensure_sync(func)(*args, **kwargs)
  File "/app/api/controllers/console/wraps.py", line 21, in decorated
    return view(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flask_restful/__init__.py", line 696, in wrapper
    resp = f(*args, **kwargs)
  File "/app/api/controllers/console/wraps.py", line 80, in decorated
    return view(*args, **kwargs)
  File "/app/api/controllers/console/datasets/file.py", line 55, in post
    upload_file = FileService.upload_file(file, current_user)
  File "/app/api/services/file_service.py", line 70, in upload_file
    storage.save(file_key, file_content)
  File "/app/api/extensions/ext_storage.py", line 39, in save
    self.storage_runner.save(filename, data)
  File "/app/api/extensions/storage/azure_storage.py", line 29, in save
    blob_container.upload_blob(filename, data)
  File "/usr/local/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/azure/storage/blob/_container_client.py", line 941, in upload_blob
    blob.upload_blob(
  File "/usr/local/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/azure/storage/blob/_blob_client.py", line 713, in upload_blob
    return upload_block_blob(**options)
  File "/usr/local/lib/python3.10/site-packages/azure/storage/blob/_upload_helpers.py", line 168, in upload_block_blob
    process_storage_error(error)
  File "/usr/local/lib/python3.10/site-packages/azure/storage/blob/_shared/response_handlers.py", line 177, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
azure.core.exceptions.ClientAuthenticationError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:5ae2bc27-a01e-0002-148a-b1a884000000
Time:2024-05-29T05:41:59.7021983Z
ErrorCode:AuthenticationFailed
authenticationerrordetail:Signed expiry time [Wed, 29 May 2024 05:21:20 GMT] has to be after signed start time [Wed, 29 May 2024 05:41:59 GMT]
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:5ae2bc27-a01e-0002-148a-b1a884000000
Time:2024-05-29T05:41:59.7021983Z</Message><AuthenticationErrorDetail>Signed expiry time [Wed, 29 May 2024 05:21:20 GMT] has to be after signed start time [Wed, 29 May 2024 05:41:59 GMT]</AuthenticationErrorDetail></Error>
INFO:werkzeug:172.18.0.9 - - [29/May/2024 05:41:59] "POST /console/api/files/upload?source=datasets HTTP/1.1" 500 -

Container Start Time

"StartedAt": "2024-05-29T04:20:39.885118898Z"

Additional Information

  • Debug logs indicate that the expiry time (se parameter) in the BLOB access URL is consistently set to a past timestamp.
  • Restarting the container temporarily resolves the issue, but it reoccurs after one hour.
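To check the expiry carried by a blob access URL, the `se` query parameter can be parsed and compared with the current time. The following is a minimal sketch using only the standard library. The URL is a hypothetical placeholder (account, container, blob, `sv`, and `sig` values are made up), and it assumes `se` is formatted as `YYYY-MM-DDTHH:MM:SSZ`.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry(url: str) -> datetime:
    """Return the `se` (signed expiry) value of a SAS URL as an aware UTC datetime."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also decodes %3A back to ':'
    raw = query["se"][0]  # assumed format: "2024-05-29T05:21:20Z"
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_expired(url: str, now: datetime) -> bool:
    """True when the signed expiry is not after `now`."""
    return sas_expiry(url) <= now


# Hypothetical URL: every value except the `se` timestamp is a placeholder.
example_url = (
    "https://example.blob.core.windows.net/container/blob.txt"
    "?sv=2023-11-03&se=2024-05-29T05%3A21%3A20Z&sp=rw&sig=PLACEHOLDER"
)
request_time = datetime(2024, 5, 29, 5, 41, 59, tzinfo=timezone.utc)
print(sas_expiry(example_url), is_expired(example_url, request_time))
```

Logging the result of `is_expired` next to each storage call would show whether the URL was already stale before the request left the container.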

Suggested Fix

Investigate and correct the logic that sets the expiry time in the authorization header to ensure it is always set to a future time relative to the current time, not based on the container start time.
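The timestamps quoted in this report are consistent with an expiry computed once at startup. The sketch below is only arithmetic on those values, truncated to whole seconds. It does not inspect Dify's code.

```python
from datetime import datetime, timezone

# Values copied from this report, truncated to whole seconds.
container_start = datetime(2024, 5, 29, 4, 20, 39, tzinfo=timezone.utc)
signed_expiry = datetime(2024, 5, 29, 5, 21, 20, tzinfo=timezone.utc)
request_time = datetime(2024, 5, 29, 5, 41, 59, tzinfo=timezone.utc)

# Gap between container start and the signed expiry.
start_to_expiry = signed_expiry - container_start
# How long the signature had already been expired when the upload was attempted.
expired_for = request_time - signed_expiry

print(start_to_expiry)  # 1:00:41
print(expired_for)      # 0:20:39
```

The expiry falls about one hour after container start, and the failing request arrives about 20 minutes after that expiry.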

✔️ Expected Behavior

The authorization header should have the correct expiry time set, allowing continuous access to Azure Blob Storage without failure, even after one hour from the container start time.

❌ Actual Behavior

When creating a knowledge entry, an error occurs when dragging and dropping a file.

zolgear avatar May 29 '24 06:05 zolgear

To resolve the issue where the authorization header timestamp for Azure Blob Storage becomes incorrect one hour after the container starts, causing access failures in a self-hosted environment with Dify version 0.6.9, you should ensure that the system time of your server is correctly synchronized with a reliable time source. This can be done by configuring NTP (Network Time Protocol) on your server.

Here is an example of how to configure NTP on a Linux server:

  1. Install NTP:

    sudo apt-get update
    sudo apt-get install ntp
    
  2. Configure NTP: Edit the NTP configuration file /etc/ntp.conf to ensure it includes reliable NTP servers. For example:

    server 0.pool.ntp.org iburst
    server 1.pool.ntp.org iburst
    server 2.pool.ntp.org iburst
    server 3.pool.ntp.org iburst
    
  3. Restart NTP service:

    sudo systemctl restart ntp
    
  4. Verify NTP synchronization:

    ntpq -p
    

Additionally, ensure that the SAS token generation in your AzureStorage class is correctly configured to handle the expiry time. The expiry parameter in the generate_account_sas function is set to one hour from the current time, which means the token will expire after one hour. You may need to refresh the SAS token periodically to avoid access failures.

Here is the relevant code snippet from api/extensions/storage/azure_storage.py:

sas_token = generate_account_sas(
    account_name=app_config.get('AZURE_BLOB_ACCOUNT_NAME'),
    account_key=app_config.get('AZURE_BLOB_ACCOUNT_KEY'),
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, list=True, add=True, create=True),
    expiry=datetime.now(timezone.utc).replace(tzinfo=None) + timedelta(hours=1)
)

By ensuring that your server's time is correctly synchronized and considering the SAS token expiry, the authorization header timestamp for Azure Blob Storage should remain accurate, preventing access failures [1][2].
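One way to act on the refresh suggestion is to cache the token together with its expiry and regenerate it shortly before it lapses. The sketch below is illustrative only: `RefreshingToken`, `make_token`, `margin`, and `clock` are hypothetical names, not part of Dify or the Azure SDK, and the factory is a stub so that no Azure call is made. In a real integration the factory could wrap `generate_account_sas` and pass the given expiry through.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


class RefreshingToken:
    """Caches a token and regenerates it shortly before it expires.

    `make_token` is a hypothetical factory: given an expiry datetime, it
    returns a token string (for example, by calling generate_account_sas).
    """

    def __init__(
        self,
        make_token: Callable[[datetime], str],
        lifetime: timedelta = timedelta(hours=1),
        margin: timedelta = timedelta(minutes=5),
        clock: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self._make_token = make_token
        self._lifetime = lifetime
        self._margin = margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expiry: Optional[datetime] = None

    def get(self) -> str:
        now = self._clock()
        # Regenerate when there is no token yet or it is within `margin` of expiry.
        if self._token is None or self._expiry is None or now >= self._expiry - self._margin:
            self._expiry = now + self._lifetime
            self._token = self._make_token(self._expiry)
        return self._token


# Demo with a stub factory and a controllable clock (no Azure calls involved).
fake_now = [datetime(2024, 5, 29, 4, 20, 39, tzinfo=timezone.utc)]
tokens = RefreshingToken(lambda exp: f"se={exp.isoformat()}", clock=lambda: fake_now[0])
first = tokens.get()
fake_now[0] += timedelta(minutes=30)
second = tokens.get()  # still valid, so the cached token is reused
fake_now[0] += timedelta(minutes=40)
third = tokens.get()   # past the refresh margin, so a new token is made
print(first == second, first == third)
```

The sketch is not thread-safe. A shared instance used from several workers would need a lock around `get`, or a shared cache.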

About Dosu This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

dosubot[bot] avatar May 29 '24 06:05 dosubot[bot]

@dosu I also suspected that the sas_token in the provided code was the cause and tried changing the expiry time, but the timestamp of the parameter didn't change, so I couldn't fix it.

zolgear avatar May 29 '24 07:05 zolgear

To ensure the expiry time in the SAS token is correctly set to a future time relative to the current time, you should verify that the datetime.now(timezone.utc) function is being called correctly and that the system time is synchronized. Here is the relevant code snippet for generating the SAS token:

from datetime import datetime, timedelta, timezone

sas_token = generate_account_sas(
    account_name=app_config.get('AZURE_BLOB_ACCOUNT_NAME'),
    account_key=app_config.get('AZURE_BLOB_ACCOUNT_KEY'),
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, list=True, add=True, create=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1)
)

This code sets the expiry time to one hour from the current time. If the timestamp is not updating correctly, consider the following steps:

  1. Check System Time Synchronization: Ensure your server's system time is synchronized with a reliable time source using NTP. This can prevent discrepancies in time calculations.

  2. Debugging: Add logging to verify the current time and the calculated expiry time when generating the SAS token. For example:

    import logging
    
    current_time = datetime.now(timezone.utc)
    expiry_time = current_time + timedelta(hours=1)
    logging.debug(f"Current time: {current_time}, Expiry time: {expiry_time}")
    
    sas_token = generate_account_sas(
        account_name=app_config.get('AZURE_BLOB_ACCOUNT_NAME'),
        account_key=app_config.get('AZURE_BLOB_ACCOUNT_KEY'),
        resource_types=ResourceTypes(service=True, container=True, object=True),
        permission=AccountSasPermissions(read=True, write=True, delete=True, list=True, add=True, create=True),
        expiry=expiry_time
    )
    
  3. Environment Configuration: Ensure that the environment variables and configurations are correctly set and that there are no overrides affecting the time calculations.

By following these steps, you can diagnose and ensure that the expiry time is correctly set, preventing access failures due to incorrect timestamps [1].

dosubot[bot] avatar May 29 '24 07:05 dosubot[bot]

refs #4911

I have confirmed the fix in version 0.6.10. Thank you for the code changes!

zolgear avatar Jun 07 '24 06:06 zolgear