
feat: azure blob storage support

Open khyurri opened this issue 7 months ago • 0 comments

This PR integrates Azure Blob Storage into BadgerDoc and eliminates the use of boto3, minio, and aioboto3 across all microservices. Additionally, it standardizes the storage configuration.

Migration from previous version

Backward Compatibility

  1. In the previous version, the signed URL passed from BadgerDoc to Airflow/Databricks was named s3_signed_url. The current version names it signed_url by default. To maintain backward compatibility, the JOBS_SIGNER_URL_KEY_NAME parameter sets the key under which the signed URL appears in the arguments passed to the job. For example, setting JOBS_SIGNER_URL_KEY_NAME=s3_signed_url retains the original parameter name.
  2. The parameter S3_PRE_SIGNED_EXPIRES_HOURS has been renamed to JOBS_SIGNED_URL_TTL, and its value is now given in minutes rather than hours.
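
The key renaming in item 1 can be pictured with a minimal sketch. This is not the code from this PR: the helper name `build_job_args` and the `env` argument are hypothetical, and only the JOBS_SIGNER_URL_KEY_NAME variable and the signed_url default come from the description above.

```python
import os

# Default key name described above; used when the variable is unset or empty.
DEFAULT_SIGNED_URL_KEY = "signed_url"


def build_job_args(signed_url, extra=None, env=None):
    """Return job arguments with the signed URL stored under the configured key.

    Hypothetical helper for illustration only. `env` defaults to os.environ
    and is a parameter so the behaviour can be checked without touching
    the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("JOBS_SIGNER_URL_KEY_NAME") or DEFAULT_SIGNED_URL_KEY
    args = dict(extra or {})
    args[key] = signed_url
    return args
```

With the variable unset, the URL is passed as signed_url; with JOBS_SIGNER_URL_KEY_NAME=s3_signed_url, existing Airflow/Databricks jobs that read s3_signed_url would keep working unchanged.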

.env migration

  1. Rename S3_PRE_SIGNED_EXPIRES_HOURS -> JOBS_SIGNED_URL_TTL and convert its value from hours to minutes (for example, 48 becomes 2880)
  2. Rename JOBS_RUN_PIPELINES_WITH_SIGNED_URL -> JOBS_SIGNED_URL_ENABLED
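
The renames above, together with the hours-to-minutes change, could be applied to an existing .env file with a small script along these lines. This is a sketch, not part of the PR: the function name `migrate_env` is hypothetical, and it assumes simple KEY=VALUE lines where the old TTL value is a plain integer.

```python
# Old key -> new key, as listed in the .env migration steps.
RENAMES = {
    "S3_PRE_SIGNED_EXPIRES_HOURS": "JOBS_SIGNED_URL_TTL",
    "JOBS_RUN_PIPELINES_WITH_SIGNED_URL": "JOBS_SIGNED_URL_ENABLED",
}


def migrate_env(text):
    """Return .env content with the old keys renamed.

    Assumption: the old TTL value is a bare integer number of hours;
    it is converted to minutes. All other lines are left untouched.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Keep blank lines, comments, and anything that is not KEY=VALUE.
        if not stripped or stripped.startswith("#") or "=" not in line:
            out.append(line)
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key not in RENAMES:
            out.append(line)
            continue
        if key == "S3_PRE_SIGNED_EXPIRES_HOURS":
            value = str(int(value.strip()) * 60)  # hours -> minutes
        out.append(f"{RENAMES[key]}={value}")
    return "\n".join(out)
```

For example, a line S3_PRE_SIGNED_EXPIRES_HOURS=48 would come out as JOBS_SIGNED_URL_TTL=2880.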

File migration from MinIO / S3 to Azure Blob Storage

-TBC-

Removed microservices

  • [ ] Remove convert microservice

Removed or skipped tests

-TBC-

Known issues

  • [x] Python 3.12 base image can't be built
  • [ ] Clear the build dir after building the Python 3.12 image
  • [ ] Azure and MinIO installations should automatically create containers or buckets on admin login

khyurri, Jul 18 '24 22:07