badgerdoc
feat: azure blob storage support
This PR integrates Azure Blob Storage into BadgerDoc and eliminates the use of boto3, minio, and aioboto3 across all microservices. Additionally, it standardizes the storage configuration.
Migration from previous version
Backward Compatibility
- In the previous version, when a signed URL was passed from BadgerDoc to Airflow/Databricks, the parameter was named `s3_signed_url`. The current version renames this parameter to `signed_url` by default. However, to maintain backward compatibility, the parameter `JOBS_SIGNER_URL_KEY_NAME` is used to rename the signed URL key in the arguments passed. For example, setting `JOBS_SIGNER_URL_KEY_NAME=s3_signed_url` retains the original parameter name.
- The parameter `S3_PRE_SIGNED_EXPIRES_HOURS` has been renamed to `JOBS_SIGNED_URL_TTL`, and its values are now set in minutes rather than hours.
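The two settings above could be consumed roughly as follows. This is a minimal sketch, not the code from this PR: the helper names `signed_url_settings` and `build_job_args` are hypothetical, and the fallback TTL of 60 minutes is an assumption made only so the example runs without configuration.

```python
import os
from datetime import timedelta


def signed_url_settings(env=os.environ):
    """Read the signed-URL settings from an environment mapping.

    Hypothetical helper for illustration only.
    """
    # The key defaults to "signed_url"; set JOBS_SIGNER_URL_KEY_NAME=s3_signed_url
    # to keep the name used by the previous version.
    key_name = env.get("JOBS_SIGNER_URL_KEY_NAME", "signed_url")
    # JOBS_SIGNED_URL_TTL is expressed in minutes.
    # The fallback of 60 is an assumption, not a documented default.
    ttl = timedelta(minutes=int(env.get("JOBS_SIGNED_URL_TTL", "60")))
    return key_name, ttl


def build_job_args(signed_url, env=os.environ):
    """Build the arguments passed to Airflow/Databricks (hypothetical shape)."""
    key_name, _ttl = signed_url_settings(env)
    return {key_name: signed_url}
```

With `JOBS_SIGNER_URL_KEY_NAME=s3_signed_url`, existing pipelines that read `s3_signed_url` should keep working without changes on their side.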
.env migration
- Rename `S3_PRE_SIGNED_EXPIRES_HOURS` -> `JOBS_SIGNED_URL_TTL`
- Rename `JOBS_SIGNED_URL_ENABLED` <- `JOBS_RUN_PIPELINES_WITH_SIGNED_URL` (that is, `JOBS_RUN_PIPELINES_WITH_SIGNED_URL` -> `JOBS_SIGNED_URL_ENABLED`)
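The renames above can be applied to an existing `.env` file with a small script. The sketch below is illustrative and not part of this PR: the function names are hypothetical, and it assumes the old `S3_PRE_SIGNED_EXPIRES_HOURS` value is a plain integer, which it multiplies by 60 because the new setting is in minutes.

```python
# Old key -> new key, as listed in the .env migration steps.
RENAMES = {
    "S3_PRE_SIGNED_EXPIRES_HOURS": "JOBS_SIGNED_URL_TTL",
    "JOBS_RUN_PIPELINES_WITH_SIGNED_URL": "JOBS_SIGNED_URL_ENABLED",
}


def migrate_env_line(line: str) -> str:
    """Rename a single KEY=VALUE line; other lines are returned unchanged."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return line
    key, value = stripped.split("=", 1)
    key = key.strip()
    if key not in RENAMES:
        return line
    if key == "S3_PRE_SIGNED_EXPIRES_HOURS":
        # Hours -> minutes. Assumes an integer value; int() raises ValueError otherwise.
        value = str(int(value.strip()) * 60)
    return f"{RENAMES[key]}={value}"


def migrate_env_text(text: str) -> str:
    """Apply the renames to the full contents of a .env file."""
    return "\n".join(migrate_env_line(line) for line in text.splitlines())
```

For example, `migrate_env_text(open(".env").read())` returns the migrated contents, which can be reviewed before overwriting the file.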
Files migration from Minio / S3 into Azure Blob Storage
-TBC-
Removed microservices
- [ ] Remove convert microservice
Removed or skipped tests
-TBC-
Known issues
- [x] Python3.12 base image can't be built
- [ ] Clear build dir after building Python3.12 image
- [ ] Azure and Minio installations should automatically create containers or buckets upon an admin login