b2-sdk-python
Bucket.download_file_by_name fails when getting bucket with B2Api.get_bucket_by_id
After getting a Bucket instance by calling B2Api.get_bucket_by_id(), I'm unable to download files with bucket.download_file_by_name(). A minimal example of the problem:
from b2sdk.v1 import InMemoryAccountInfo, B2Api, DownloadDestLocalFile
app_id = '<app_id>'
app_key = '<app_key>'
bucket_id = '<bucket_id>'
bucket_name = '<bucket_name>'
b2 = B2Api(InMemoryAccountInfo())
b2.authorize_account('production', app_id, app_key)
bucket = b2.get_bucket_by_id(bucket_id)
# bucket = b2.get_bucket_by_name(bucket_name)
dst = DownloadDestLocalFile('local_file_name')
bucket.download_file_by_name('<b2_file_path>', dst)
If the commented line that calls get_bucket_by_name() is used instead, the snippet works as expected.
I tried both v1.0.2 from PyPI and an install directly from master (more specifically, commit 1707190972a5d6807c7d928dd5d295dce6920915).
The problem seems to be that download_file_by_name() assumes the Bucket's name attribute is always set, which does not hold when the bucket is instantiated with B2Api.get_bucket_by_id().
Is this a bug or have I missed something in the documentation?
It is a bug. Since you have figured out what it is caused by, perhaps you'd like to take a stab at fixing it?
I could try, but I'm not sure what the "correct" way to fix it is. From what I can tell, the bucket name is needed to build the URL for downloading a file by name? What would be the best way to fetch the name, given that the current Bucket instance only has its id attribute?
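To illustrate why the name matters: a download-by-name URL has the form `<download_url>/file/<bucket_name>/<file_name>`, so it cannot be built from the bucket ID alone. The helper below is only a sketch of that shape, not the SDK's actual code; `download_url_by_name` and the example base URL are made up for illustration.

```python
from urllib.parse import quote


def download_url_by_name(download_base_url, bucket_name, file_name):
    """Build a download-by-name URL (illustrative helper, not part of b2sdk)."""
    if not bucket_name:
        # This is the situation a Bucket created from an ID alone ends up in.
        raise ValueError("bucket name is required to build a download-by-name URL")
    # Keep '/' so that "folders" in the file name stay readable in the URL.
    encoded_name = quote(file_name, safe="/")
    return "{}/file/{}/{}".format(download_base_url.rstrip("/"), bucket_name, encoded_name)
```

With a name the helper returns a full URL; with `bucket_name=None` it raises instead of silently producing a URL containing "None".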
The solution I can think of off the top of my head is to call list_buckets() and iterate to find the ID but that feels a bit icky. I'm not deeply familiar with the API so maybe there's a better way? Another idea that came to mind was to call get_allowed() on the account info and get the bucket name that way but I'm guessing that will only work if the key is restricted to a bucket?
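Roughly, the list_buckets() idea would look like the sketch below. It assumes the Bucket objects returned by `list_buckets()` expose `id_` and `name` attributes; `find_bucket_by_id` is a made-up helper name, and the lookup costs one API call.

```python
def find_bucket_by_id(api, bucket_id):
    """Return the bucket whose ID matches, with its name populated.

    `api` is expected to behave like B2Api: list_buckets() returns
    objects with `id_` and `name` attributes (assumption).
    """
    for bucket in api.list_buckets():
        if bucket.id_ == bucket_id:
            return bucket
    raise LookupError("no bucket with id {!r} is visible to this key".format(bucket_id))
```

In the snippet from the report, `bucket = find_bucket_by_id(b2, bucket_id)` would then replace the `get_bucket_by_id()` call as a workaround.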
Another thing: the docstring of get_bucket_by_id() specifically mentions that no API call is necessary when it's called. If the solution is to call list_buckets() when self.name is None in download_file_by_name(), perhaps the docstring should be updated to note that an extra API call will be necessary when downloading files by name?
We might try to fetch the name from the local cache, it should be there. If it's not, then we could check if it's in "allowed", but that will be blank in some cases, as you mentioned. Finally, we should refresh the bucket-to-id mapping and cache it.
This requires changing the cache (account_info, really) interface though: the current implementation does not provide an id-to-name mapping (only name-to-id). It would be nice if we could figure out a way to do this without breaking every non-sdk implementation of cache/account_info (maybe fall back to a list_buckets call that causes a cache refresh if the cache implementation is not capable of reverse mapping?).
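A rough sketch of that lookup order, to make the proposal concrete. Everything here is an assumption rather than the existing interface: the reverse-lookup method name `get_bucket_name_or_none_from_bucket_id` is hypothetical, `get_allowed()` is assumed to return a dict with `bucketId` and `bucketName` keys, and buckets are assumed to expose `id_` and `name`. Probing for the method with `getattr` is one way to avoid breaking cache implementations that lack it.

```python
def resolve_bucket_name(bucket_id, cache, account_info, api):
    """Try the cache, then "allowed", then a list_buckets() refresh."""
    # 1. Local cache, only if the implementation supports reverse mapping.
    #    The method name is hypothetical; older caches simply won't have it.
    reverse_lookup = getattr(cache, "get_bucket_name_or_none_from_bucket_id", None)
    if callable(reverse_lookup):
        name = reverse_lookup(bucket_id)
        if name is not None:
            return name

    # 2. "allowed" from account info; blank unless the key is bucket-restricted.
    allowed = account_info.get_allowed() or {}
    if allowed.get("bucketId") == bucket_id and allowed.get("bucketName"):
        return allowed["bucketName"]

    # 3. Last resort: one API call. In the real SDK this call would also
    #    be the point where the cache gets refreshed.
    for bucket in api.list_buckets():
        if bucket.id_ == bucket_id:
            return bucket.name
    return None
```

The function returns None when the bucket is not visible at all, leaving it to the caller to decide which error to raise.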