modelstore
Anonymous access of GCP bucket fails with `ValueError: Anonymous credentials cannot be refreshed.`
Affects modelstore 0.0.74.
To reproduce:
# create a new environment (Python 3.8)
python -m venv env
source env/bin/activate
# install modelstore and GCP CLI
pip install modelstore google-cloud-storage
python
Python 3.8.8 (default, Apr 4 2021, 16:02:17)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from modelstore import ModelStore
>>> model_store = ModelStore.from_gcloud(bucket_name="xai-demo-models")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/modelstore/model_store.py", line 90, in from_gcloud
return ModelStore(
File "<string>", line 4, in __init__
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/modelstore/model_store.py", line 105, in __post_init__
if not self.storage.validate():
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/modelstore/storage/gcloud.py", line 128, in validate
if not self.bucket.exists():
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/storage/bucket.py", line 843, in exists
client._get_resource(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/storage/client.py", line 366, in _get_resource
return self._connection.api_request(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/storage/_http.py", line 73, in api_request
return call()
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
return retry_target(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/api_core/retry.py", line 190, in retry_target
return target()
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 482, in api_request
response = self._make_request(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 341, in _make_request
return self._do_request(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 379, in _do_request
return self.http.request(
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/auth/transport/requests.py", line 526, in request
self.credentials.refresh(auth_request)
File "/home/kilian/Documents/Ulm/GitHub/modelstorereplica/env/lib/python3.8/site-packages/google/auth/credentials.py", line 173, in refresh
raise ValueError("Anonymous credentials cannot be refreshed.")
ValueError: Anonymous credentials cannot be refreshed.
I remember encountering and resolving this issue while working on #142. We should have a look at the changes introduced by #161.
Output of `pip freeze`:
cachetools==5.0.0
certifi==2021.10.8
charset-normalizer==2.0.12
click==8.1.3
gitdb==4.0.9
GitPython==3.1.27
google-api-core==2.7.3
google-auth==2.6.6
google-cloud-core==2.3.0
google-cloud-storage==2.3.0
google-crc32c==1.3.0
google-resumable-media==2.3.2
googleapis-common-protos==1.56.0
idna==3.3
joblib==1.1.0
modelstore==0.0.74
numpy==1.22.3
protobuf==3.20.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
requests==2.27.1
rsa==4.8
six==1.16.0
smmap==5.0.0
tqdm==4.64.0
urllib3==1.26.9
Thanks for reporting! I'll try to investigate soon, but I'm away at the moment. If you spot anything in the second PR you mentioned, please let me know.
@ionicsolutions Can I use the "xai-demo-models" bucket for testing as well? I'm going to re-run your code above. Otherwise I'll create a testing-only public GCS container.
@nlathia Sure, go ahead and use it for now! It contains one model in one domain.
Just to log my investigation --
When trying to replicate this, the first error I ran into was because I have some environment variables set for GCP (which `modelstore` retrieves here), and this led to a slightly different exception:
raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/storage/v1/b/xai-demo-models?projection=noAcl&prettyPrint=false: <service-account-name> does not have storage.buckets.get access to the Google Cloud Storage bucket.
But when I removed those environment variables, I was able to replicate this:
raise ValueError("Anonymous credentials cannot be refreshed.")
ValueError: Anonymous credentials cannot be refreshed.
Similar errors have been reported here:
- https://github.com/mlflow/mlflow/issues/2925
- https://github.com/googleapis/python-storage/issues/102
I've managed to reproduce this error without `modelstore`. It is triggered when `bucket.exists()` is called, which is what we use in `modelstore` when `validate()`'ing that the GCP storage can be used.
Python 3.8.12 (default, Mar 24 2022, 23:17:02)
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from google.cloud import storage
>>> bucket_name = "xai-demo-models"
>>> client = storage.Client.create_anonymous_client()
>>> bucket = client.bucket(bucket_name=bucket_name)
>>> bucket.exists()
[...]
File "/Users/neallathia/.pyenv/versions/modelstore-dev-3-8-12/lib/python3.8/site-packages/google/auth/credentials.py", line 173, in refresh
raise ValueError("Anonymous credentials cannot be refreshed.")
ValueError: Anonymous credentials cannot be refreshed.
I believe the problem is that the `bucket.exists()` function is not enabled for anonymous clients. From the docs:
Such a client has only limited access to “public” buckets: listing their contents and downloading their blobs.
And I don't get any errors there:
>>> iterator = client.list_blobs(bucket_name)
>>> for i in iterator:
... print(i.name)
...
operatorai-model-store/domains/visual-inspection.json
operatorai-model-store/visual-inspection/2022/03/04/15:01:29/artifacts.tar.gz
operatorai-model-store/visual-inspection/versions/212ec479-f565-4440-aad2-c5f8d2b7d4f1.json
This is also the big difference between the first PR, where I suggested using `exists()`, and the second PR, where I changed the validate function to use `exists()`.
Update: the `exists()` function does appear to work for bucket names that don't exist:
>>> bucket_name = "a-bucket-that-does-not-exist"
>>> client = storage.Client.create_anonymous_client()
>>> bucket = client.bucket(bucket_name=bucket_name)
>>> bucket.exists()
False
Okay, I think that this PR has the fix (based on the above):
- https://github.com/operatorai/modelstore/pull/176
Comments welcome & thanks for raising this again @ionicsolutions.
In short: I try `exists()`; if that fails with a `ValueError`, I try `list_blobs()`; if that fails with `NotFound`, then the validation fails.
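For anyone reading along, the fallback described above could look roughly like the sketch below. This is not the code from the PR: `validate_bucket` is a hypothetical helper name, and the stand-in `NotFound` class is only there so the snippet still runs when the Google libraries are not installed.

```python
try:
    from google.api_core.exceptions import NotFound
except ImportError:
    # Stand-in for illustration only, used when google-api-core is missing.
    class NotFound(Exception):
        pass


def validate_bucket(client, bucket_name):
    """Return True if the bucket looks usable, False otherwise.

    Hypothetical helper sketching the exists() -> list_blobs() fallback;
    `client` is expected to behave like google.cloud.storage.Client.
    """
    bucket = client.bucket(bucket_name)
    try:
        # Works for authenticated clients, and returned False for a
        # non-existent bucket in the anonymous test above.
        return bool(bucket.exists())
    except ValueError:
        # Anonymous credentials cannot be refreshed: fall through and
        # try listing instead, which anonymous clients may do on
        # public buckets.
        pass
    try:
        # list_blobs() is lazy, so pull at most one item to force the request.
        next(iter(client.list_blobs(bucket_name, max_results=1)), None)
        return True
    except NotFound:
        return False
```

With the anonymous client from the reproduction above, this would be called as `validate_bucket(storage.Client.create_anonymous_client(), "xai-demo-models")`. Other failures (for example a permissions error while listing) are deliberately left to propagate in this sketch.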
Just to confirm, this is how it looks for me now!
modelstore-dev-3-8-12 ❯ python
Python 3.8.12 (default, Mar 24 2022, 23:17:02)
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from modelstore import ModelStore
>>> model_store = ModelStore.from_gcloud(bucket_name="xai-demo-models")
IPython could not be loaded!
pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
>>> model_store.list_domains()
['visual-inspection']
>>> model_store.list_models("visual-inspection")
['212ec479-f565-4440-aad2-c5f8d2b7d4f1']
>>> model_store.get_model_info("visual-inspection", "212ec479-f565-4440-aad2-c5f8d2b7d4f1")
{'model': {'domain': 'visual-inspection', 'model_id': '212ec479-f565-4440-aad2-c5f8d2b7d4f1', 'model_type': {'library': 'tensorflow', ...
Thanks for solving this issue so quickly! I can confirm that it works with the latest `main` :-)
✅ This was released as part of modelstore==0.0.75
- https://github.com/operatorai/modelstore/pull/201
- https://pypi.org/project/modelstore/0.0.75/
Let me know if you see any other issues!