[query] validation system needs to gracefully handle public access buckets

Open: danking opened this issue 4 months ago • 0 comments

What happened?

Public-access buckets typically grant the following IAM binding:

- members:
  - allUsers
  role: roles/storage.objectViewer

which permits

resourcemanager.projects.get
resourcemanager.projects.list
storage.managedFolders.get
storage.managedFolders.list
storage.objects.get
storage.objects.list

but notably excludes

storage.buckets.get

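That permission is required to read bucket metadata such as the storage class, so the validation step that checks the storage class fails with a 403 on public-access buckets.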

Reported here: https://hail.zulipchat.com/#narrow/stream/123010-Hail-Query-0.2E2-support/topic/No.20storage.2Ebuckets.2Eget.20access.20to.20gs.3A.2F.2Fhail-common
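
One way the validation could degrade gracefully, sketched under assumptions: treat a 403 on the bucket metadata request as "storage class unknown" and let the read proceed, rather than failing. Everything named below (`HttpStatusError`, `is_hot_storage_lenient`, the `bucket_info` callable, and the set of "hot" storage classes) is hypothetical and only stands in for whatever the real client uses; this is not the actual hailtop code.

```python
import asyncio
from typing import Awaitable, Callable


class HttpStatusError(Exception):
    """Hypothetical stand-in for an HTTP client error that carries a status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message)
        self.status = status


# Assumption: which storage classes count as "hot" is a guess for illustration.
HOT_STORAGE_CLASSES = {"standard", "regional", "multi_regional"}


async def is_hot_storage_lenient(
    bucket_info: Callable[[str], Awaitable[dict]], bucket: str
) -> bool:
    """Return False only when the bucket is known to use a cold storage class.

    `bucket_info` is any async callable that returns bucket metadata as a dict
    with a "storageClass" key, or raises HttpStatusError.
    """
    try:
        info = await bucket_info(bucket)
    except HttpStatusError as err:
        if err.status == 403:
            # The caller lacks storage.buckets.get (typical for public-access
            # buckets), so the storage class is unknown. Do not block the read.
            return True
        raise  # Any other failure is still reported to the caller.
    return info["storageClass"].lower() in HOT_STORAGE_CLASSES


if __name__ == "__main__":
    async def public_bucket_info(bucket: str) -> dict:
        # Simulates the 403 from the traceback in this report.
        raise HttpStatusError(403, "storage.buckets.get denied")

    print(asyncio.run(is_hot_storage_lenient(public_bucket_info, "hail-common")))
```

The trade-off is that a cold-storage bucket the caller cannot inspect would pass validation silently; logging a warning in the 403 branch might be a reasonable middle ground.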

Version

0.2.127

Relevant log output

Example code:

rg37.add_sequence(
    "gs://hail-common/references/human_g1k_v37.fasta.gz",
    "gs://hail-common/references/human_g1k_v37.fasta.fai"
)
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.127-bb535cd096c5
LOGGING: writing to /Users/mkanai/Dropbox/Workspace/github.com/mkanai/immune_v2f/python/hail-20240214-1046-0.2.127-bb535cd096c5.log
Traceback (most recent call last):
  File "/Users/mkanai/Dropbox/Workspace/github.com/mkanai/immune_v2f/python/annotate_base_editing_variants.py", line 21, in <module>
    rg37.add_sequence(
  File "<decorator-gen-34>", line 2, in add_sequence
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hail/typecheck/check.py", line 584, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hail/genetics/reference_genome.py", line 390, in add_sequence
    Env.backend().add_sequence(self.name, fasta_file, index_file)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hail/backend/service_backend.py", line 548, in add_sequence
    self.validate_file(blob)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hail/backend/service_backend.py", line 337, in validate_file
    validate_file(uri, self._async_fs, validate_scheme=True)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiotools/validators.py", line 19, in validate_file
    return hail_event_loop().run_until_complete(
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/nest_asyncio.py", line 99, in run_until_complete
    return f.result()
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/asyncio/tasks.py", line 256, in __step
    result = coro.send(None)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiotools/validators.py", line 38, in _async_validate_file
    if not await fs.is_hot_storage(location):
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/aiogoogle/client/storage_client.py", line 630, in is_hot_storage
    return (await self._storage_client.bucket_info(location))["storageClass"].lower() in (
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/aiogoogle/client/storage_client.py", line 333, in bucket_info
    return await self.get(f'/b/{bucket}', **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/common/base_client.py", line 25, in get
    return await self.request('GET', *args, **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/common/base_client.py", line 21, in request
    async with await self._session.request(method, url, **kwargs) as resp:
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/common/session.py", line 103, in request
    return await retry_transient_errors(self._request_with_valid_authn, method, url, **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/utils/utils.py", line 769, in retry_transient_errors
    return await retry_transient_errors_with_debug_string('', 0, f, *args, **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/utils/utils.py", line 785, in retry_transient_errors_with_debug_string
    return await f(*args, **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/aiocloud/common/session.py", line 115, in _request_with_valid_authn
    return await self._http_session.request(method, url, **kwargs)
  File "/Users/mkanai/.anyenv/envs/pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/hailtop/httpx.py", line 138, in request_and_raise_for_status
    raise ClientResponseError(
hailtop.httpx.ClientResponseError: 403, message='Forbidden', url=URL('https://storage.googleapis.com/storage/v1/b/hail-common?userProject=finngen-xavier') body='{\n  "error": {\n    "code": 403,\n    "message": "[email protected] does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission \'storage.buckets.get\' denied on resource (or it may not exist).",\n    "errors": [\n      {\n        "message": "[email protected] does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission \'storage.buckets.get\' denied on resource (or it may not exist).",\n        "domain": "global",\n        "reason": "forbidden"\n      }\n    ]\n  }\n}\n'

danking, Feb 14 '24 15:02