Requester pays with --mount
I'm trying to use dsub's --mount parameter to mount a read-only bucket hosted on GCP. I don't own this bucket, and the owner has requester pays enabled. I've provided --user-project, which I believe tells dsub to pass that billable project along with all relevant Google calls. Is this information not being provided to gcsfuse?
My command (some parameters omitted for clarity):
dsub \
  --provider google-cls-v2 \
  --user-project "${PROJECT}" \
  --project "${PROJECT}" \
  ...
  --mount FILES=gs://fc-aou-datasets-controlled
Output seems to indicate that the billable project (-u) isn't provided to gcsfuse.
Opening GCS connection...
Using mount point: /mnt/data/mount/gs/fc-aou-datasets-controlled
WARNING: gcsfuse invoked as root. This will cause all files to be owned by
root. If this is not what you intended, invoke gcsfuse as the user that will
be interacting with the file system.
Opening bucket...
Mounting file system...
WARNING, bucket doesn't appear to work: googleapi: Error 400: Bucket is a requester pays bucket but no user project provided., required
Thanks for your attention
Thanks for reporting this @ccario83 !
Yes, it looks like dsub needs to be updated to pass the gcsfuse --billing-project flag when the dsub --user-project flag has been provided. The change would be here:
actions_to_add.extend([
    google_v2_pipelines.build_action(
        name='mount-{}'.format(bucket),
        enable_fuse=True,
        run_in_background=True,
        image_uri=_GCSFUSE_IMAGE,
        mounts=[mnt_datadisk],
        commands=[
            '--implicit-dirs', '--foreground', '-o ro', bucket,
            os.path.join(_DATA_MOUNT_POINT, mount_path)
        ]),
and would look something like:
mount_command = ['--billing-project', user_project] if user_project else []
mount_command.extend([
    '--implicit-dirs', '--foreground', '-o ro', bucket,
    os.path.join(_DATA_MOUNT_POINT, mount_path)
])
actions_to_add.extend([
    google_v2_pipelines.build_action(
        name='mount-{}'.format(bucket),
        enable_fuse=True,
        run_in_background=True,
        image_uri=_GCSFUSE_IMAGE,
        mounts=[mnt_datadisk],
        commands=mount_command),
We'll look to get that into the next release.
Awesome, thank you for the support!
Just to add to this, I manually inserted a "--billing-project" flag, with my project ID hardcoded, into that section of the dsub code and was able to successfully mount the same bucket (fc-aou-datasets-controlled) on a dsub VM. However, trying to list the contents of the directory leads to an Input/output error. I'm assuming it's permissions related, but I'm not really sure how to proceed.
2023-11-13 20:18:20 INFO: mkdir -m 777 -p /mnt/data/mount/gs/fc-aou-datasets-controlled
Opening GCS connection...
Using mount point: /mnt/data/mount/gs/fc-aou-datasets-controlled
Opening bucket...
WARNING: gcsfuse invoked as root. This will cause all files to be owned by
root. If this is not what you intended, invoke gcsfuse as the user that will
be interacting with the file system.
Mounting file system...
File system has been successfully mounted.
ls: reading directory /mnt/data/mount/gs/fc-aou-datasets-controlled: Input/output error
Thanks for the report @KarlKeat.
I think you are correct that this is permissions related, though I'm not sure quite what the cause might be either. I've tested the proposed change myself on a requester pays bucket (entirely outside of the AoU environment), and access seems to work fine.
The one thing I was going to recommend was to test the following in your --script or --command:
gsutil ls gs://fc-aou-datasets-controlled
gsutil cp gs://fc-aou-datasets-controlled/<some-file> .
I'd expect gsutil to be better at surfacing the underlying permissions issues than gcsfuse.
But do you have gsutil in your --image?
Lastly, based on this issue, I can suggest adding the gcsfuse --debug_fuse flag in the hope of surfacing more detailed errors.
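If it helps, the gsutil check could be wrapped in a small Python script along these lines. This is only a sketch: the helper names are made up for illustration, and it assumes gsutil's top-level -u option is used to name the billing project, which gsutil requires for requester pays buckets. It also reports when gsutil is missing from the image.

```python
import subprocess


def build_diagnostic_commands(bucket, billing_project, sample_object=None):
    """Return gsutil commands that exercise list and read access.

    Hypothetical helper; '-u' names the project to bill for the requests.
    """
    base = ['gsutil', '-u', billing_project]
    commands = [base + ['ls', 'gs://{}'.format(bucket)]]
    if sample_object:
        commands.append(
            base + ['cp', 'gs://{}/{}'.format(bucket, sample_object), '.'])
    return commands


def run_and_report(command):
    """Run one command, print its exit code and stderr, return the code.

    Returns None when the executable is not installed in the image.
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        print('{} not found; is it installed in the --image?'.format(command[0]))
        return None
    print('$ {} -> exit code {}'.format(' '.join(command), result.returncode))
    if result.stderr:
        print(result.stderr.strip())
    return result.returncode


# Example usage inside the task script (project ID is a placeholder):
# for cmd in build_diagnostic_commands('fc-aou-datasets-controlled', 'my-project'):
#     run_and_report(cmd)
```

Any error that gsutil prints to stderr ends up in the task log, which should make a permissions problem easier to spot than the bare Input/output error from the mount.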
Hi @ccario83 and @KarlKeat !
We have released 0.4.10, which includes support for passing the user project to mounted buckets.
When you get the chance, please confirm if it resolves your issues.
Thanks!