Bucket mount names ("Duplicate action name")
I'm writing a script that reads from two (potentially) different gcsfuse-mounted sources. In my testbed, they both happen to be in the same bucket, but in reality they won't be. So I tried to --mount the same bucket twice under different names. However, it seems that the action name is derived from the bucket rather than from the alias, so doing this fails. Maybe this is as intended, but it doesn't seem desirable.
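Roughly, the invocation looked like this (project, bucket, and mount names here are placeholders, not my exact command):

dsub \
  --provider google-v2 \
  --project my-project \
  --logging gs://my-bucket/logs \
  --mount SOURCE_A=gs://ukbb_v2 \
  --mount SOURCE_B=gs://ukbb_v2 \
  --command 'ls ${SOURCE_A} ${SOURCE_B}' \
  --wait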
2019-04-29 09:24:17.941173: Exception HttpError: <HttpError 400 when requesting https://genomics.googleapis.com/v2alpha1/pipelines:run?alt=json returned "Error: validating pipeline: duplicate action name "mount-ukbb_v2"">
Traceback (most recent call last):
File "/home/james/anaconda2/bin/dsub", line 11, in <module>
load_entry_point('dsub==0.3.1', 'console_scripts', 'dsub')()
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/commands/dsub.py", line 956, in main
dsub_main(prog, argv)
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/commands/dsub.py", line 945, in dsub_main
launched_job = run_main(args)
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/commands/dsub.py", line 1028, in run_main
unique_job_id=args.unique_job_id)
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/commands/dsub.py", line 1117, in run
launched_job = provider.submit_job(job_descriptor, skip)
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/providers/google_v2.py", line 915, in submit_job
task_id = self._submit_pipeline(request)
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/providers/google_v2.py", line 866, in _submit_pipeline
self._service.pipelines().run(body=request))
File "build/bdist.linux-x86_64/egg/retrying.py", line 49, in wrapped_f
File "build/bdist.linux-x86_64/egg/retrying.py", line 206, in call
File "build/bdist.linux-x86_64/egg/retrying.py", line 247, in get
File "build/bdist.linux-x86_64/egg/retrying.py", line 200, in call
File "build/bdist.linux-x86_64/egg/retrying.py", line 49, in wrapped_f
File "build/bdist.linux-x86_64/egg/retrying.py", line 206, in call
File "build/bdist.linux-x86_64/egg/retrying.py", line 247, in get
File "build/bdist.linux-x86_64/egg/retrying.py", line 200, in call
File "/home/james/anaconda2/lib/python2.7/site-packages/dsub-0.3.1-py2.7.egg/dsub/providers/google_base.py", line 593, in execute
raise exception
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://genomics.googleapis.com/v2alpha1/pipelines:run?alt=json returned "Error: validating pipeline: duplicate action name "mount-ukbb_v2"">
Hi @carbocation!
What is the use case for requesting that the same bucket be mounted twice?
My concern is that gcsfuse is already a fragile enough solution that mounting the same bucket twice within a single dsub task may be setting yourself up for a bad day.
Is the typical Input and Output File Handling insufficient for your use case?
Thanks.
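For reference, a typical invocation using explicit file handling might look something like this (paths and the process command are just examples):

dsub \
  --provider google-v2 \
  --project my-project \
  --logging gs://my-bucket/logs \
  --input INPUT_A=gs://bucket-a/path/file_a.txt \
  --input INPUT_B=gs://bucket-b/path/file_b.txt \
  --output OUTPUT=gs://my-bucket/out/result.txt \
  --command 'process ${INPUT_A} ${INPUT_B} > ${OUTPUT}'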
I tried to describe the use case in the first post, but I can add more color if I did not convey it very well. Basically, this is not a "need"; it just seemed like something that should be possible, and as a user I was surprised that it didn't work. If it's not possible, or it increases risk, then no worries.
Got it. Let's leave this open, and we will document that buckets should only be mounted once and that people should use --env variables to point to specific locations inside of a mount, should bucket mounting be the actual solution they need.
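Something along these lines (bucket and directory names are illustrative): mount the bucket once and use --env to distinguish the two locations inside it:

dsub \
  --provider google-v2 \
  --project my-project \
  --logging gs://my-bucket/logs \
  --mount DATA=gs://ukbb_v2 \
  --env SOURCE_A_DIR=subdir_a \
  --env SOURCE_B_DIR=subdir_b \
  --command 'ls ${DATA}/${SOURCE_A_DIR} ${DATA}/${SOURCE_B_DIR}'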
Thanks!