google-cloud-java

Batch - Support setting the working directory

Open loicmathieu opened this issue 10 months ago • 16 comments

Is your feature request related to a problem? Please describe. When you create a Batch job, there is no way to set the working directory; the documentation says to cd WORKING_DIRECTORY. It would be convenient, especially when working with volumes, to be able to set the working directory to the volume mount path.

Describe the solution you'd like We should be able to set the working directory of the task runnable, or at least of the container when the runnable uses a container.

runnableBuilder.setWorkingDirectory(MOUNT_PATH);

// OR

containerBuilder.setWorkingDirectory(MOUNT_PATH);

Describe alternatives you've considered Using cd as the first command, as the documentation explains.
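The cd alternative can be sketched without any Batch-specific API: wrap the user's script in a shell invocation that changes directory first. This is only an illustration under stated assumptions. `WorkingDirCommand`, `shellQuote`, and `inDirectory` are hypothetical names, the mount path in `main` is made up, and the approach assumes the container image provides `/bin/sh`, which may not hold for arbitrary user-supplied images.

```java
import java.util.List;

// Hypothetical helper (not part of google-cloud-batch): builds a command line
// that changes to a directory before running the user's script.
final class WorkingDirCommand {
    private WorkingDirCommand() {}

    // Quote a value for a POSIX shell: wrap it in single quotes and
    // escape any embedded single quote as '\''.
    public static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    // Returns ["/bin/sh", "-c", "cd <dir> && <script>"].
    // The script is passed through unquoted on purpose: it is itself shell code.
    public static List<String> inDirectory(String workingDirectory, String script) {
        return List.of("/bin/sh", "-c",
                "cd " + shellQuote(workingDirectory) + " && " + script);
    }

    public static void main(String[] args) {
        // Example mount path; an assumption for illustration only.
        System.out.println(inDirectory("/mnt/disks/share/job-1234", "python main.py"));
    }
}
```

The first element would typically become the container entrypoint and the remaining elements its commands; how that maps onto the builder calls is left out here because it depends on the SDK version in use.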

loicmathieu avatar Apr 09 '24 11:04 loicmathieu

What Maven artifact do you use?

the documentation says

Would you share URL of the document?

suztomo avatar Apr 09 '24 13:04 suztomo

What Maven artifact do you use?

We're using com.google.cloud:google-cloud-batch.

The documentation I use is here: https://cloud.google.com/batch/docs/create-run-job-storage#console It doesn't explain how to change the current working directory in Java; it only gives a general remark: image

loicmathieu avatar Apr 09 '24 13:04 loicmathieu

Checking

suztomo avatar Apr 09 '24 17:04 suztomo

That "cd" command is just there to demonstrate how to create a file on the persistent disk. In fact, in the gcloud example, the full path to the file is specified without the "cd" command:

image

Therefore, you shouldn't be blocked by the lack of a working directory setting.

suztomo avatar Apr 09 '24 17:04 suztomo

Yes, I realize that. What I am asking for is to be able to set the working directory via the Java SDK.

I only refer to the cd command in the doc to explain what I want to avoid: using cd or a direct reference to the working directory.

We mount a volume in the Batch job; this volume is created from a bucket in which we create a random folder for each job. We use this random folder as the mount path, and since we upload files to the bucket before launching the job, it would be more convenient to set the job's working directory to the mount path.

We have a generic task runner that can run arbitrary scripts (shell, Python, Julia, ...) on many different platforms, including Google Cloud, so we need to offer a consistent way of working across all environments. Users of our task runner don't know anything about Google Batch or any other supported runner.

loicmathieu avatar Apr 09 '24 17:04 loicmathieu

a generic task runner

I'd like to know more about the abstraction. Is that Docker containers (e.g., https://cloud.google.com/batch/docs/samples/batch-create-container-job)?

suztomo avatar Apr 09 '24 17:04 suztomo

Yes, we use container jobs.

The abstractions our users work with are very high level: they describe a task in YAML and choose a runner (Docker, k8s, Google Batch, AWS Batch, ...), and we create the needed resources on the target runner.

loicmathieu avatar Apr 10 '24 07:04 loicmathieu

If you use Docker containers, doesn't WORKDIR (https://docs.docker.com/reference/dockerfile/#workdir) solve your problem?

suztomo avatar Apr 11 '24 15:04 suztomo

I don't understand. I'm using the Batch Java SDK, not the Docker SDK. There is no way to set the working directory via the Batch Java SDK, which is why I opened this issue.

loicmathieu avatar Apr 11 '24 16:04 loicmathieu

(Re-reading your comments) Am I right that this randomness is the reason why the Dockerfile's WORKDIR does not work for your case?

We mount a volume in the Batch job; this volume is created from a bucket in which we create a random folder for each job. We use this random folder as the mount path, and since we upload files to the bucket before launching the job, it would be more convenient to set the job's working directory to the mount path.

suztomo avatar Apr 11 '24 16:04 suztomo

We run arbitrary Docker images supplied by the user, so we cannot rely on the Dockerfile's WORKDIR. Even if we could, we create a random directory (we should be able to work around this, but the first reason would still stand).

loicmathieu avatar Apr 11 '24 17:04 loicmathieu

For the record, Cloud Run allows setting a working directory from the SDK when creating a container.

loicmathieu avatar Apr 19 '24 11:04 loicmathieu

Can I get the URL you read for Cloud Run's case?

suztomo avatar Apr 19 '24 12:04 suztomo

https://github.com/googleapis/google-cloud-java/blob/02d2b5eab0a9cef7f4db703d6c4a1d7577868108/java-run/proto-google-cloud-run-v2/src/main/java/com/google/cloud/run/v2/Container.java#L3627

loicmathieu avatar Apr 19 '24 12:04 loicmathieu

Thank you for the reference. I explored the API definition but didn't find a similar property in the Batch API (https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#Job). Unfortunately, this problem cannot be solved by the Java SDK (this repository). Would you send the feature request for setting the working directory through https://cloud.google.com/batch/docs/get-started#get-support?
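One avenue that may be worth checking, and that is not confirmed anywhere in this thread: if the container runnable accepts a free-form string of extra `docker run` flags, Docker's own `--workdir` flag could be passed that way. The sketch below only builds such a flag string. `DockerRunOptions` and `workdirOption` are hypothetical names, and whether Batch forwards the flag as intended is an assumption to verify against the Batch reference.

```java
// Hypothetical helper: builds a "--workdir <dir>" flag for `docker run`.
// It does not call any Batch API; wiring the result into a job is left
// to the caller and is an unverified assumption.
final class DockerRunOptions {
    private DockerRunOptions() {}

    public static String workdirOption(String directory) {
        // Reject values that could be split into several flags, since the
        // parsing rules of the receiving side are unknown here.
        if (!directory.startsWith("/")
                || directory.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException(
                    "expected an absolute path without whitespace: " + directory);
        }
        return "--workdir " + directory;
    }

    public static void main(String[] args) {
        // Example mount path; an assumption for illustration only.
        System.out.println(workdirOption("/mnt/disks/share/job-1234"));
    }
}
```

Unlike the cd approach, this would not depend on a shell being present in the image, which matters when users supply arbitrary images.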

To provide any feedback or feature requests for Batch, ... For all other feedback about Batch, select "Product feedback."

suztomo avatar Apr 19 '24 19:04 suztomo

Done: https://issuetracker.google.com/issues/336164416

loicmathieu avatar Apr 22 '24 08:04 loicmathieu

Thank you for the link! Closing this issue in this repo since it is dependent on https://issuetracker.google.com/issues/336164416.

mpeddada1 avatar Jul 19 '24 17:07 mpeddada1