maxtext
Add support for curl command
Example usage:
python3 multihost_job.py --COMMAND_TYPE=curl --NUM_SLICES=$NUM_SLICES --RUN_NAME=$RUN_NAME --BUCKET_NAME=$BUCKET_NAME --PROJECT=${PROJECT} --ZONE=${ZONE} --TPU_TYPE=$TPU_TYPE --VERSION=$VERSION --COMMAND=""
I tried these changes with the curl command and have noticed a couple of things so far:
- It did not return an error when a queued resource (QR) with the same name already existed.
- There is no way to set a custom network, such as mtu9k, from the command.
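One possible explanation for the silent failure: curl exits with status 0 on HTTP error responses unless `--fail` is passed, so a script that checks only the exit code would not see a name conflict. A minimal sketch of inspecting the response body instead, assuming the API replies with the usual Google API JSON error object (`{"error": {"code", "message", "status"}}`). The names `check_response` and `QueuedResourceRequestError` are hypothetical and not part of multihost_job.py.

```python
import json


class QueuedResourceRequestError(RuntimeError):
    """Raised when the response body reports an error (hypothetical helper)."""


def check_response(body: str) -> dict:
    """Parse a curl response body; raise if it is not JSON or carries an error object."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError as exc:
        raise QueuedResourceRequestError(f"Non-JSON response: {body[:200]!r}") from exc

    if isinstance(parsed, dict) and "error" in parsed:
        err = parsed["error"]
        if not isinstance(err, dict):
            # Assumption: fall back to a plain message if the error is not an object.
            err = {"message": str(err)}
        raise QueuedResourceRequestError(
            f"{err.get('status', 'UNKNOWN')} ({err.get('code', '?')}): "
            f"{err.get('message', '')}"
        )
    return parsed
```

The captured stdout of the curl call would be passed to `check_response`, so that a conflict with an existing QR surfaces as an exception rather than passing silently.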
This looks okay to me, but I will reiterate my earlier point, which Raymond also brought up: the API of multihost_job no longer supports customizing the curl command arguments, e.g. a custom network, or best-effort/on-demand/reserved capacity.
We could add back all of the args in https://github.com/google/maxtext/commit/5796aa61a3a83d93cf9c39f6c6f01fc2db386bc0#diff-dfaa2b9b91181624bf5f2a6647006e1abea1ea43c1dd409fe89e6a45958834b1L193, which would make the curl customizable, but I think this may be more confusing to external users, who have no reason to use curl.
The only reason to use curl instead of the CLI is for new features that haven't rolled out to the CLI yet. For this reason, only internal users, not external customers, need curl, so I think this change is really just for us. That's why at first I advocated keeping this in a local branch, but I think enough of us depend on curl that it is convenient to push this lightweight curl change to main. If you need to customize the curl (e.g. a custom network), you can look at the previous implementation I linked earlier or the curl documentation to figure out how to modify the multihost_job curl command for your use case.
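For anyone customizing locally, here is a rough sketch of how the request body could be parameterized for a custom network and capacity type. This is not the code in this PR: `build_qr_payload` is a hypothetical helper, and the field names (`node_spec`, `multi_node_params`, `network_config`, `best_effort`, `guaranteed`) are assumptions based on the queued resources REST API that should be checked against the linked previous implementation and the API reference before use.

```python
def build_qr_payload(project, zone, tpu_type, version, num_slices, run_name,
                     network=None, subnetwork=None, capacity="on-demand"):
    """Build a queued resource request body (hypothetical helper; field names are assumptions)."""
    node = {
        "accelerator_type": tpu_type,
        "runtime_version": version,
    }
    # Only attach a network config when a custom network (e.g. mtu9k) is requested.
    if network:
        node["network_config"] = {"network": network}
        if subnetwork:
            node["network_config"]["subnetwork"] = subnetwork

    payload = {
        "tpu": {
            "node_spec": [
                {
                    "parent": f"projects/{project}/locations/{zone}",
                    "node": node,
                    "multi_node_params": {
                        "node_count": num_slices,
                        "node_id_prefix": run_name,
                    },
                }
            ]
        }
    }

    # Capacity type: on-demand adds nothing; the other two add one top-level key.
    if capacity == "best-effort":
        payload["best_effort"] = {}
    elif capacity == "reserved":
        payload["guaranteed"] = {"reserved": True}
    elif capacity != "on-demand":
        raise ValueError(f"Unknown capacity type: {capacity!r}")
    return payload
```

The resulting dict could be serialized with `json.dumps` and passed to curl via `-d`. Keeping this in a local helper rather than in the script's flags matches the intent above of not exposing curl-only options to external users.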