Option for adding or overriding model config attributes at server startup


Is your feature request related to a problem? Please describe.

When using Triton server in a deployment with variable hardware configurations (CPU-only for some environments, GPUs for others) and models stored in S3, we might have to create multiple copies of the model repository just to have multiple versions of the configuration file, one for each deployment's hardware.

Describe the solution you'd like

It would be very helpful to have a config override flag like this:

tritonserver --model-store=s3://my-bucket/my-model-repository --config-override=/mnt/config.pbtxt

This way, I can put most attributes such as the model platform and the inputs/outputs in the config.pbtxt stored in S3, then put attributes such as the instance groups and batch size configuration in the config.pbtxt provided to the server at deployment time.

Describe alternatives you've considered

Alternative solutions I've considered include:

Adding a custom rclone command that loads the models from S3 and overwrites the config attributes before running the server, then pointing the server at the model repository that's now available locally.
Having multiple copies of the model (and the configuration alongside it), but this is far from an ideal solution, to be honest.

Additional context

I'm targeting a Kubernetes deployment where the Triton server is deployed using a custom Helm chart. It's common practice to provide hardware requirements when deploying the chart, and with it, I'm hoping to also pass attributes such as instance_group and have them reflected in my deployment :)

Thank you!

Jun 19 '22 04:06 ashrafguitoni

Thank you for your detailed ticket, Ashraf. I filed an enhancement request for this.

Jun 21 '22 19:06 the-david-oy

This is urgently needed.

Jun 30 '22 02:06 lfxx

Thank you for your detailed ticket, Ashraf. I filed an enhancement request for this.

Any update on this?

Jun 30 '22 02:06 lfxx

This is a great quality-of-life feature. I would also add the possibility to override the configuration via the load/unload API on the client side: client.load_model("mymodelname", config_override={"instance_group": ...})

Jun 30 '22 14:06 QMassoz

No updates. I've pinged those leading prioritization and mentioned your urgency. Keep in mind that there's no time to get this in for 22.07, so it would be included in 22.08 at the earliest. (Of course, the repo is public, so you'd be welcome to build with the PR yourself, if/once this feature is merged.)

Jun 30 '22 16:06 the-david-oy

@ashrafguitoni (and other users requesting this feature) are you looking to only modify the instance group via this functionality? If so, Triton would need to re-work the model control workflow to ensure the same.

What sort of updates would you do? Can you give us examples of the same for us to better scope this feature request?

Jun 30 '22 18:06 CoderHam

In my case, the instance group is the most important, because it makes it easier to manage the resources on a given machine. I could see a use for overriding max_batch_size, ModelParameter, and ModelPriority, but to a lesser extent.

Jul 01 '22 12:07 QMassoz

@ashrafguitoni (and other users requesting this feature) are you looking to only modify the instance group via this functionality? If so, Triton would need to re-work the model control workflow to ensure the same.

@CoderHam Ideally, it wouldn't just be instance_group, because if I'm overriding that value at deployment time, I'd also like to specify things such as max_batch_size or even cc_model_filenames, since those depend on my deployment infrastructure.

Since I'm not familiar with the tritonserver codebase, I most likely underestimated how difficult it would be to implement my request. I imagined it would be just a matter of some text processing where the attributes of the override file would be inserted into the base config file provided inside the model repository (or generated by default), and once the config file is updated, the actual serving workflow would start.

What sort of updates would you do? Can you give us examples of the same for us to better scope this feature request?

As I mentioned in my original post, it would be great to have model checkpoints stored in S3 or similar with just a basic configuration (model name, model platform, inputs, and outputs), then be able to create GPU deployments (where instance_group is overridden to contain KIND_GPU) and CPU deployments (configured with KIND_CPU) using that same checkpoint/config pair.

In general, though, I think it's really helpful to be able to add or override any configuration attribute when starting the Triton server. Model checkpoint files (and perhaps a basic model config file) are written into object storage at training time, when information about hardware and performance requirements isn't necessarily available. tritonserver is executed at serving time, when that information is available. This is why it would be great to have an easy way of overriding config attributes via a CLI flag!
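
To make the request concrete, here is a minimal sketch of the merge I have in mind, assuming both the base config and the override have already been parsed into plain dictionaries. This is not Triton code: `apply_overrides`, the model name, the platform, and the tensor names are all hypothetical, and the rule that an override replaces a top-level attribute wholesale is just one possible design.

```python
import copy


def apply_overrides(base_config: dict, overrides: dict) -> dict:
    """Return a new config in which each top-level override key replaces the base value.

    Hypothetical helper for illustration only; it is not part of Triton.
    """
    merged = copy.deepcopy(base_config)
    for key, value in overrides.items():
        # Replace (or add) the whole top-level attribute, e.g. instance_group.
        merged[key] = copy.deepcopy(value)
    return merged


# Basic config written at training time (all values are made up).
base = {
    "name": "my_model",
    "platform": "onnxruntime_onnx",
    "input": [{"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [16]}],
    "output": [{"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]}],
}

# Deployment-specific attributes supplied at serving time.
gpu_override = {"max_batch_size": 8, "instance_group": [{"kind": "KIND_GPU", "count": 1}]}
cpu_override = {"instance_group": [{"kind": "KIND_CPU", "count": 2}]}

gpu_config = apply_overrides(base, gpu_override)
cpu_config = apply_overrides(base, cpu_override)
```

Both deployments share the same base, and the base itself is left untouched, so the copy in object storage never needs to know about the hardware.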

Jul 02 '22 07:07 ashrafguitoni

Apologies for the late response, seeing this now. Triton supports passing in any configuration via the model load API.
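
A rough sketch of what such a load request could look like, assuming the server runs with `--model-control-mode=explicit` and that the model repository extension accepts the config as a JSON string under `parameters.config`. The helper `build_load_request` and the model values are hypothetical, and the sketch only builds the request without sending it. As far as I understand, the config passed this way is used in place of the stored one rather than merged into it, so it should be complete.

```python
import json


def build_load_request(model_name: str, config: dict) -> tuple[str, str]:
    """Return (path, body) for a model load request that carries a config.

    Hypothetical helper; the request is only constructed here, not sent.
    """
    path = f"v2/repository/models/{model_name}/load"
    # The config travels as a JSON string nested inside the JSON request body.
    body = json.dumps({"parameters": {"config": json.dumps(config)}})
    return path, body


# Full config for a CPU deployment (all values are made up).
config = {
    "name": "my_model",
    "platform": "onnxruntime_onnx",
    "input": [{"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [16]}],
    "output": [{"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]}],
    "instance_group": [{"kind": "KIND_CPU", "count": 2}],
}

path, body = build_load_request("my_model", config)
```

The Python HTTP client should offer an equivalent shortcut along the lines of `client.load_model("my_model", config=json.dumps(config))`; please check the signature against your installed tritonclient version.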

Jul 08 '23 00:07 the-david-oy
