Option for adding or overriding model config attributes at server startup
Is your feature request related to a problem? Please describe.
When using Triton server in a deployment with variable hardware configurations (CPU-only for some environments, GPUs for others) and models stored in S3, we might have to create multiple copies of the model repository just to have a separate version of the configuration file for each, where we can configure the hardware we want to use for each deployment.
Describe the solution you'd like
It would be very helpful to have a config override flag like this:
tritonserver --model-store=s3://my-bucket/my-model-repository --config-override=/mnt/config.pbtxt
This way, I can put most attributes such as the model platform and the inputs/outputs in the config.pbtxt stored in S3, then put attributes such as the instance groups and batch size configuration in the config.pbtxt provided to the server at deployment time.
Describe alternatives you've considered
Alternative solutions I've considered include:
- Adding a custom rclone command to load the models from S3, then overwriting the config attributes before running the server and pointing it to the model repository that's now available locally
- Having multiple copies of the model (and the configuration alongside it), but this is far from an ideal solution to be honest
Additional context
I'm targeting a Kubernetes deployment where the Triton server is deployed using a custom Helm chart. It's common practice to provide hardware requirements when deploying the chart, and with it, I'm hoping to also pass attributes such as instance_group and have them reflected in my deployment :)
Thank you!
Thank you for your detailed ticket, Ashraf. I filed an enhancement request for this.
This is urgently needed.
Any update on this?
This is a great quality-of-life feature. I would also add the possibility to override the configuration via the load/unload API on the client side: client.load_model("mymodelname", config_override={"instance_group": ...})
No updates. I've pinged those leading prioritization and mentioned your urgency. Keep in mind that there's no time to get this in for 22.07, so it would be included in 22.08 at the earliest. (Of course, the repo is public, so you'd be welcome to build with the PR yourself, if/once this feature is merged.)
@ashrafguitoni (and other users requesting this feature) are you looking to only modify the instance group via this functionality? if so Triton would need to re-work the model control workflow to ensure the same.
What sort of updates would you do? Can you give us examples of the same for us to better scope this feature request?
In my case, the instance group is the most important, in order to more easily manage the resources on a given machine. I could see a use for overriding max_batch_size, ModelParameter, and ModelPriority, but to a lesser extent.
@ashrafguitoni (and other users requesting this feature) are you looking to only modify the instance group via this functionality? if so Triton would need to re-work the model control workflow to ensure the same.
@CoderHam Ideally, it wouldn't just be instance_group, because if I'm overriding that value at deployment time, I'd also like to specify things such as max_batch_size or even cc_model_filenames, because those will depend on my deployment infrastructure.
Since I'm not familiar with the tritonserver codebase, I most likely underestimated how difficult it would be to implement my request. I imagined it would be just a matter of some text processing where the attributes of the override file would be inserted into the base config file provided inside the model repository (or generated by default), and once the config file is updated, the actual serving workflow would start.
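To make the idea concrete, here is a minimal sketch of the merge I have in mind. It is not how tritonserver works internally: it assumes both configs have already been parsed into plain dictionaries, and that a top-level attribute in the override simply replaces the one in the base config. The function name `merge_config`, the model name, and the tensor names are hypothetical.

```python
import copy
import json


def merge_config(base: dict, override: dict) -> dict:
    """Return a new config in which every top-level key of `override`
    replaces the corresponding key of `base` (hypothetical helper)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        merged[key] = copy.deepcopy(value)
    return merged


# Base config as it might be stored next to the model in S3 (illustrative values).
base_config = {
    "name": "my_model",
    "platform": "onnxruntime_onnx",
    "max_batch_size": 8,
    "input": [{"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [16]}],
    "output": [{"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]}],
}

# Deployment-specific overrides supplied at startup (illustrative values).
gpu_override = {
    "instance_group": [{"kind": "KIND_GPU", "count": 1}],
    "max_batch_size": 32,
}
cpu_override = {"instance_group": [{"kind": "KIND_CPU", "count": 2}]}

gpu_config = merge_config(base_config, gpu_override)
cpu_config = merge_config(base_config, cpu_override)

print(json.dumps(gpu_config, indent=2))
```

A top-level replace keeps the semantics easy to reason about; merging inside repeated fields such as instance_group would need rules that I haven't thought through.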
What sort of updates would you do? Can you give us examples of the same for us to better scope this feature request?
As I mentioned in my original post, it would be great to have model checkpoints stored in S3 or similar with just a basic configuration (model name, model platform, inputs, and outputs), then be able to create GPU deployments (where instance_group is overridden to contain KIND_GPU) and CPU deployments (configured with KIND_CPU) using that same checkpoint/config pair.
In general, though, I think it's really helpful to be able to add or override any configuration attribute when starting the Triton server. Model checkpoint files (and perhaps a basic model config file) are written into object storage at training time, when information about hardware and performance requirements isn't necessarily available. tritonserver is executed at serving time, when that information is available. This is why it would be great to have an easy way of overriding config attributes via a CLI flag!
Apologies for the late response, seeing this now. Triton supports passing in any configuration via the model load API.
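As a rough sketch of what that can look like from the Python client: the snippet below assumes the server was started with `--model-control-mode=explicit`, that the `tritonclient[http]` package is installed, and that the config passed with the load request is a complete model config serialized as JSON rather than a partial override. The URL, the model name, the tensor names, and the helper `load_with_config` are illustrative only.

```python
import json


def load_with_config(url: str, model_name: str, config_json: str) -> bool:
    """Load `model_name` using the supplied JSON config instead of the
    config.pbtxt in the repository (hypothetical helper, needs a running server)."""
    # Imported here so the rest of the example runs without tritonclient installed.
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url=url)
    client.load_model(model_name, config=config_json)
    return client.is_model_ready(model_name)


# Full model config for a CPU-only deployment (illustrative values).
cpu_model_config = {
    "name": "my_model",
    "platform": "onnxruntime_onnx",
    "max_batch_size": 8,
    "input": [{"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [16]}],
    "output": [{"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]}],
    "instance_group": [{"kind": "KIND_CPU", "count": 2}],
}
config_json = json.dumps(cpu_model_config)

# Against a live server this would be called as, for example:
# load_with_config("localhost:8000", "my_model", config_json)
```

This runs after server startup rather than as a CLI flag, so a deployment would still need a small init step that issues the load request once the server is up.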