Allow loading/unloading of specific version for a given model

Open mmischitelli opened this issue 2 years ago • 9 comments

Is your feature request related to a problem? Please describe. I find the "explicit" model control mode very interesting. For my use case, however, it is pretty much useless, as we have a few models, each consisting of tens or even hundreds of versions. Each model is updated weekly, and previous versions must be kept available.

Describe the solution you'd like A new overload to the LoadModel method, accepting not only the model name, but also the version.

Describe alternatives you've considered An additional API, such as LoadVersion, which still sports two arguments: model name and version.

Additional context My aim here is to keep memory usage under control. I don't want to load all the versions of a given model into memory, as that would clog my client to death and require my server to have terabytes of RAM.

mmischitelli avatar May 20 '22 14:05 mmischitelli

I've filed a ticket for this feature request.

I don't know your specific use case, but you may also be able to use model names in the meantime (e.g. model100_2022_05_23 or whatever makes sense for you). Another option is leveraging the specific version policy. We've added this to our queue to review.

dyastremsky avatar May 23 '22 21:05 dyastremsky

Thanks!

Unfortunately, we cannot "flatten" the model/version hierarchy, as we have some strict requirements that would make things way too complicated. For instance, we plot the inference results on a chart: if the function's name is the same, we can easily do comparisons, as the version is just a parameter.

The specific version policy, on the other hand, is kind of "static". Our solution provides users with the backend to run simulations on their own, possibly comparing data from different years. Since we produce model versions weekly, keeping them all loaded would occupy far too much memory for no real reason (other than being able to serve every available version).

If there was support for an on-demand load/unload of versions, we would then be able to implement some kind of LRU policy for managing the available resources.
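A minimal sketch of the LRU idea described above, for illustration only. It is not taken from this thread or from Triton itself. `VersionLRU` and its `on_change` hook are hypothetical names: the hook stands in for whatever mechanism ends up reloading the model with a given set of versions.

```python
from collections import OrderedDict
from typing import Callable, List


class VersionLRU:
    """Track the most recently used versions of one model, up to a fixed capacity."""

    def __init__(self, capacity: int, on_change: Callable[[List[int]], None]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Hypothetical hook: called with the versions that should now be loaded.
        self.on_change = on_change
        self._versions: "OrderedDict[int, None]" = OrderedDict()

    def touch(self, version: int) -> List[int]:
        """Mark a version as used and return the versions that should be loaded."""
        if version in self._versions:
            # Already loaded: only refresh its position, no reload needed.
            self._versions.move_to_end(version)
            return list(self._versions)
        self._versions[version] = None
        if len(self._versions) > self.capacity:
            # Evict the least recently used version.
            self._versions.popitem(last=False)
        loaded = list(self._versions)
        self.on_change(loaded)
        return loaded
```

With a capacity of 2, requesting versions 3, 27, 3 and then 5 would evict version 27 and leave 3 and 5 loaded; the hook fires only when the loaded set actually changes.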

mmischitelli avatar May 24 '22 10:05 mmischitelli

@mmischitelli combining the specific version policy with the control mode POLL is quite fast in updating the available models, right?

@dyastremsky My specific use-case will be controlling the model versioning through the API within the KServe modelmesh environment. I will have to find out if either workaround will work. Where can I keep an eye on the progress of the feature request ticket? Would you happen to have an indication of the timeline for such a feature? Thanks for the information!

Lingkar avatar May 24 '22 14:05 Lingkar

@Lingkar The ticket was filed internally. There is no timeline yet for this feature request. Your best bet is to keep an eye on this GitHub issue, as I'm leaving it open as we look into this feature request.

dyastremsky avatar May 24 '22 14:05 dyastremsky

@mmischitelli combining the specific version policy with the control mode POLL is quite fast in updating the available models, right?

Yes, it's just the time needed to load any new version that is added to the local repo. Usually it's quite fast.

mmischitelli avatar May 25 '22 14:05 mmischitelli

The model load API now lets you specify a string representation of the config file as part of the request, which may address your use case. In the override config, you can specify a different version policy depending on the circumstances, to control which subset of the versions should be loaded. Do you think that could be helpful?

GuanLuo avatar Sep 03 '22 00:09 GuanLuo

@GuanLuo Sounds excellent for my use case! I will give it a try, thanks! :+1:

Lingkar avatar Sep 07 '22 13:09 Lingkar

Hi, not particularly in my scenario. I need to be able to specify which version to load, not the policy.

In particular, I need to be able to ask the server for version "3" or version "27" for any given model.

The model load API now lets you specify a string representation of the config file as part of the request, which may address your use case. In the override config, you can specify a different version policy depending on the circumstances, to control which subset of the versions should be loaded. Do you think that could be helpful?

mmischitelli avatar Sep 07 '22 15:09 mmischitelli

I believe you can send the request like the following to load the specified version (take "3" as example):

POST /v2/repository/models/mymodel/load HTTP/1.1
Host: localhost:8000
{
  "parameters": {
    "config": "{
        ... # other config fields
        "version_policy": {"specific" : { "versions" : [ 3 ] }}
    }"
  }
}
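The request body above is schematic: the `config` value is a JSON document embedded as a string, so it has to be serialized and escaped before it goes into the outer body. A minimal sketch of how that nesting could be built, assuming the other config fields are available as a Python dict (`build_load_request` is a hypothetical helper, not part of any Triton library):

```python
import json


def build_load_request(config: dict, versions: list) -> str:
    """Return a JSON body for POST /v2/repository/models/<name>/load."""
    cfg = dict(config)  # shallow copy so the caller's dict is left untouched
    cfg["version_policy"] = {"specific": {"versions": list(versions)}}
    # The inner config is serialized first, then embedded as a string value.
    body = {"parameters": {"config": json.dumps(cfg)}}
    return json.dumps(body)


# Example: request only version 3 of "mymodel".
print(build_load_request({"name": "mymodel"}, [3]))
```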

GuanLuo avatar Sep 08 '22 19:09 GuanLuo

@GuanLuo your solution does not work, even with the latest Triton image (22.08).

bangpc avatar Sep 27 '22 10:09 bangpc

What specifically does not work, Bang? What are you sending, what are you getting back?

dyastremsky avatar Sep 27 '22 16:09 dyastremsky

@dyastremsky This is the cfg that I send to Triton:

{
    "parameters": {
        "config": {
            "name": "fisheye",              
            "platform": "onnxruntime_onnx", 
            "default_model_filename": "model.onnx",
            "input": [
                {
                    "name": "input",                
                    "data_type": "FP32",            
                    "dims": [1, 3, 608, 608]    
                }
            ],
            "output": [
                {
                    "name": "output",               
                    "data_type": "FP32",            
                    "dims": [1, 22743, 6]       
                }
            ],
            "version_policy": {"specific": {"versions": [1]}}
        }
    }
}

I also tried passing the config field as a string, as suggested by GuanLuo. I load the model with the following call: triton_client.load_model(model_name, config=json.dumps(cfg))

Result: Triton does not change anything. It unloads and reloads the latest version in the model directory.

bangpc avatar Sep 28 '22 01:09 bangpc

Are you using EXPLICIT model control mode?

dyastremsky avatar Sep 28 '22 17:09 dyastremsky

@dyastremsky Yep, I use explicit model control mode. I have one more question: how can I unload a specific version? I have already read about the unload function, but I do not know how to do this.

bangpc avatar Sep 29 '22 01:09 bangpc

Ah, you're using the model config override. We're currently working through a bug whereby if you override the model config but not the model, the model config override is not recognized. We're working on this and aiming to get a fix in soon.

We try to keep issues focused on the original poster's question. If you have other questions or issues, please open a new issue. (Unload API documentation here.)

CC: @krishung5

dyastremsky avatar Oct 03 '22 18:10 dyastremsky

@bangpc It looks like the config you provided isn't correct. The parameters and config fields shouldn't be part of the config. I think @GuanLuo added those for a POST request, but you're sending it via the client API's load_model.

If you send the correct config, it should work. If not, please open a separate issue and we'll investigate.

Closing issue due to the original issue being resolved. Please feel free to follow up and ask us to reopen if there are further questions about the loading/unloading of specific model versions.

dyastremsky avatar Oct 04 '22 22:10 dyastremsky
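A minimal sketch of the shape described in the last comment, for readers calling the client API rather than posting raw HTTP: the config passed to `load_model` contains the model config fields directly, with no `parameters` or `config` wrapper. It assumes `client` exposes `load_model(model_name, config=...)` as used earlier in this thread. `load_versions` and `RecordingClient` are hypothetical names, and whether the override takes effect depends on the Triton release, given the bug mentioned above.

```python
import json


def load_versions(client, model_name, base_config, versions):
    """Ask the server to load only the given versions of a model."""
    cfg = dict(base_config)  # model config fields only, no "parameters" wrapper
    cfg["version_policy"] = {"specific": {"versions": list(versions)}}
    client.load_model(model_name, config=json.dumps(cfg))
    return cfg


class RecordingClient:
    """Stand-in for a Triton client so the sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def load_model(self, model_name, config=None):
        self.calls.append((model_name, config))


client = RecordingClient()
load_versions(client, "fisheye", {"name": "fisheye", "platform": "onnxruntime_onnx"}, [1])
print(client.calls[0][1])
```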