/models returns an empty list of models when models is specified
🐛 Describe the bug
Please note that this is related to https://github.com/pytorch/serve/blob/master/docs/configuration.md#config-model
My config file is shown in the config.properties section below.
In the serve logs I can see my model "newmodel" and its config, but it is not loaded. Hitting :8085/models shows no models loaded, which makes sense since load_models defaults to N/A.
If I try to hit /v1/newmodel:predict it tells me the model is not found. How am I supposed to load the model, then?
According to the doc I should specify my models in models=, but judging by the examples and what I have seen used everywhere, I actually have to provide my models in model_snapshot={....., "models": {}}, which is not mentioned in the doc linked above.
If I specify load_models=all then all of the models are loaded and I can see them in /models, but the model is named testmodel (after the .mar file name) instead of newmodel. I don't believe this is intended behavior, and if it is, it is not consistent with the docs.
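For reference, the requests look roughly like this (a sketch; input.json is a hypothetical payload file):
curl http://localhost:8085/models
# returns an empty model list, roughly {"models": []}
curl -X POST http://localhost:8085/v1/newmodel:predict -d @input.json
# returns a model-not-found error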
Error logs
model-server@e4f66a30f331:~$ torchserve --start --model-store model_store --ts-config config.properties
Removing orphan pid file.
model-server@e4f66a30f331:~$ WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022-08-03T18:49:17,361 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2022-08-03T18:49:17,513 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.6.0
TS Home: /home/venv/lib/python3.8/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 48
Max heap size: 30688 M
Python executable: /home/venv/bin/python
Config file: config.properties
Inference address: http://0.0.0.0:8085
Management address: http://0.0.0.0:8085
Metrics address: http://0.0.0.0:8082
Model Store: /home/model-server/model_store
Initial Models: N/A
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 4
Netty client threads: 0
Default workers per model: 48
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
Workflow Store: /home/model-server/model_store
Model config: {"newmodel":{"1.0":{"defaultVersion":true,"marName":"testmodel.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":5000,"responseTimeout":120}}}
2022-08-03T18:49:17,521 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2022-08-03T18:49:17,526 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2022-08-03T18:49:17,607 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8085
2022-08-03T18:49:17,607 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2022-08-03T18:49:17,609 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
Model server started.
2022-08-03T18:49:17,956 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,958 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:816.2108383178711|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,959 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:11469.123527526855|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,959 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:93.4|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,960 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:168068.890625|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,961 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:86343.0625|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
2022-08-03T18:49:17,961 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:34.8|#Level:Host|#hostname:e4f66a30f331,timestamp:1659552557
Installation instructions
docker run --rm -it -p 8085:8085 -v $(pwd):/home/model-server/ pytorch/torchserve bash
Model Packaging
default handler
config.properties
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
metrics_address=http://0.0.0.0:8082
enable_metrics_api=true
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
# service_envelope=kfserving
model_store=/mnt/models/model-store
models={"newmodel":{"1.0":{"defaultVersion":true,"marName":"testmodel.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":5000,"responseTimeout":120}}}
#model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"newmodel":{"1.0":{"defaultVersion":true,"marName":"testmodel.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":5000,"responseTimeout":120}}}}
install_py_dep_per_model=true
Versions
TorchServe Version is 0.6.0
Repro instructions
torchserve --start --model-store model_store --ts-config config.properties
Possible Solution
No response
Please note that I am able to run the model properly when model_snapshot is provided (pretty-printed below for readability; in the actual config.properties it sits on one line, as in the commented example above):
model_snapshot={
"name": "startup.cfg",
"modelCount": 1,
"models": {
"newmodel": {
"1.0": {
"defaultVersion": true,
"marName": "testmodel.mar",
"minWorkers": 1,
"maxWorkers": 5,
"batchSize": 1,
"maxBatchDelay": 5000,
"responseTimeout": 120
}
}
}
}
@ridhwan-saal in config.properties, models= provides the parameters for model loading, while load_models= specifies which models should be loaded when torchserve starts.
You can either:
- run: torchserve --start --model-store model_store --ts-config config.properties --models all
or
- add load_models=all in config.properties (see the sketch below)
@lxning what if I want to change which models are loaded while TorchServe is already running? Do I have to stop it and start it again?
@ridhwan-saal you can change the model configuration at runtime via the management REST API.
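For example, against the management address configured above (a sketch following the TorchServe management API docs; adjust names and worker counts as needed):
# register a model from the model store under the name newmodel
curl -X POST "http://localhost:8085/models?url=testmodel.mar&model_name=newmodel&initial_workers=1"
# scale the workers of a registered model
curl -X PUT "http://localhost:8085/models/newmodel?min_worker=1&max_worker=5"
# unregister the model
curl -X DELETE http://localhost:8085/models/newmodel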
I tried load_models=all and it is loading the same model 10 times before crashing, even though I only have 1 model specified. @lxning
@ridhwan-saal could you please provide the following information and logs?
- How many models are in the model_store dir?
- What command do you use to start torchserve?
- What is in your config.properties?