model_server
Memory leaks on infer requests
**Describe the bug**
The server leaks memory on infer requests.
**To Reproduce**
- Get the archive `ov-leak-debug.tar.gz`; it contains a load-simulation script, a docker-compose file to run the server, the models and the `model_config.json`, and a script that creates the models (for reference).
- Run the server with `docker compose up -d`.
- Install the dependencies: `pip install -r requirements.txt`.
- Run the script `python generate_ovms_load.py {model_name} --n-workers 10 --n-threads 10` with `{model_name}` being `static`, `dynamic` or `dynamic-nms`.
- Check OVMS memory usage to see it creep up. It fluctuates for `static`, creeps up slowly for `dynamic`, and grows rapidly for `dynamic-nms`. The memory usage doesn't go down even when the load is no longer applied.
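For reference, a load generator in the spirit of `generate_ovms_load.py` can be sketched as below. This is a hypothetical sketch, not the script from the archive: the REST port (8000), the endpoint layout (OVMS's TensorFlow-Serving-compatible REST API), and the dummy input payload are all assumptions; real input shapes depend on the models in the archive.

```python
# Hypothetical load-generator sketch (NOT the archived script).
# Assumes OVMS exposes its TF-Serving-compatible REST API on port 8000.
import concurrent.futures
import json
import urllib.request

OVMS_URL = "http://localhost:8000/v1/models/{model}:predict"  # assumed port

def send_request(model_name: str, payload: bytes) -> int:
    """Send one predict request and return the HTTP status code."""
    req = urllib.request.Request(
        OVMS_URL.format(model=model_name),
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

def run_load(model_name: str, n_workers: int, n_requests: int) -> None:
    """Fire n_requests concurrently from a pool of n_workers threads."""
    # Dummy input; the real shape depends on the model being exercised.
    payload = json.dumps({"instances": [[0.0] * 10]}).encode()
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(send_request, model_name, payload)
                   for _ in range(n_requests)]
        for f in concurrent.futures.as_completed(futures):
            f.result()  # re-raise any request error
```

Sustaining this kind of concurrent request stream while watching container memory (e.g. via `docker stats`) is enough to observe the growth described above.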
**Expected behavior**
The memory used by OVMS stays constant (or stabilizes after some time).
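"Stabilizes" can be made concrete with a small helper like the hypothetical one below (my own illustration, not from the archive): it compares the mean of the most recent window of memory samples to the preceding window and tolerates a small relative drift.

```python
# Hypothetical stabilization check over sampled memory readings (e.g. MiB).
def has_stabilized(samples, window=10, tol=0.02):
    """Return True if the mean of the last `window` samples is within
    `tol` relative tolerance of the previous window's mean."""
    if len(samples) < 2 * window:
        return False  # not enough data to compare two windows
    prev = sum(samples[-2 * window:-window]) / window
    last = sum(samples[-window:]) / window
    return abs(last - prev) <= tol * max(prev, 1e-9)
```

With a check like this, a flat or plateauing memory curve passes, while the steadily growing curves seen for `dynamic` and `dynamic-nms` fail.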
**Logs**
Not applicable; no explicit errors are present.
**Configuration**
- OpenVINO Model Server 2023.1.d789fb785, OpenVINO backend 2023.1.0.12185.9e6b00e51cd
- Checked on 12th Gen Intel(R) Core(TM) i9-12900H and on AMD Ryzen 9 5950X 16-Core Processor
- Config and models included in the archive
@darkestpigeon Thank you for your report. We have tested the models and scripts from the attachments. Here is a 4-day overview (with some breaks) of docker container memory usage + RPS:
How long was your testing workload?