
Memory leak in Pipeline() on a CPU

Open navotoz opened this issue 2 years ago • 3 comments

Hello,

I initialized the model like so:

```python
nlp = Pipeline('english', gpu=False, cache_dir='./cache')
```

Then called it with:

```python
with torch.no_grad():
    for idx in range(10000):
        nlp.lemmatize('Hello World', is_sent=True)
```

While this code runs, RAM usage slowly grows.

I attached a graph of the memory filling up.
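A graph like that can be reproduced by sampling the process's resident memory inside the loop. A minimal, hedged sketch (trankit is not imported here; a deliberately retained buffer stands in for the leaky pipeline so the harness itself runs anywhere):

```python
# Sketch: record per-iteration memory growth using only the stdlib.
import resource

def rss_kb():
    # Peak resident set size of this process (kilobytes on Linux,
    # bytes on macOS) -- monotone, so it reveals cumulative growth.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

retained = []   # stand-in for whatever nlp.lemmatize(...) retains per call
samples = []
for idx in range(5):
    retained.append(bytearray(10 * 1024 * 1024))  # simulate ~10 MB kept per call
    samples.append(rss_kb())

# A steadily rising series like `samples` is what the attached graph shows.
print(samples)
```

In the real reproduction, the `bytearray` line would be replaced by the `nlp.lemmatize('Hello World', is_sent=True)` call from above.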

I'm using python3.7, trankit=1.1.0, torch=1.7.1.

Thank you!

navotoz avatar May 31 '22 11:05 navotoz

I can confirm: when running on CPU, memory consumption keeps increasing. @navotoz, could you please tell me whether you have been able to solve this issue?

olegpolivin avatar Sep 02 '22 10:09 olegpolivin

Hi @navotoz, I confirm this issue also appears with Python 3.7, trankit 1.1.1, torch 1.8.1+cu101.

Dielianss avatar Sep 05 '22 01:09 Dielianss

Hi @Dielianss @olegpolivin, thanks for the comments. We managed to mitigate this issue by running inference in a Docker container and restarting it at a fixed interval. This is not a real fix for the leak, but at least we can work with the model.
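For anyone wanting to copy the workaround, a sketch of one way to set it up (the image name `my-trankit-image`, the container name `trankit-worker`, and the hourly interval are assumptions for illustration, not details from this thread):

```shell
# Run the inference service in a container that comes back up after restarts.
docker run -d --name trankit-worker --restart unless-stopped my-trankit-image

# crontab entry: restart the container at the top of every hour,
# releasing whatever memory the leak has accumulated so far.
# 0 * * * * docker restart trankit-worker
```

The trade-off is a brief outage at each restart (the model reloads from `cache_dir`), so the interval should be chosen long enough that restarts are rare but short enough that the host never hits its memory limit.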

navotoz avatar Sep 05 '22 02:09 navotoz