
[DistEnv] All outputs loaded in memory before bulk write to database

Open kartik4949 opened this issue 1 year ago • 1 comments

When a model predicts over, say, 1 million data points, the predict method in components/model.py stores the outputs for all of those data points in a single list, outputs, which will cause an OOM error once it exceeds available memory.

Refer to: superduper/components/model.py, predict method.

The same thing happens with model inputs: all inputs are loaded into memory before being passed to the model, e.g. packed into a DataLoader (refer to: ext/torch/model.py, _predict method).

  • [x] #1627

We need to chunk the model inputs in the database, iterate over the chunks, and pass one chunk at a time to the model for prediction.
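A minimal sketch of the chunking idea, not the actual superduper implementation. The names iter_chunks, predict_in_chunks, predict_batch and write_batch are hypothetical: predict_batch stands in for the model's batch prediction and write_batch for the bulk write to the database. The point is only that at most one chunk of inputs and outputs is held in memory at a time.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


def iter_chunks(items: Iterable[Any], chunk_size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most chunk_size items from a lazy iterable."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(items)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def predict_in_chunks(
    inputs: Iterable[Any],
    predict_batch: Callable[[List[Any]], List[Any]],  # hypothetical: model batch prediction
    write_batch: Callable[[List[Any]], None],  # hypothetical: bulk write to the database
    chunk_size: int = 1000,
) -> int:
    """Predict and write one chunk at a time; return the number of outputs written."""
    total = 0
    for chunk in iter_chunks(inputs, chunk_size):
        outputs = predict_batch(chunk)  # only this chunk's outputs are in memory
        write_batch(outputs)  # flush before reading the next chunk
        total += len(outputs)
    return total


# Usage with stand-ins: a generator plays the role of a database cursor.
written: List[int] = []
batch_sizes: List[int] = []


def fake_write(batch: List[int]) -> None:
    batch_sizes.append(len(batch))
    written.extend(batch)


total_written = predict_in_chunks(
    inputs=(i for i in range(10)),
    predict_batch=lambda xs: [2 * x for x in xs],
    write_batch=fake_write,
    chunk_size=4,
)
```

If inputs is a lazy cursor over the database rather than a materialized list, this pattern bounds peak memory by chunk_size instead of by the full dataset size. The right chunk size would depend on the model and the datatype of the outputs.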

kartik4949, Dec 30 '23 19:12