[DistEnv] All outputs loaded in memory before bulk write to database
When the model predicts over, say, 1 million data points, the `predict` method accumulates the outputs for all 1 million data points in a single `outputs` list, which will OOM once it exceeds available memory (refer: `superduper/components/model.py`, `predict` method).

The same thing happens with the model inputs: all inputs are loaded into memory before being passed to the model, e.g. packed into a `DataLoader` (refer: `ext/torch/model.py`, `_predict` method). A schematic of this pattern is sketched below.
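The following toy example illustrates the shape of the problem only; it is not the actual superduper code, and the names are placeholders. Both the inputs and the outputs are fully materialised in memory before anything could be written out:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy schematic of the pattern described above (illustrative only,
# not the actual superduper code).
inputs = torch.randn(1_000_000, 16)                  # all inputs loaded up front
loader = DataLoader(TensorDataset(inputs), batch_size=256)

model = torch.nn.Linear(16, 4)
outputs = []
with torch.no_grad():
    for (batch,) in loader:
        outputs.append(model(batch))                 # every batch result is kept
outputs = torch.cat(outputs)                         # all 1M outputs held in RAM
# ...only after this point would a single bulk write to the database happen
```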
- [x] #1627
We need to chunk the model inputs at the database level, iterate over the chunks, and pass each chunk to the model for prediction, writing each chunk's outputs back to the database before loading the next one. A hedged sketch of this chunked flow follows.
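A minimal sketch of the proposed flow, not superduper's API: `input_iter`, `write_outputs`, and the per-item `model(x)` call are placeholder assumptions standing in for the database cursor, the datalayer insert, and the model's predict call respectively. Only one chunk of inputs and outputs is ever held in memory at a time:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def predict_in_chunks(model, input_iter, write_outputs, chunk_size=10_000):
    """Run prediction chunk by chunk and flush each chunk's outputs
    immediately, so memory usage is bounded by `chunk_size`."""
    for chunk in chunked(input_iter, chunk_size):
        outputs = [model(x) for x in chunk]   # or a single batched predict call
        write_outputs(outputs)                # flush this chunk to the database
        # `chunk` and `outputs` are dropped here and can be garbage collected
```

With this shape, `input_iter` could be the database cursor itself so that inputs are streamed rather than pre-loaded, and `write_outputs` could wrap the existing bulk-insert path so each chunk is persisted as soon as it is computed.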