
Using CPU as the default device in the Discretizer to avoid wasting memory in the rank 0 GPU

Open ivanprado opened this issue 1 year ago • 0 comments

Using CPU as the default device.

The models are initialized well before init_process_group() is invoked. The Discretizer is placed on the GPU at initialization, and at that stage the current device is always the first GPU in the node. As a result, some CUDA buffers are allocated on that GPU and are not freed when the model is later moved to its correct device. Initializing the Discretizer on the CPU fixes the problem.
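For illustration, here is a minimal sketch of the pattern described above. It is not the repository's actual code: this `Discretizer` is a hypothetical stand-in that only holds a tensor of bin edges, and `pick_device` plus the use of the `LOCAL_RANK` environment variable (as set by `torchrun`) are assumptions about how the target device would be chosen. The point is only that the module is built on the CPU and moved once the process knows its rank.

```python
import os

import torch
from torch import nn


class Discretizer(nn.Module):
    """Hypothetical stand-in: maps continuous values to bin indices."""

    def __init__(self, num_bins=16, low=0.0, high=1.0, device="cpu"):
        super().__init__()
        # Build the bin edges on the requested device (CPU by default),
        # so constructing the module does not touch CUDA at all.
        edges = torch.linspace(low, high, num_bins + 1, device=device)
        # Registering the tensor as a buffer lets .to(device) move it later.
        self.register_buffer("edges", edges)

    def forward(self, x):
        # Use interior edges only, so indices fall in [0, num_bins - 1].
        return torch.bucketize(x, self.edges[1:-1])


def pick_device():
    # Assumption: LOCAL_RANK is provided by the launcher (e.g. torchrun);
    # fall back to 0 when the script is run standalone.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    if torch.cuda.is_available():
        return torch.device("cuda", local_rank)
    return torch.device("cpu")


# Constructed early, before any process group exists: lives on the CPU.
discretizer = Discretizer()

# ... torch.distributed.init_process_group(...) would run here ...

# Moved only once the process knows which device it owns.
discretizer = discretizer.to(pick_device())
```

With this arrangement, no CUDA context is created on the first GPU by processes that will end up using a different one.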

ivanprado • Jun 29 '23 14:06