
Out of memory when using "next(self.scenario_generator)"

AlniyatRui opened this issue 1 year ago • 1 comment

Hi, Very Nice Work!!

I'm currently working on a simple imitation-learning architecture. However, I ran into an issue with the following code while building my dataset. Specifically, when I call "data = next(self.scenario_generator)", GPU memory usage grows rapidly (1 GB --> 20 GB+), and I'm not sure how to resolve it.

I would appreciate any help, Thanks!

Here is my data-processing code. I tried methods like "torch.from_numpy(jax.device_get(log_trajectory.yaw))" to move the scenario data off the GPU.

import torch

def custom_collate_fn(batch):
    ...
    return batch_data

class ScenarioDataset(torch.utils.data.Dataset):
    def __init__(self, scenario_generator):
        self.scenario_generator = scenario_generator
        self.num_samples = 487001

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        # idx is ignored: each call just pulls the next scenario
        # from the (stateful) generator.
        data = next(self.scenario_generator)
        return data
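One way to keep the dataset from pinning GPU buffers is to copy every array to host memory before returning it from `__getitem__`. The helper below is a hypothetical sketch (the name `to_host` and the assumption that scenario fields are arrays, or nested dicts/lists of arrays, are mine, not Waymax's); `np.asarray` on a JAX array forces a device-to-host copy.

```python
import numpy as np

def to_host(obj):
    # Recursively convert arrays to host-side NumPy arrays.
    # np.asarray on a JAX array triggers a device-to-host copy,
    # so the returned structure holds no GPU buffers.
    if isinstance(obj, dict):
        return {k: to_host(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(to_host(v) for v in obj)
    return np.asarray(obj)
```

In `__getitem__` you would then return `to_host(next(self.scenario_generator))` instead of the raw JAX structure, so only NumPy arrays ever reach the DataLoader.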

AlniyatRui avatar Dec 11 '24 08:12 AlniyatRui

My GPU is an A30 with 24 GB of memory. I tried the following code to avoid the out-of-memory error, but it still occupies about 18 GB.

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    for device in physical_devices:
        tf.config.experimental.set_memory_growth(device, True)
else:
    print("No GPU found!")

Thanks for any help.

AlniyatRui avatar Dec 11 '24 14:12 AlniyatRui

TensorFlow will by default allocate ~all of your GPU memory. See e.g. https://stackoverflow.com/questions/34199233/how-to-prevent-tensorflow-from-allocating-the-totality-of-a-gpu-memory for more info.
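As a sketch of the usual workaround: both TensorFlow's and JAX's GPU preallocation can be disabled with environment variables, provided they are set before either library is imported (variable names are from the respective official docs; whether this alone is enough depends on the pipeline's real working set).

```python
import os

# Must be set BEFORE tensorflow / jax are imported anywhere in the process.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"       # TF: grow allocations on demand
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"  # JAX: don't preallocate most of the GPU
```

If the TensorFlow side (the dataset reader) never needs the GPU at all, calling `tf.config.set_visible_devices([], 'GPU')` early on hides the GPU from TensorFlow entirely, leaving the memory to JAX.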

samuela avatar Jun 17 '25 21:06 samuela