nobrainer suggested refactoring to avoid OOM errors

Open hvgazula opened this issue 10 months ago • 1 comment

https://github.com/neuronets/nobrainer/blob/cb855feaadd4ac354e1e2d1c760a649df3f61ab4/nobrainer/dataset.py#L249-L250

Suggestion:

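def scalar_labels(self):
    # Map each (features, label) pair to a single bool flag instead of
    # collecting the label volumes themselves.
    temp_ds = self.dataset.map(
        lambda _, y: tf.experimental.numpy.isscalar(y),
        deterministic=False,
        num_parallel_calls=AUTOTUNE,
    )
    # Only the per-element flags are gathered, then reduced to one bool.
    return tf.math.reduce_all(list(temp_ds.as_numpy_iterator())).numpy()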

Notes:

1. The previous snippet collects all label volumes into a list and only then applies _labels_all_scalar. Holding every volume in memory at once is a memory hog, and hence the reason for the OOM.
2. The refactored snippet maps each label volume through an isscalar function, which returns a bool flag. The collected bool flags are then reduced to one final bool flag.
3. The other (naive) approach is to run nobrainer.tfrecord._is_int_or_float() on each element of the dataset in a for loop and then reduce all the bool flags (similar to step 2).
4. I am unsure why the GPU utilization is non-zero during this operation.
5. I still maintain that the repeat should be delayed until after this operation. Otherwise the check iterates over the entire repeated dataset, which is undesirable.
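
For illustration, here is a minimal sketch of notes 3 and 5 on a toy dataset. The helper name `all_labels_scalar` and the toy tensors are hypothetical, and a static rank check stands in for nobrainer.tfrecord._is_int_or_float(), so treat this as an assumption-laden sketch rather than the nobrainer implementation. It assumes eager execution and a finite dataset of (features, label) pairs.

```python
import tensorflow as tf


def all_labels_scalar(dataset):
    """Return True if every label in a finite (features, label) dataset has rank 0.

    Hypothetical helper: only one element is held in memory at a time, and
    iteration stops at the first non-scalar label.
    """
    for _, label in dataset:
        if label.shape.rank != 0:
            return False
    return True


# Toy stand-ins for real data (assumed shapes, not taken from nobrainer).
features = tf.zeros([4, 8, 8, 8])
scalar_ds = tf.data.Dataset.from_tensor_slices((features, tf.constant([0, 1, 0, 1])))
volume_ds = tf.data.Dataset.from_tensor_slices((features, tf.zeros([4, 8, 8, 8])))

# Run the check on the un-repeated dataset first ...
scalar_result = all_labels_scalar(scalar_ds)
volume_result = all_labels_scalar(volume_ds)

# ... and only then apply repeat, so the check never sees repeated elements.
train_ds = scalar_ds.repeat(2)
```

Running the check before `repeat` keeps the number of inspected elements equal to the size of the original dataset, and the early return means a dataset of label volumes is rejected after its first element.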

Caveat:

The suggestion uses a tf.experimental function, which may (or may not) be deprecated or changed in a future release. @satra what are your thoughts on using experimental features in the nobrainer API?
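
If experimental features turn out to be undesirable, one possible alternative is sketched below. It swaps tf.experimental.numpy.isscalar for a rank check built from `tf.rank`, and uses `Dataset.reduce` so that not even the list of flags is materialized. The function name `scalar_labels_stable` and the toy datasets are hypothetical, and this has not been tested against nobrainer's real datasets.

```python
import tensorflow as tf


def scalar_labels_stable(dataset):
    """Check that all labels are scalars using only non-experimental ops.

    Hypothetical sketch: expects a finite dataset of (features, label) pairs.
    """
    # One bool flag per element: True when the label has rank 0.
    flags = dataset.map(
        lambda _, y: tf.equal(tf.rank(y), 0),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    # Fold the flags into a single bool inside the tf.data pipeline.
    result = flags.reduce(
        tf.constant(True),
        lambda acc, flag: tf.logical_and(acc, flag),
    )
    return bool(result.numpy())


# Toy stand-ins for real data (assumed shapes, not taken from nobrainer).
toy_features = tf.zeros([4, 8, 8, 8])
toy_scalar_ds = tf.data.Dataset.from_tensor_slices(
    (toy_features, tf.constant([0, 1, 0, 1]))
)
toy_volume_ds = tf.data.Dataset.from_tensor_slices(
    (toy_features, tf.zeros([4, 8, 8, 8]))
)

stable_scalar_result = scalar_labels_stable(toy_scalar_ds)
stable_volume_result = scalar_labels_stable(toy_volume_ds)
```

Unlike the early-exit loop, the reduce visits every element, so it should still be applied before any repeat.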

Apr 01 '24 18:04 hvgazula
