Significant difference in RSS memory usage between TF1 and TF2
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
TF 2.13.1
Custom code
Yes
OS platform and distribution
Redhat Enterprise Linux 8.9
Mobile device
No response
Python version
3.11.4
Bazel version
5.4.0
GCC/compiler version
10.4
CUDA/cuDNN version
CUDA 12.2, cuDNN 8.9.5
GPU model and memory
A100 80GB
Current behavior?
When running the same Keras workload on TF1 and TF2, I'm seeing a significant increase in memory usage under TF2. This happens on both CPU and GPU. Under TF2, RSS climbs after almost every epoch, while under TF1 it stays essentially flat. See the per-epoch measurements below for TF2 vs. TF1:
# For TF2
Memory usage after epoch 0 [mem_usage = 3.41 GB]
Memory usage after epoch 1 [mem_usage = 3.88 GB]
Memory usage after epoch 2 [mem_usage = 3.88 GB]
Memory usage after epoch 3 [mem_usage = 4.32 GB]
Memory usage after epoch 4 [mem_usage = 4.81 GB]
Memory usage after epoch 5 [mem_usage = 5.26 GB]
Memory usage after epoch 6 [mem_usage = 5.70 GB]
Memory usage after epoch 7 [mem_usage = 6.14 GB]
Memory usage after epoch 8 [mem_usage = 6.70 GB]
Memory usage after epoch 9 [mem_usage = 7.15 GB]
Memory usage after epoch 10 [mem_usage = 7.36 GB]
Memory usage after epoch 11 [mem_usage = 7.36 GB]
Memory usage after epoch 12 [mem_usage = 7.36 GB]
Memory usage after epoch 13 [mem_usage = 7.37 GB]
Memory usage after epoch 14 [mem_usage = 7.37 GB]
Memory usage after epoch 15 [mem_usage = 7.37 GB]
Memory usage after epoch 16 [mem_usage = 7.37 GB]
Memory usage after epoch 17 [mem_usage = 7.59 GB]
Memory usage after epoch 18 [mem_usage = 7.81 GB]
Memory usage after epoch 19 [mem_usage = 7.81 GB]
# For TF1
Memory usage after epoch 0 [mem_usage = 5.13 GB]
Memory usage after epoch 1 [mem_usage = 5.14 GB]
Memory usage after epoch 2 [mem_usage = 5.14 GB]
Memory usage after epoch 3 [mem_usage = 5.15 GB]
Memory usage after epoch 4 [mem_usage = 5.15 GB]
Memory usage after epoch 5 [mem_usage = 5.15 GB]
Memory usage after epoch 6 [mem_usage = 5.15 GB]
Memory usage after epoch 7 [mem_usage = 5.15 GB]
Memory usage after epoch 8 [mem_usage = 5.15 GB]
Memory usage after epoch 9 [mem_usage = 5.15 GB]
Memory usage after epoch 10 [mem_usage = 5.15 GB]
Memory usage after epoch 11 [mem_usage = 5.15 GB]
Memory usage after epoch 12 [mem_usage = 5.15 GB]
Memory usage after epoch 13 [mem_usage = 5.15 GB]
Memory usage after epoch 14 [mem_usage = 5.15 GB]
Memory usage after epoch 15 [mem_usage = 5.15 GB]
Memory usage after epoch 16 [mem_usage = 5.15 GB]
Memory usage after epoch 17 [mem_usage = 5.15 GB]
Memory usage after epoch 18 [mem_usage = 5.15 GB]
Memory usage after epoch 19 [mem_usage = 5.15 GB]
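These figures are process RSS as reported by psutil (see the repro script below). To tell whether the per-epoch growth sits on the Python heap or in native allocations, one option is to compare tracemalloc's numbers against RSS after a forced garbage collection. The following is a diagnostic sketch added here for reference, not part of the original measurements:

import gc
import os
import tracemalloc

import psutil

tracemalloc.start()

# ... run one training epoch here, e.g. model.fit(X, y, epochs=1, ...) ...

gc.collect()  # discard unreachable Python objects before measuring
py_current, py_peak = tracemalloc.get_traced_memory()
rss = psutil.Process(os.getpid()).memory_info().rss
print('python heap = {:.1f} MiB, rss = {:.2f} GiB'.format(
    py_current / 1024 ** 2, rss / 1024 ** 3))
# If RSS keeps climbing while the Python-heap figure stays flat, the
# growth is happening in native (C/C++) allocations, not Python objects.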
Standalone code to reproduce the issue
import tensorflow as tf
import psutil
import time
import os


def mem_usage_str():
    # Report this process's resident set size (RSS) in GB.
    process = psutil.Process(os.getpid())
    gb = process.memory_info().rss / (1024. ** 3)
    return ' [mem_usage = {:5.2f} GB]'.format(gb)


if int(tf.__version__.split('.')[0]) < 2:
    # Patch to fix a TF1 / numpy >= 1.20 compatibility issue.
    from tensorflow.math import reduce_prod
    from tensorflow.python.ops import array_ops

    def _constant_if_small(value, shape, dtype, name):
        try:
            if reduce_prod(shape) < 1000:  # monkey patch
                return array_ops.constant(value, shape=shape, dtype=dtype,
                                          name=name)
        except TypeError:
            # Happens when shape is a Tensor, list with Tensor elements, etc.
            pass
        return None

    array_ops._constant_if_small = _constant_if_small
    # End of patch.
def build_model():
    inputs = [tf.keras.layers.Input(shape=(300, 6), name='input_layer')]
    current_layer = inputs[0]
    current_layer = tf.keras.layers.LSTM(
        50,
        dropout=0.1,
        recurrent_dropout=0.1,
        return_sequences=False,
        name='lstm',
    )(current_layer)
    current_layer = tf.keras.layers.Dense(1)(current_layer)
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    model = tf.keras.models.Model(inputs=inputs, outputs=current_layer)
    model.compile(loss='mse', optimizer=optimizer)
    return model
def run(model, X, y, n_epochs):
    tot_time = 0.
    print('Memory usage before training' + mem_usage_str())
    for i in range(n_epochs):
        start = time.time()
        model.fit(X, y, epochs=1, batch_size=4096, verbose=0)
        tot_time += time.time() - start
        print(f'Memory usage after epoch {i}' + mem_usage_str())
    print(f'Avg. time = {tot_time / n_epochs} seconds')


def run_example(p, n_epochs):
    import numpy as np
    model = build_model()
    X = np.random.randn(2 ** p, 300, 6)
    y = np.random.randn(2 ** p)
    run(model, X, y, n_epochs)
def main():
    run_example(
        16,  # 2 ** 16 samples
        20,  # 20 epochs
    )


# ------------------------------------------------------------------------------
if __name__ == "__main__":
    main()
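For comparison, the same training can be driven by a single fit() call with a callback doing the per-epoch logging, which separates growth that happens per epoch inside fit() from overhead of repeated fit() invocations. This is a variant sketch; MemLogger is a name introduced here and is not part of the original script:

class MemLogger(tf.keras.callbacks.Callback):
    """Print RSS after every epoch, mirroring the loop in run() above."""

    def on_epoch_end(self, epoch, logs=None):
        print(f'Memory usage after epoch {epoch}' + mem_usage_str())


def run_single_fit(model, X, y, n_epochs):
    # One fit() call covering all epochs, instead of n_epochs separate calls.
    model.fit(X, y, epochs=n_epochs, batch_size=4096, verbose=0,
              callbacks=[MemLogger()])

If the RSS growth still appears with this variant, it accumulates per epoch rather than per fit() invocation.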
Relevant log output
No response