High CPU usage, low GPU usage
Hi there
When I train textgenrnn on a text file it seems to progress fairly slowly (28ms/step), with high CPU usage (>40%) and low GPU usage (~10%). As I've got a fairly beefy GPU, a 2070, I'd have expected faster performance with the GPU taking more of the load. Is there any option to pass more of the work onto the GPU? It's recognized by TensorFlow, and I've got CUDA and cuDNN installed.
Thanks in advance
Hi, what parameters are you using? Some parameters, like batch_size, need to be increased for the GPU to be able to stretch its legs.
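For example, something along these lines (a rough sketch assuming you're training from a plain text file; the file name and epoch count are just placeholders):

from textgenrnn import textgenrnn

textgen = textgenrnn()

# A larger batch size keeps the GPU busy; try powers of two and watch VRAM usage.
textgen.train_from_file(
    'input.txt',      # placeholder path to your training text
    num_epochs=10,
    batch_size=1024,  # much larger than the default; raise it until you hit memory limits
)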
Hi there.
I've upped the batch size and this seems to have resolved the problem. Currently sitting at 2048 and it's far faster than before. Are there diminishing returns at some point for further increases here?
You can modify 3 lines of code to enable mixed precision in TensorFlow. This will make your RTX 2070 also use its Tensor Cores, which for me (on an RTX 2080 Ti) gave about a 2.2x speed increase.
You need to add these two lines right at the start of the textgenrnn_model function in model.py:
def textgenrnn_model(num_classes, cfg, context_size=None,
                     weights_path=None,
                     dropout=0.0,
                     optimizer=Adam(lr=4e-3)):
    '''
    Builds the model architecture for textgenrnn and
    loads the specified weights for the model.
    '''
    policy = mixed_precision.Policy('mixed_float16')
    mixed_precision.set_policy(policy)

    input = Input(shape=(cfg['max_length'],), name='input')
And further down in the same file you need to set the dtype:
output = Dense(num_classes, name='output', dtype='float32', activation='softmax')(attention)
Tensor Cores can only be used when your model's parameters are multiples of 8, so you also need to change some values from their defaults when you train a new model. Both max_length and dim_embeddings need to be changed (from the defaults of 40 and 100) to something that is a multiple of 8, such as 32 and 128, when you create your model.
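In code, creating the new model ends up looking roughly like this (a sketch; the exact way these keyword arguments are passed through may vary slightly between textgenrnn versions):

from textgenrnn import textgenrnn

textgen = textgenrnn()

# Train a fresh model with dimensions that are multiples of 8
# so the Tensor Cores can actually be used.
textgen.train_from_file(
    'input.txt',        # placeholder path
    new_model=True,
    num_epochs=10,
    batch_size=2048,
    max_length=32,      # default is 40
    dim_embeddings=128, # default is 100
)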
Thanks! I've also had the following error after training:
tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
Would there be a fix for this?
Yes, I believe this is a bug in the Keras Sequence multiprocessing implementation. It's fixed in the tensorflow-nightly builds and will be fixed in TensorFlow 2.2: https://github.com/tensorflow/tensorflow/issues/35100
Thank you for the quick response! I've edited model.py, but don't see any speed increase (still about 280ms/step). Would there be any particular reason for this?
I've attached my model.py in case I didn't edit it correctly. model.txt
Hey, you pasted too much; only these two lines should be inserted into the code:
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)
The other lines were just for orientation, to show where to paste them.
Thanks again. Now when I run textgen = textgenrnn() I get this error:
NameError: name 'mixed_precision' is not defined
Don't suppose there's something obvious I'm missing?
Ah, no, it's my fault.
I forgot that you also need to import mixed precision support.
At the top of the file, with all the other imports, add:
from tensorflow.keras.mixed_precision import experimental as mixed_precision
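If it helps, here's the whole pattern in isolation on a toy Keras model (not textgenrnn itself), just to show how the import, the policy and the float32 output layer fit together:

from tensorflow.keras.mixed_precision import experimental as mixed_precision
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Set the global policy before any layers are constructed
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)

inp = Input(shape=(16,))
hidden = Dense(64, activation='relu')(inp)  # computed in float16 under the policy
# Keep the final softmax in float32 for numerical stability
out = Dense(10, activation='softmax', dtype='float32')(hidden)

model = Model(inp, out)
model.compile(optimizer='adam', loss='categorical_crossentropy')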
It won't run on GPU for me with batch size 8096.
It's CPU only. Not everything is GPU.