High CPU usage, low GPU usage
Hi there
When I train textgenrnn on a text file it seems to progress fairly slowly (28ms/step), with high CPU usage (>40%) and low GPU usage (~10%). As I've got a fairly beefy GPU, a 2070, I'd have expected faster performance with the GPU taking more of the load. Is there any option to pass more of the work onto the GPU? It's recognized by TensorFlow, and I've got CUDA and cuDNN installed.
Thanks in advance
Hi, what parameters are you using? Some parameters, like batch_size, need to be increased for the GPU to be able to stretch its legs.
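For example, something along these lines (a rough sketch assuming you're training from a plain text file; the file name and epoch count are just placeholders):

from textgenrnn import textgenrnn

textgen = textgenrnn()

# A larger batch size keeps the GPU busy; try powers of two and watch VRAM usage.
textgen.train_from_file(
    'input.txt',      # placeholder path to your training text
    num_epochs=10,
    batch_size=1024,  # much larger than the default; raise it until you hit memory limits
)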
Hi there.
I've upped the batch size and this seems to have resolved the problem. Currently sitting at 2048 and it's far faster than before. Are there diminishing returns at some point for further increases here?
You can modify 3 lines of code to enable mixed precision in TensorFlow. This will make your RTX 2070 also use its Tensor Cores, which for me (on an RTX 2080 Ti) gave about a 2.2x speed increase.
You need to add these two lines right at the start of the textgenrnn_model function in model.py:
def textgenrnn_model(num_classes, cfg, context_size=None,
                     weights_path=None,
                     dropout=0.0,
                     optimizer=Adam(lr=4e-3)):
    '''
    Builds the model architecture for textgenrnn and
    loads the specified weights for the model.
    '''
    policy = mixed_precision.Policy('mixed_float16')
    mixed_precision.set_policy(policy)

    input = Input(shape=(cfg['max_length'],), name='input')
And further down in the same file you need to set the dtype:
output = Dense(num_classes, name='output', dtype='float32', activation='softmax')(attention)
Tensor Cores can only be used when your model's parameters are multiples of 8, so you also need to change some values from their defaults when you train a new model. Both max_length and dim_embeddings need to be changed (from the defaults of 40 and 100) to something that is a multiple of 8, such as 32 and 128, when you create your model.
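In code, creating the new model ends up looking roughly like this (a sketch; the exact way these keyword arguments are passed through may vary slightly between textgenrnn versions):

from textgenrnn import textgenrnn

textgen = textgenrnn()

# Train a fresh model with dimensions that are multiples of 8
# so the Tensor Cores can actually be used.
textgen.train_from_file(
    'input.txt',        # placeholder path
    new_model=True,
    num_epochs=10,
    batch_size=2048,
    max_length=32,      # default is 40
    dim_embeddings=128, # default is 100
)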
Thanks! I've also had the following error after training:
tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
Would there be a fix for this?
Yes, I believe this is a bug in the Keras Sequence multiprocessing implementation. It's fixed in the tensorflow-nightly builds and will be fixed in TensorFlow 2.2: https://github.com/tensorflow/tensorflow/issues/35100
Thank you for the quick response! I've edited model.py, but don't see any speed increase (still about 280ms/step). Would there be any particular reason for this?
I've attached my model.py in case I didn't edit it correctly. model.txt
Hey, you pasted too much; only these two lines should be inserted into the code:
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)
The other lines were just for orientation, to show where to paste them.
Thanks again. Now when I run textgen = textgenrnn() I get this error:
NameError: name 'mixed_precision' is not defined
Don't suppose there's something obvious I'm missing?
Ah, no, it's my fault.
I forgot that you also need to import mixed precision support.
At the top of the file, with all the other imports, add:
from tensorflow.keras.mixed_precision import experimental as mixed_precision
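If it helps, here's the whole pattern in isolation on a toy Keras model (not textgenrnn itself), just to show how the import, the policy and the float32 output layer fit together:

from tensorflow.keras.mixed_precision import experimental as mixed_precision
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Set the global policy before any layers are constructed
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)

inp = Input(shape=(16,))
hidden = Dense(64, activation='relu')(inp)  # computed in float16 under the policy
# Keep the final softmax in float32 for numerical stability
out = Dense(10, activation='softmax', dtype='float32')(hidden)

model = Model(inp, out)
model.compile(optimizer='adam', loss='categorical_crossentropy')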
It won't run on GPU for me with batch size 8096.
It's CPU only. Not everything is GPU.