
Getting a weird error

Open UsernameNotFound403 opened this issue 3 years ago • 3 comments

When implementing the LTCCell as in the examples, I get the following error:

2022-01-25 20:03:29.675637: F tensorflow/core/framework/tensor.cc:681] Check failed: IsAligned() ptr = 0x29613b360

My implementation looks like this:

wiring = kncp.wirings.FullyConnected(nn1lstm, 64)
ltc_cell = LTCCell(wiring)
model = Sequential()
model.add(RNN(ltc_cell, return_sequences=True, input_shape=(config.num_steps, config.input_size)))
model.add(Dense(config.output_size))
model.compile(optimizer=optimizer, loss=loss_function, metrics=['Accuracy'])
model.build()

I am using tensorflow-macos v2.7 on an M1 Mac with Metal. What can I do?

UsernameNotFound403, Jan 25 '22 19:01

This does not seem to be a bug in keras-ncp; it may originate somewhere else, because I cannot reproduce the error in Colab: https://colab.research.google.com/drive/10101d3RuRTJqStSlJpTcPtFKQFCkgBjQ?usp=sharing

mlech26l, Jan 26 '22 10:01

Yes, you are right. I found a post about this error saying that the Metal plugin is the problem. I uninstalled the tensorflow-metal package and it now works, but my model would need 2 hours per epoch. My LTCCell has 17898 trainable parameters, and the model has 20000 in total. Is there any way this could be patched so that it works with Metal?

UsernameNotFound403, Jan 26 '22 15:01
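
For readers who want the CPU-only behaviour described above without uninstalling tensorflow-metal, one possible approach is to hide GPU devices from TensorFlow at startup. This is only a sketch, not a fix for the Metal plugin itself: it assumes the plugin registers its device under the "GPU" device type, and `hide_gpus` is a hypothetical helper name, not part of keras-ncp or TensorFlow.

```python
def hide_gpus():
    """Hide all GPU devices from TensorFlow so that ops run on the CPU.

    Must be called before TensorFlow initializes its devices, i.e. before
    any tensor or model is created. Returns the visible device types
    afterwards, or None when TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list means "no device of this type is visible".
    # Assumption: the Metal plugin exposes its device as type "GPU".
    tf.config.set_visible_devices([], "GPU")
    return [d.device_type for d in tf.config.get_visible_devices()]


visible = hide_gpus()
print(visible)
```

Training then runs on the CPU, so the long epoch times mentioned above would be expected to remain.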

I have no experience with TF on Metal. Colab (with and without a GPU) is a good starting point for estimating what training times to expect.

The training times of LTCs are usually much longer (~10-100x) than those of standard RNN modules due to the use of an ODE solver.

mlech26l, Jan 28 '22 13:01
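
To illustrate why an ODE solver inside a recurrent cell adds cost, here is a toy sketch that counts nonlinearity evaluations for a plain RNN update versus a cell that takes several solver sub-steps per time step. It is not the solver used by keras-ncp: the explicit Euler scheme, the scalar state, the weights, and `unfolds=6` are all assumptions chosen for illustration.

```python
import math


class CountingTanh:
    """tanh wrapper that records how many times it was evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, z):
        self.calls += 1
        return math.tanh(z)


def run_rnn(inputs, act, w_in=0.5, w_rec=0.5):
    """Plain RNN: one activation evaluation per time step."""
    h = 0.0
    for x in inputs:
        h = act(w_in * x + w_rec * h)
    return h


def run_ode_cell(inputs, act, unfolds=6, dt=1.0, tau=1.0, w_in=0.5, w_rec=0.5):
    """Toy ODE cell: integrates dh/dt = -h/tau + act(w_in*x + w_rec*h)
    with `unfolds` explicit Euler sub-steps per time step."""
    h = 0.0
    sub_dt = dt / unfolds
    for x in inputs:
        # Sub-steps depend on each other, so they run sequentially.
        for _ in range(unfolds):
            h = h + sub_dt * (-h / tau + act(w_in * x + w_rec * h))
    return h


inputs = [0.1 * t for t in range(50)]
rnn_act, ode_act = CountingTanh(), CountingTanh()
run_rnn(inputs, rnn_act)
run_ode_cell(inputs, ode_act, unfolds=6)
cost_ratio = ode_act.calls / rnn_act.calls
print(rnn_act.calls, ode_act.calls, cost_ratio)
```

In this toy setting the work per time step grows linearly with the number of solver sub-steps. The real slowdown also depends on the cell's per-step arithmetic and on the backward pass, which the sketch does not model.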