
Neither LSTM nor RNN works (different errors)

Open erichiller opened this issue 3 years ago • 4 comments

I was running through the examples, particularly DigitRecognitionLSTM and DigitRecognitionRNN as well as the Keras version, and am unable to get any of them to work. The CNN example for image processing does work. I can open separate issues if you'd like, as the errors are quite different.

The code behind the three errors below is a straight copy of the aforementioned examples.

DigitRecognitionRNN

Unhandled exception. System.ArgumentNullException: Value cannot be null. (Parameter 'first')
   at System.Linq.ThrowHelper.ThrowArgumentNullException(ExceptionArgument argument)
   at System.Linq.Enumerable.Concat[TSource](IEnumerable`1 first, IEnumerable`1 second)
   at Tensorflow.Optimizer.compute_gradients(Tensor loss, List`1 var_list, Nullable`1 aggregation_method, GateGradientType gate_gradients, Boolean colocate_gradients_with_ops, Tensor grad_loss) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Training\Optimizer.cs:line 399
   at Tensorflow.Optimizer.minimize(Tensor loss, IVariableV1 global_step, List`1 var_list, GateGradientType gate_gradients, Nullable`1 aggregation_method, Boolean colocate_gradients_with_ops, String name, Tensor grad_loss) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Training\Optimizer.cs:line 116
   at mkmrk.ML.DigitRecognitionRNN.BuildGraph() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\DemoRNN.cs:line 80
   at mkmrk.ML.DigitRecognitionRNN.Run() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\DemoRNN.cs:line 59
   at mkmrk.ML.Program.Main(String[] args) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 89

This seems to be failing because no trainable variables are found.
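
The failure pattern suggested by this trace can be sketched in a few lines. This is a Python analogy, not TensorFlow.NET's code: `get_collection`, `gather_variables_unsafe`, `gather_variables_safe`, and the collection keys are hypothetical names, and the sketch assumes that looking up a variable collection that was never populated yields null rather than an empty list, which then reaches the concatenation step.

```python
from itertools import chain

# Hypothetical graph collections: nothing was ever registered as trainable.
collections = {"resource_variables": []}


def get_collection(key):
    """Return the collection stored under `key`, or None if it was never created."""
    return collections.get(key)


def gather_variables_unsafe():
    # Mirrors the failing pattern: the first operand of the concatenation may be None.
    first = get_collection("trainable_variables")
    second = get_collection("resource_variables")
    return list(chain(first, second))  # raises TypeError when `first` is None


def gather_variables_safe():
    # Treat a missing collection as empty, then fail with a descriptive message.
    first = get_collection("trainable_variables") or []
    second = get_collection("resource_variables") or []
    variables = list(chain(first, second))
    if not variables:
        raise ValueError("No trainable variables found; nothing to optimize.")
    return variables
```

With a guard of this kind, the caller would see a message about missing variables instead of a null-argument error raised from inside the concatenation.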

DigitRecognitionLSTM

Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at Tensorflow.tensor_util._ConstantValue(Tensor tensor, Boolean partial) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Tensors\tensor_util.cs:line 55
   at Tensorflow.tensor_util.constant_value(Tensor tensor, Boolean partial) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Tensors\tensor_util.cs:line 46
   at Tensorflow.Operations.rnn_cell_impl._concat(Tensor prefix, Int32 suffix, Boolean static) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\rnn_cell_impl.cs:line 29
   at Tensorflow.RnnCell.<>c__DisplayClass10_0.<_zero_state_tensors>b__0(Int32 s) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\RNNCell.cs:line 98
   at System.Linq.Enumerable.SelectListIterator`2.ToList()
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Tensorflow.Util.nest.map_structure[T](Func`2 func, T structure) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Util\nest.py.cs:line 518
   at Tensorflow.RnnCell._zero_state_tensors(Object state_size, Tensor batch_size, TF_DataType dtype) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\RNNCell.cs:line 96
   at Tensorflow.RnnCell.<>c__DisplayClass9_0.<zero_state>b__0(NameScope <p0>) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\RNNCell.cs:line 86
   at Tensorflow.Binding.tf_with[T](T py, Action`1 action) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Binding.Util.cs:line 180
   at Tensorflow.RnnCell.zero_state(Tensor batch_size, TF_DataType dtype) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\RNNCell.cs:line 84
   at Tensorflow.RnnCell.get_initial_state(Tensor inputs, Tensor batch_size, TF_DataType dtype) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\RNNCell.cs:line 72
   at Tensorflow.Operations.rnn.<>c__DisplayClass3_0.<dynamic_rnn>b__0(variable_scope scope) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\rnn.cs:line 219
   at Tensorflow.Binding.tf_with[TIn,TOut](TIn py, Func`2 action) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Binding.Util.cs:line 196
   at Tensorflow.Operations.rnn.dynamic_rnn(RnnCell cell, Tensor inputs_tensor, Tensor sequence_length, Tensor initial_state, TF_DataType dtype, Nullable`1 parallel_iterations, Boolean swap_memory, Boolean time_major) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Operations\NnOps\rnn.cs:line 197
   at mkmrk.ML.DigitRecognitionLSTM.BuildGraph() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\DemoLSTM.cs:line 107
   at mkmrk.ML.DigitRecognitionLSTM.Run() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\DemoLSTM.cs:line 67
   at mkmrk.ML.Program.Main(String[] args) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 52

This appears to be an issue with reshaping and the batch_size being invalid.

Keras LSTM

Unhandled exception. System.NotImplementedException
   at Tensorflow.Keras.Engine.Layer.Call(Tensors inputs, Tensor state, Boolean is_training) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.cs:line 161
   at Tensorflow.Keras.Layers.LSTM.Call(Tensors inputs, Tensor state, Boolean is_training) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Layers\LSTM.cs:line 34
   at Tensorflow.Keras.Engine.Layer.<>c__DisplayClass1_0.<Apply>b__0(NameScope scope) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.Apply.cs:line 49
   at Tensorflow.Binding.tf_with[T](T py, Action`1 action) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Binding.Util.cs:line 180
   at Tensorflow.Keras.Engine.Layer.Apply(Tensors inputs, Tensor state, Boolean is_training) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.Apply.cs:line 44
   at mkmrk.ML.LstmModelA.Call(Tensors inputs, Tensor state, Boolean is_training) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 221
   at Tensorflow.Keras.Engine.Layer.<>c__DisplayClass1_0.<Apply>b__0(NameScope scope) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.Apply.cs:line 49
   at Tensorflow.Binding.tf_with[T](T py, Action`1 action) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Binding.Util.cs:line 180
   at Tensorflow.Keras.Engine.Layer.Apply(Tensors inputs, Tensor state, Boolean is_training) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.Apply.cs:line 44
   at mkmrk.ML.LstmModelA.Optimize(Tensor x, Tensor y) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 245
   at mkmrk.ML.TestRun.Train() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 145
   at mkmrk.ML.TestRun.Run() in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 123
   at mkmrk.ML.Program.Main(String[] args) in C:\Users\eric\dev\src\github.com\erichiller\mkmrk\src\ML\Demo.cs:line 86

This seems to occur because vendor\TensorFlow.NET\src\TensorFlowNET.Core\Keras\Layers\LSTM.cs does not implement the required method (Call in TensorFlow.NET\src\TensorFlowNET.Core\Keras\Engine\Layer.cs, line 160). I am guessing that this part of the library is not yet complete?
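
The call chain in this trace (Apply, then LSTM.Call, then Layer.Call throwing) matches a common stub pattern, sketched here in Python as an analogy. The class bodies are hypothetical and are not the library's source; only the shape of the call chain is taken from the trace.

```python
class Layer:
    """Hypothetical base layer: apply() dispatches to call(), which subclasses must implement."""

    def apply(self, inputs):
        return self.call(inputs)

    def call(self, inputs):
        raise NotImplementedError(f"{type(self).__name__} does not implement call()")


class Dense(Layer):
    """A layer with a real call() implementation (doubling stands in for real math)."""

    def call(self, inputs):
        return [2 * x for x in inputs]


class LSTM(Layer):
    """A stub layer: call() only delegates to the base class, as in the trace above."""

    def call(self, inputs):
        return super().call(inputs)
```

Calling `Dense().apply([1, 2])` returns a result, while `LSTM().apply([1, 2])` raises `NotImplementedError` from the base class, which is the same symptom reported here.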

Let me know if I'm doing something wrong on my end too, or if I can help in some way.

erichiller avatar Nov 07 '20 01:11 erichiller

@erichiller You'll need to wait for those models to be fixed. Recently we've been focused on building CNN models with the Keras functional API.

Oceania2018 avatar Nov 08 '20 22:11 Oceania2018

Ok, thanks! Any idea on an ETA? Depending on the difficulty of the modifications, I could try to help out if it is just a matter of converting the same logic from Python to C#, but if massive changes are required, it will be beyond me. Is this the correct folder I should be looking in? Or should I just wait it out? It works in v0.15 if I'm not mistaken.

erichiller avatar Nov 09 '20 13:11 erichiller

@erichiller If possible, you should get familiar with porting unit tests from Google's native TensorFlow to TensorFlow.NET, as discussed here.

Then you will have the "helicopter" view of where the missing parts are and where the community can contribute in a complementary way.

GeorgeS2019 avatar Nov 23 '20 13:11 GeorgeS2019

@erichiller I think you can get started with just a simple unit test (from docs) like:

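import numpy as np
import tensorflow as tf

# A batch of 32 sequences, each with 10 time steps of 8 features.
inputs = np.random.random([32, 10, 8]).astype(np.float32)
simple_rnn = tf.keras.layers.SimpleRNN(4)
output = simple_rnn(inputs)  # The output has shape `[32, 4]`.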
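
For reference when porting such a test, the shape behaviour can be reproduced with NumPy alone. This is a minimal sketch of a tanh recurrence that returns the last hidden state, which is what `SimpleRNN` does by default; the function name `simple_rnn_forward` and the random weights are assumptions for illustration, not Keras's initializers.

```python
import numpy as np


def simple_rnn_forward(inputs, kernel, recurrent_kernel, bias):
    """Run a tanh RNN over the time axis and return the last hidden state.

    inputs has shape [batch, time, features]; the result has shape [batch, units].
    """
    batch, time_steps, _ = inputs.shape
    units = kernel.shape[1]
    h = np.zeros((batch, units), dtype=inputs.dtype)
    for t in range(time_steps):
        # h_t = tanh(x_t W + h_{t-1} U + b)
        h = np.tanh(inputs[:, t, :] @ kernel + h @ recurrent_kernel + bias)
    return h


rng = np.random.default_rng(0)
inputs = rng.random((32, 10, 8)).astype(np.float32)
kernel = rng.standard_normal((8, 4)).astype(np.float32)            # input weights W
recurrent_kernel = rng.standard_normal((4, 4)).astype(np.float32)  # recurrent weights U
bias = np.zeros(4, dtype=np.float32)

output = simple_rnn_forward(inputs, kernel, recurrent_kernel, bias)
assert output.shape == (32, 4)
```

A ported unit test could start with the same shape assertion and add numerical comparisons later, once weights can be shared between the two implementations.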

Absolutely, we can help even if you open a PR with buggy code.

Oceania2018 avatar Dec 22 '20 15:12 Oceania2018

Closing because RNN and LSTM have since been implemented. Please reopen if you still have questions, thanks!

Wanglongzhi2001 avatar Nov 13 '23 16:11 Wanglongzhi2001