TensorFlow.NET
.NET Keras does not converge compared to Python Keras
This does not converge:
using static Tensorflow.KerasApi;
using static Tensorflow.tensorflow;
using Tensorflow;
using Tensorflow.NumPy;
var inputs = np.array(new float[,] { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } });
var outputs = np.array(new float[] { 0, 1, 1, 0 });
var model = keras.Sequential();
model.add(keras.layers.InputLayer(new Shape(2)));
model.add(keras.layers.Dense(4, keras.activations.Tanh));
model.add(keras.layers.Dense(4, keras.activations.Tanh));
model.add(keras.layers.Dense(1, keras.activations.Sigmoid));
model.compile(keras.optimizers.SGD(0.1f), keras.losses.MeanSquaredError(), new [] { "mae" });
model.fit(inputs, outputs, epochs: 1000);
var pred_outputs = model.predict(inputs);
foreach (var output in pred_outputs)
{
    Console.WriteLine(string.Join(",", output.ToArray<float>()));
}
while this equivalent does:
import tensorflow as tf
import timeit
import numpy as np
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    print(
        '\n\nThis error most likely means that this notebook is not '
        'configured to use a GPU. Change this in Notebook Settings via the '
        'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
    raise SystemError('GPU device not found')

def gpu():
    with tf.device('/device:GPU:0'):
        model = tf.keras.Sequential()
        model.add(tf.keras.Input(shape=2))
        model.add(tf.keras.layers.Dense(4, activation="tanh"))
        model.add(tf.keras.layers.Dense(4, activation="tanh"))
        model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
        model.compile(tf.keras.optimizers.SGD(0.1), loss="mse", metrics=["mae"])
        model.fit([[1,1], [1,0], [0,1], [0,0]], [[0],[1],[1],[0]], epochs=1000)
        print(model.predict([[1,1], [1,0], [0,1], [0,0]]))

# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
gpu()
Dependencies:
<PackageReference Include="NumSharp" Version="0.30.0" />
<PackageReference Include="SciSharp.TensorFlow.Redist" Version="2.10.0" />
<PackageReference Include="SciSharp.TensorFlow.Redist-Windows-GPU" Version="2.10.0" />
<PackageReference Include="TensorFlow.Keras" Version="0.10.0" />
<PackageReference Include="TensorFlow.NET" Version="0.100.0" />
I ran both snippets and got loss=0.000765 after the 1000th epoch in C# and loss=0.0122 in Python.
So for me it seems to converge in C# but not in Python. Is it the same on your device?
In any case, it seems that something in tf.net.keras does not align with tf.keras.
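For anyone who wants a framework-independent baseline to compare both runs against, here is a minimal sketch of the same 2-4-4-1 network trained with plain full-batch gradient descent in pure Python. It is not how either library implements training; the helper names (`init_layer`, `forward`, `sgd_step`), the fixed seed, and the Glorot-uniform/zero-bias initialization (the tf.keras `Dense` default, assumed here to apply to TensorFlow.NET as well) are my own choices for illustration. The exact final loss depends on the seed, so only the trend is meaningful.

```python
import math
import random

# XOR data, same four samples as in the issue.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0.0, 1.0, 1.0, 0.0]

def init_layer(rng, n_in, n_out):
    # Glorot-uniform weights and zero biases (assumed initialization).
    limit = math.sqrt(6.0 / (n_in + n_out))
    w = [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def forward(layers, x):
    # Returns the activations of every layer, with the input as acts[0].
    acts = [x]
    for i, (w, b) in enumerate(layers):
        z = [sum(wi * a for wi, a in zip(row, acts[-1])) + bi for row, bi in zip(w, b)]
        last = i == len(layers) - 1
        # tanh on hidden layers, sigmoid on the output layer.
        acts.append([1 / (1 + math.exp(-v)) if last else math.tanh(v) for v in z])
    return acts

def mse(layers):
    return sum((forward(layers, x)[-1][0] - y) ** 2 for x, y in zip(X, Y)) / len(X)

def sgd_step(layers, lr):
    # Accumulate the full-batch gradient of the MSE, then update in place.
    grads = [([[0.0] * len(row) for row in w], [0.0] * len(b)) for w, b in layers]
    for x, y in zip(X, Y):
        acts = forward(layers, x)
        delta_a = [2.0 * (acts[-1][0] - y) / len(X)]  # dLoss/d(output)
        for i in reversed(range(len(layers))):
            out = acts[i + 1]
            last = i == len(layers) - 1
            # Chain rule through the activation: sigmoid' or tanh'.
            dz = [d * (o * (1 - o) if last else 1 - o * o) for d, o in zip(delta_a, out)]
            w, _ = layers[i]
            gw, gb = grads[i]
            for j, d in enumerate(dz):
                gb[j] += d
                for k, a in enumerate(acts[i]):
                    gw[j][k] += d * a
            # Gradient with respect to this layer's inputs.
            delta_a = [sum(dz[j] * w[j][k] for j in range(len(w))) for k in range(len(acts[i]))]
    for (w, b), (gw, gb) in zip(layers, grads):
        for j in range(len(w)):
            b[j] -= lr * gb[j]
            for k in range(len(w[j])):
                w[j][k] -= lr * gw[j][k]

rng = random.Random(0)  # arbitrary seed, only for repeatability
layers = [init_layer(rng, 2, 4), init_layer(rng, 4, 4), init_layer(rng, 4, 1)]
losses = [mse(layers)]
for _ in range(1000):
    sgd_step(layers, 0.1)
    losses.append(mse(layers))
print(f"initial={losses[0]:.6f} after 1 step={losses[1]:.6f} final={losses[-1]:.6f}")
```

Whatever the initialization, the loss should at least move between the first and second step. A run in which the loss stays identical to six decimals would suggest the weights are not being updated at all, rather than slow convergence.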
@AsakusaRinne In my local environment it does not converge at all:
2023-02-13 23:04:10.069351: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch: 001/1000
0001/0001 [==============================>] - 231ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 002/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 003/1000
0001/0001 [==============================>] - 3ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 004/1000
0001/0001 [==============================>] - 4ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 005/1000
0001/0001 [==============================>] - 7ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 006/1000
0001/0001 [==============================>] - 4ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 007/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 008/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 009/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 010/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 011/1000
0001/0001 [==============================>] - 4ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 012/1000
0001/0001 [==============================>] - 3ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 013/1000
0001/0001 [==============================>] - 3ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 014/1000
0001/0001 [==============================>] - 3ms/step loss: 0.247893, mean_absolute_error: 0.494114
Epoch: 015/1000
0001/0001 [==============================>] - 5ms/step loss: 0.247893, mean_absolute_error: 0.494114
....
Later, the predicted outputs are as follows: 0.5, 0.43938702, 0.47922894, 0.39507216
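One cheap check on the numbers above: recompute the MSE and MAE directly from the four final predictions and compare them with the values logged at epoch 1. This is only a sketch in plain Python. It assumes the predictions are listed in the same order as the inputs in the C# snippet (the XOR targets 0, 1, 1, 0 are symmetric, so the reverse order gives the same result).

```python
# Targets and final predictions as reported in this thread (XOR, four samples).
targets = [0.0, 1.0, 1.0, 0.0]
predictions = [0.5, 0.43938702, 0.47922894, 0.39507216]

# Per-sample errors, then the two metrics used in model.compile(...).
errors = [p - t for p, t in zip(predictions, targets)]
mse = sum(e * e for e in errors) / len(errors)
mae = sum(abs(e) for e in errors) / len(errors)

print(f"MSE = {mse:.6f}")  # compare with the logged loss: 0.247893
print(f"MAE = {mae:.6f}")  # compare with the logged mean_absolute_error: 0.494114
```

Both values agree with the log to six decimals. In other words, the predictions after training give the same loss that was reported at epoch 1, which is consistent with the parameters never being updated during `fit`, as opposed to training that is merely slow. It does not by itself say why the updates are missing.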
@jjaskulowski Could you please provide the version of tf.net and tf.net.keras you used and the device information?
@AsakusaRinne
The app is a simple, newly created C# (.NET Core) command-line app made in VS 2022 Pro.
<PackageReference Include="NumSharp" Version="0.30.0" />
<PackageReference Include="SciSharp.TensorFlow.Redist" Version="2.10.0" />
<PackageReference Include="SciSharp.TensorFlow.Redist-Windows-GPU" Version="2.10.0" />
<PackageReference Include="TensorFlow.Keras" Version="0.10.0" />
<PackageReference Include="TensorFlow.NET" Version="0.100.0" />
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.19044 N/A Build 19044
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: N/A
Registered Organization: N/A
Product ID: 00342-50478-94012-AAOEM
Original Install Date: 11/05/2022, 11:29:07
System Boot Time: 02/02/2023, 20:37:18
System Manufacturer: Dell Inc.
System Model: Latitude E7470
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
[01]: Intel64 Family 6 Model 78 Stepping 3 GenuineIntel ~2396 Mhz
BIOS Version: Dell Inc. 1.3.0, 14/02/2016
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume2
System Locale: en-gb;English (United Kingdom)
Input Locale: en-gb;English (United Kingdom)
Time Zone: (UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
Total Physical Memory: 16,267 MB
Available Physical Memory: 4,978 MB
Virtual Memory: Max Size: 21,007 MB
Virtual Memory: Available: 2,525 MB
Virtual Memory: In Use: 18,482 MB
Page File Location(s): C:\pagefile.sys
Domain: WORKGROUP
Logon Server: \\DESKTOP-CEEUBV8
Hotfix(s): 13 Hotfix(s) Installed.
[01]: KB5020872
[02]: KB5003791
[03]: KB5012170
[04]: KB5022282
[05]: KB5007273
[06]: KB5014032
[07]: KB5014035
[08]: KB5014671
[09]: KB5015895
[10]: KB5016705
[11]: KB5018506
[12]: KB5020372
[13]: KB5003242
It's quite confusing that I cannot reproduce it. Everything works fine in my local environment with the dependencies above. Could you please try the following steps to help us locate the error?
- Just rebuild your project and run again.
- Remove the package SciSharp.TensorFlow.Redist-Windows-GPU and try again.
BTW, could you please tell us your CUDA version and dotnet version? @jjaskulowski
@jjaskulowski Hey, have you solved it? I'm quite interested in it. :)
Nope, but I'm not sure where to find the data I've been asked for in this issue. I am also computing on the CPU, so I'm not sure how that is related to CUDA or DNN.
On Mon, 27 Feb 2023 at 20:08, Yaohui Liu @.***> wrote:
@jjaskulowski Hey, have you solved it? I'm quite interested in it. :)