TensorFlowSharp
WIP: gradient support
This requires the C API to get support for it.
There is a bug here: https://github.com/tensorflow/tensorflow/issues/6268
- [ ] Add port of the test suite from tensorflow/908d5b6ede6ae829dff138a873eec397ef434cd6
It seems gradient support was recently addressed :-) https://github.com/tensorflow/tensorflow/issues/6268
Not quite :-)
Still waiting on it.
Added the basic binding, but I have not written the tests, I updated the bug description to track that.
Additionally, per the tensorflow commit [0], not all capabilities from the C++ API have been surfaced yet.
[0] https://github.com/tensorflow/tensorflow/commit/908d5b6ede6ae829dff138a873eec397ef434cd6
Is it now possible to retrieve the gradients and do some form of gradient descent?
I would like to ask the same question: using the new pre-release package that has just been uploaded to NuGet (1.3.0-pre1), is it already possible to retrieve gradients for tensors and do any form of gradient descent? I am not currently up to date with the status of the C API support for this feature, so I guessed it would be easier to ask :-)
@migueldeicaza Hi, I started to verify AddGradients() API with code like this:
var x = graph.Const (3.0);
var y = graph.Square (x);
var y1 = graph.Square (y);
var y2 = graph.Square (y1);
var g = graph.AddGradients (new TFOutput [] { y, y2 }, new [] { x});
var r = session.Run (new TFOutput [] { }, new TFTensor [] { }, g);
double dy = (double)r [0].GetValue ();
double dy2 = (double)r [1].GetValue ();
Assert.Equal (17502.0, dy + dy2);
and got a couple of problems:
1. 'cstatus.Handle' should be passed to TF_AddGradients(), not the 'status.Handle' variable.
2. I'm expecting two results in that run, 6 (the y derivative) and 17496 (the y2 derivative), according to the documentation: d(y[0] + y[1] + ...)/dx[0], d(y[0] + y[1] + ...)/dx[1], but the API returns only one result, 6.
3. If the number of inputs > 1, the API fails with an access to unprotected memory, for instance: var g = graph.AddGradients (new TFOutput [] { y, y2 }, new [] { x, x2 });
Problem 1 is quite simple to fix; what do you think about 2 and 3? Is it a binding issue or the underlying native API? Thanks.
Hi @Dorokhov,
If I understood TensorFlow's documentation for AddGradients correctly, the return vector should have the same length as the inputs vector, therefore the answer would indeed have just one value (cf. the doc: "The partial derivatives are returned in dy. dy should be allocated to size nx.", where I understand that in this case nx would be the length of new [] { x }, which is therefore 1).
But maybe I am wrong, I am just starting with the gradients API. Let me see if I can help find where the problem is.
Regards, Cesar
Hi @cesarsouza,
I think you are right, and the API should produce one value: the partial derivative of the sum of the 'y' values, which is 17502 (6 + 17496) in my case, but it returns 6. I will try to test the same case with the native API.
Thanks.
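The expected value above can be checked analytically. A minimal plain-Python sketch (no TensorFlow; values chosen to match the test code earlier in the thread) applying the chain rule at x = 3:

```python
# Analytic check of the gradients in the test above.
# y = x^2, y1 = y^2 = x^4, y2 = y1^2 = x^8, evaluated at x = 3.
x = 3.0

dy_dx = 2 * x          # d(x^2)/dx = 2x   -> 6
dy2_dx = 8 * x ** 7    # d(x^8)/dx = 8x^7 -> 8 * 2187 = 17496

# AddGradients returns d(y[0] + y[1] + ...)/dx, i.e. the summed
# partials, so a single value of 17502 is the expected result:
total = dy_dx + dy2_dx
print(dy_dx, dy2_dx, total)  # 6.0 17496.0 17502.0
```

This matches the Assert.Equal (17502.0, dy + dy2) expectation, and explains why a correctly working AddGradients would return one summed value rather than 6 and 17496 separately.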
I'm struggling with getting simple models built in Python "productionized" into C#. This is what I see as a first-class use case for this library, but I am not seeing a good example of it.
1. Take a model built, trained, and saved in Python. Save off the *.pb file (from what I can gather, that is all I need).
2. In C#: create a Graph, a Session, input variables, input values, and outputs to run the model.
3. Run the model, then interrogate the output.
The most challenging part for me is constructing the input variables, input values, and outputs. The construction patterns and instance usages for those objects are befuddling me, to say the least.
I assume I am missing a pattern in the examples, or fighting the naming conventions, or... I don't know what I am missing. And I can't be the only person struggling with this.
The Python code is attached. It's a simple linear regression model I adapted from a Stanford class and made more portable.
Here are the samples I am trying to create. Happy to donate them to the cause when they are complete. FireTheftLinearRegression.py.txt
Hi @sqlBender,
I have to say I also share your pain, but I am not sure if the (actually very relevant) issue you have raised is connected to the original topic of this current issue, that is, the ability to obtain automatic gradient calculations through the AddGradients method.
If you want to take a look, the Keras Sharp project is aiming at providing an API that is very similar to its Python equivalent, but unfortunately that project is also still a bit blocked until this very issue here gets eventually addressed.
Regards, Cesar
@cesarsouza it's just where I found a similar issue about gradients. I'll spin up a new issue, since my issue is not really related to this thread.
Started discussing the gradient API issue in the tensorflow repository, hoping we will find how the API actually works soon.
Seems like TF doesn't have all gradient operations defined yet. I am referencing here an issue I've just created in TF's issue tracker regarding a missing gradient for tf.select.
Hi @migueldeicaza, @cesarsouza, the Linear Regression example can now run on another .NET binding library; check the code here.
Piece of code
// tf Graph Input
var X = tf.placeholder(tf.float32);
var Y = tf.placeholder(tf.float32);
// Set model weights
// We can set a fixed init value in order to debug
// var rnd1 = rng.randn<float>();
// var rnd2 = rng.randn<float>();
var W = tf.Variable(-0.06f, name: "weight");
var b = tf.Variable(-0.73f, name: "bias");
// Construct a linear model
var pred = tf.add(tf.multiply(X, W), b);
// Mean squared error
var cost = tf.reduce_sum(tf.pow(pred - Y, 2.0f)) / (2.0f * n_samples);
// gradient descent
// Note, minimize() knows to modify W and b because Variable objects are trainable=True by default
var optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost);
// Initialize the variables (i.e. assign their default value)
var init = tf.global_variables_initializer();
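To illustrate what GradientDescentOptimizer(learning_rate).minimize(cost) computes for this model, here is a plain-Python sketch with the gradients of the cost derived by hand. The data, learning rate, and iteration count are made-up values for the example; only the update rule reflects the snippet above.

```python
# Gradient descent for pred = W*X + b with
# cost = sum((pred - Y)^2) / (2 * n); gradients derived by hand:
#   d(cost)/dW = sum((pred - Y) * X) / n
#   d(cost)/db = sum(pred - Y) / n
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.0, 6.0, 8.0]     # synthetic data: Y = 2*X exactly
n = len(X)

W, b = -0.06, -0.73          # same initial values as the snippet above
learning_rate = 0.05

for _ in range(5000):
    err = [W * x + b - y for x, y in zip(X, Y)]
    dW = sum(e * x for e, x in zip(err, X)) / n
    db = sum(err) / n
    W -= learning_rate * dW
    b -= learning_rate * db

print(W, b)  # converges toward W = 2, b = 0 for this data
```

This is the machinery that minimize() builds into the graph via automatic differentiation, which is exactly what the AddGradients discussion earlier in this thread is about exposing from C#.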