
Feature request: Add new operations defined in Julia

Open xiuliren opened this issue 7 years ago • 16 comments

Thanks for this great package. Do all the operators need to be implemented in C++?

xiuliren avatar Mar 20 '17 14:03 xiuliren

Thanks! For now, they do. Google plans to eventually enable creating new operations on the fly from C (see https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/c_api.h#L922), at which point it should be possible to define new operations in Julia.

malmaud avatar Mar 20 '17 16:03 malmaud

Cool! Thanks for the quick response. Can we keep this issue open to track this capability?

xiuliren avatar Mar 20 '17 17:03 xiuliren

Sure.

malmaud avatar Mar 20 '17 17:03 malmaud

I don't really see the point in being able to create operators directly in Julia.

TensorFlow is more than Turing complete; any new operator one desires can easily be composed inside Julia out of the parts we already have (or, if required, out of parts we can add from the C API).

See for example here

In a practical sense we can define all the operations we want. The main advantage I can see to defining something as a proper operation, rather than just building it from parts, is that it would then be accessible to all language bindings of TensorFlow. But that only applies if it is in the C API, and so it would more or less need to be written in C++.
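To give a flavor of the compose-from-parts approach, here is a minimal sketch that uses only ops TensorFlow.jl already wraps (`*` for matmul, `+` for add); the function names are just illustrative:

```julia
using TensorFlow

# A "new op" is just a Julia function that composes existing TF ops;
# each call appends nodes to the graph and returns a Tensor.
affine(x, W, b) = W * x + b                 # matmul, then add
two_layer(x, W1, b1, W2, b2) = affine(affine(x, W1, b1), W2, b2)

sess = Session(Graph())
x = constant(randn(Float32, 4, 3))
W = constant(randn(Float32, 4, 4))
b = constant(zeros(Float32, 4, 3))
y = two_layer(x, W, b, W, b)                # builds the graph
run(sess, y)                                # the C library executes it
```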

oxinabox avatar Mar 21 '17 01:03 oxinabox

@oxinabox your example is pretty cool, but there is only one function in the session. Is it possible to compose multiple Julia functions, with or without TF functions, and use TF to schedule the functions in parallel?

xiuliren avatar Mar 21 '17 01:03 xiuliren

@jingpengwu I'm not sure I understand the question.

oxinabox avatar Mar 21 '17 01:03 oxinabox

@oxinabox Here is an example. There is only one Julia function in the TensorFlow session; is it possible to add more functions? After doing so, can we expect TensorFlow to build a computational graph of Julia functions and exploit parallelism inside the computational graph?

xiuliren avatar Mar 21 '17 02:03 xiuliren

It isn't exactly building a computational graph of Julia functions. The Julia functions (like all operations available in TensorFlow.jl) return Tensors, so they build the computational graph, which is handed to the C TensorFlow library to execute.

You can certainly use many Julia functions that return Tensors as part of one graph, e.g. here. The graph doesn't care how it is built.
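A minimal sketch of the idea (`branch_a`/`branch_b` are made-up names): each Julia function appends its own subgraph, and the runtime is then free to evaluate independent subgraphs in parallel:

```julia
using TensorFlow

# Two independent Julia "ops": each just appends nodes to the graph.
branch_a(x) = x .* x
branch_b(x) = x + x

sess = Session(Graph())
x = constant(Float32[1, 2, 3])
y = branch_a(x) + branch_b(x)   # independent subgraphs, joined here
run(sess, y)                    # TF may evaluate the branches in parallel
```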

oxinabox avatar Mar 21 '17 02:03 oxinabox

Your example is still based on TF functions, such as gather_nd, to manipulate the computation graph. Can we use a normal Julia function, such as apply_mask(V, mask) = V.*mask, as a node of the computation graph?

xiuliren avatar Mar 21 '17 02:03 xiuliren

Can we use a normal Julia function, such as apply_mask(V, mask) = V.*mask, as a node of the computation graph?

Generally, no. And as I said before, I'm not sure what it would be useful for. (Though I'm not sure it wouldn't be useful, either.)

Consider that V.*mask is in fact still a TF function, as you are calling it. It is defined around here; it is a wrapper around the TensorFlow C definition for multiply.

I don't think we are too far from the point where there is enough of this kind of thing that it becomes really hard to distinguish functions that are for Julia on AbstractArrays from functions that are for TensorFlow on Tensors.

It is approaching a nice and nearly transparent syntax. We are getting to the point where most indexing operations work, and now while loops mostly work. (This is some sweet stuff.)
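Concretely, the apply_mask from above already works on Tensors unchanged, because `.*` on Tensor arguments dispatches to TF's multiply (a sketch):

```julia
using TensorFlow

# `.*` on Tensors builds a TF multiply node rather than computing eagerly,
# so this "normal Julia function" is already a graph-building function.
apply_mask(V, mask) = V .* mask

sess = Session(Graph())
V    = constant(Float32[1, 2, 3])
mask = constant(Float32[1, 0, 1])
run(sess, apply_mask(V, mask))   # => Float32[1.0, 0.0, 3.0]
```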

oxinabox avatar Mar 21 '17 03:03 oxinabox

Oh, does this mean: this?

I think, maybe, just maybe, this could be done somehow via CXX.jl?

@stevengj mentioned this in another issue far away.

oxinabox avatar May 25 '17 16:05 oxinabox

Yep, it is adding an operator, but based on pure Julia rather than C++.

xiuliren avatar May 25 '17 17:05 xiuliren

Exactly. Once you can define an operator that calls back to an arbitrary pure-Julia function, then you potentially get a whole bunch of things (like fusing broadcasts) for free.

Note that in principle, you may only need to define one C++ op (or use CXX.jl) that stores a handle to a Julia callback function as internal state, and maybe another function to (optionally) compute the gradient. Then each time you have a new Julia function, you just instantiate a new instance of the Op and pass the Julia function in its constructor.
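A purely hypothetical sketch of what the Julia-facing side of that design might look like; `julia_op` and `invoke_callback_op` do not exist in TensorFlow.jl and stand in for the single generic C++ op described above:

```julia
# Hypothetical names throughout: the one generic C++ op would hold a
# handle to a Julia function as internal state; each new Julia function
# becomes a new *instance* of that op.
function julia_op(f; gradient=nothing)
    # In reality one would pass a C-callable handle to `f` (and to
    # `gradient`) into the generic op's constructor; elided here.
    return x -> invoke_callback_op(f, x; gradient=gradient)  # hypothetical
end

softclip = julia_op(x -> x ./ (1 .+ abs.(x)))
# y = softclip(some_tensor)   # would appear as one atomic node in the graph
```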

stevengj avatar May 25 '17 17:05 stevengj

A complication is that REGISTER_OP in TensorFlow requires you to specify the input and output types, which doesn't map well onto Julia functions (that may allow multiple types). However, you can at least register callback-based ops for a few common cases (integers or floating-point values in and out).
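On the Julia side one might then simply map element types onto the pre-registered variants (again, hypothetical op names):

```julia
# Hypothetical: suppose the C++ side pre-registered a callback op per
# common element type; the wrapper picks the variant via dispatch.
callback_op_name(::Type{Float32}) = "JuliaCallbackFloat32"
callback_op_name(::Type{Float64}) = "JuliaCallbackFloat64"
callback_op_name(::Type{Int64})   = "JuliaCallbackInt64"
```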

stevengj avatar May 25 '17 17:05 stevengj

There are a few practical problems with defining an atomic TensorFlow operator for arbitrary Julia functions:

  • We would need to distribute custom TensorFlow binaries if we want to use custom ops, until CXX.jl is supported robustly across everyone's Julia distributions.
  • AFAIK, there wouldn't be a way to plug Julia callbacks into TensorFlow's GPU or TPU backends, which cover the majority of the use cases.
  • Most users of TensorFlow.jl will want automatic gradients for implementing gradient descent. TensorFlow gives you automatic gradients for free when you represent a computation as a composite of TensorFlow operations; if you only have a single atomic TensorFlow operator with a Julia callback, then all the responsibility of automatic differentiation falls to TensorFlow.jl.

malmaud avatar May 25 '17 19:05 malmaud

Regarding gradients: with custom operations, my understanding is that TensorFlow allows you to supply a gradient function too; e.g., you could automate this with ForwardDiff applied to the broadcast operand. Then the rest of TensorFlow's gradient machinery would work when this operation is composed with other TensorFlow operations.
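For an elementwise op, the gradient callback reduces to the scalar derivative of the kernel, which ForwardDiff can supply automatically (a sketch; `grad_callback` is a made-up name):

```julia
using ForwardDiff

f(x) = x / (1 + abs(x))                 # example scalar kernel
f′(x) = ForwardDiff.derivative(f, x)    # automatic scalar derivative

# Hypothetical gradient callback: chain rule for y = f.(x), given the
# forward input x and the incoming gradient dy.
grad_callback(x, dy) = f′.(x) .* dy

grad_callback([1.0, -2.0], [1.0, 1.0])  # => [0.25, 0.111...]
```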

stevengj avatar May 25 '17 21:05 stevengj