minpy
Can I create a variable shared by forward and back propagation in customop('numpy')?
I do not quite understand the mechanism behind @customop('numpy'). In my operator there is an intermediate variable Q that is expensive to compute and is needed both for the output and for the gradients, so I would like to compute it only once. I also wonder whether I can define gradient functions for multiple parameters (e.g. w1 and w2 in the code below):
@customop('numpy')
def my_operator(X, w1, w2):
    Q = f(X, w1, w2)    # expensive intermediate result
    H = g1(Q)
    return H

def my_operator_grad1(ans, X, w1, w2):
    def grad1(g):
        Q = f(X, w1, w2)    # recomputed here ...
        R = g2(Q)
        return R
    return grad1

def my_operator_grad2(ans, X, w1, w2):
    def grad2(g):
        Q = f(X, w1, w2)    # ... and again here
        R = g3(Q)
        return R
    return grad2

my_operator.def_grad(my_operator_grad1, argnum=1)
my_operator.def_grad(my_operator_grad2, argnum=2)
Thanks!
@swanderingf One of the primary reasons for the customop wrapper is that some operations are only defined on the CPU. Say I have a function that uses such CPU-only operations. Without the customop hint, the input data could be placed on the GPU before the invocation; during execution, the intermediate data would then be copied from the GPU to the CPU just to run the CPU-only op, which hurts performance and defeats the purpose of keeping the input data on the GPU.
The customop decorator lets the user tell the system where to keep the input data, so some of the slow data copies between devices can be avoided.
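For illustration, here is a minimal sketch of how the wrapper is used in the NumPy case; the operator itself (a plain NumPy sigmoid) and its gradient are made up for the example, while customop('numpy') and def_grad are used the same way as in your snippet:

import numpy as np                  # plain NumPy, i.e. CPU-only computation
from minpy.core import customop

@customop('numpy')                  # tell minpy this op consumes NumPy (CPU) arrays
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(ans, x):
    # ans is the forward output; the closure maps the upstream gradient g
    # to the gradient with respect to x
    def grad(g):
        return g * ans * (1.0 - ans)
    return grad

sigmoid.def_grad(sigmoid_grad, argnum=0)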
Sharing the common computation between the gradient functions is supported in minpy by def_multiple_grad. You can rewrite the code as:
@customop('numpy')
def my_operator(X, w1, w2):
    Q = f(X, w1, w2)
    H = g1(Q)
    return H

def my_operator_grads(ans, X, w1, w2):
    def grad(g):
        Q = f(X, w1, w2)    # computed once and shared by both gradients
        return (g2(Q), g3(Q))
    return grad

# one gradient function registered for argument indices 1 (w1) and 2 (w2)
my_operator.def_multiple_grad(my_operator_grads, (1, 2))
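To make the pattern concrete, here is a self-contained sketch; the operator itself (a matrix product, a tanh, and another matrix product) is an illustrative stand-in for f, g1, g2 and g3, and def_multiple_grad is used exactly as above. The point is that the shared intermediate Q is computed once per backward pass and reused by both returned gradients:

import numpy as np
from minpy.core import customop

@customop('numpy')
def my_operator(X, w1, w2):
    Q = X.dot(w1)                   # stand-in for the expensive shared intermediate
    return np.tanh(Q).dot(w2)

def my_operator_grads(ans, X, w1, w2):
    def grad(g):
        Q = X.dot(w1)               # computed once per backward pass ...
        T = np.tanh(Q)
        dw1 = X.T.dot(g.dot(w2.T) * (1.0 - T ** 2))   # ... reused for the w1 gradient
        dw2 = T.T.dot(g)                               # ... and for the w2 gradient
        return (dw1, dw2)
    return grad

my_operator.def_multiple_grad(my_operator_grads, (1, 2))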