Request for implementing Keras activation functions
Keras has a few activation functions in its backend that we might want to implement in Phylanx. Phylanx already supports `softmax` and `tanh`, and we can add:
@taless474 would you mind adding links describing the actual math to implement for the primitives you listed?
@hkaiser, I added the link to their description in the Keras documentation, which contains their NumPy implementation. Do you need the actual math behind them from scholarly articles, too?
@taless474 those links are fine, thank you!
To support Keras, we can also add:
- [x] one_hot (#871), see the sketch after this list
- [x] categorical_crossentropy, implementation in NumPy backend (#893)
- [x] binary_crossentropy, implementation in NumPy backend (#945)
- [ ] sparse_categorical_crossentropy, implementation in CNTK backend
- [x] resize_images where the interpolation is bilinear, please find the example in the comments below (#908)
- [x] the equivalent of NumPy size (#867)
- [x] slice, its implementation in the NumPy backend (#876 and #921)
- [x] ctc_decode, implementation in NumPy backend (#938)
- [x] map_fn, foldl and foldr (#931, #932)
- [ ] placeholder
- [x] cast (#954 adds astype)
- [ ] shape, please refer to the comment below for explanation
- [x] batch_dot (#877)
- [x] l2_normalize (#881)
- [x] pool2d and pool3d (#902)
- [x] switch (#935)
- [x] in_top_k, refer to NumPy implementation (#963)
- [x] bias_add, NumPy implementation (#1031)
- [ ] rnn
- [x] conv1d (#955, #991)
- [x] conv2d (#969, #1042)
- [x] conv2d_transpose (#1051)
- [x] separable_conv1d (#993)
- [ ] separable_conv2d
- [ ] depthwise_conv2d
- [ ] conv3d
- [ ] conv3d_transpose
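As a point of reference for the checklist above, one_hot can be sketched in a few lines of NumPy (the function name and signature here are illustrative, not Phylanx's actual API):

```python
import numpy as np

def one_hot(indices, num_classes):
    # Illustrative sketch: rows of the identity matrix, indexed by the
    # class labels, give the one-hot encoding.
    return np.eye(num_classes)[np.asarray(indices)]

print(one_hot([0, 2, 1], 3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```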
Keras slice has the same functionality as TensorFlow slice. Given an nD input array, `start` and `size` are vectors of size n. To slice a matrix, we first find the element whose indices are given in `start` (in the following example `start` is `[0,2]`, i.e. the element in the 0th row and 2nd column, element 3) and use it as the top-left element of the output submatrix; then we slice a submatrix with the given `size` (in the following example the size is `[2,1]`, so the output is 2x1).
```python
import tensorflow as tf
from keras import backend as K

m = tf.constant([[1,2,3],[4,5,6]])
s = tf.slice(m, [0,2], [2,1])
K.get_value(s)  # [[3], [6]]
```
Slicing a 2x1 submatrix starting at the element in the 0th row and 2nd column, we get the 2x1 matrix `[[3], [6]]`.
The following is another example, this time with a 2x1x3 tensor. The element at position `(1,0,0)` is `4`, and we get a 1x1x3 tensor by slicing it.
```python
t = tf.constant([[[1,2,3]],[[4,5,6]]])
s = tf.slice(t, [1,0,0], [1,1,3])
K.get_value(s)  # [[[4, 5, 6]]]
```
@taless474 thank you for your clarifications. IIUC, this is equivalent to the 'normal' Python/NumPy slicing:
```python
import numpy as np
import tensorflow as tf

m = tf.constant([[1,2,3],[4,5,6]])
s = tf.slice(m, [0,2], [2,1])
# same as
m = np.array([[1,2,3],[4,5,6]])
s = m[0:2, 2:3]
```
Would you see a way to map that (as this is supported by PhySL - at least partially)?
@taless474 the same is true for:
```python
t = tf.constant([[[1,2,3]],[[4,5,6]]])
s = tf.slice(t, [1,0,0], [1,1,3])
```
which is equivalent to
```python
t = np.array([[[1,2,3]],[[4,5,6]]])
s = t[1:2, 0:1, 0:3]  # syntactic sugar for: s = t[(slice(1,2), slice(0,1), slice(0,3))]
```
To dynamically build this up we could do:
```python
import numpy as np

def _slice(x, start, size):
    # build one slice object per dimension from the start/size vectors
    indices = [slice(i, i + j) for i, j in zip(start, size)]
    return x[tuple(indices)]

t = np.array([[[1,2,3]],[[4,5,6]]])
print(_slice(t, [1,0,0], [1,1,3]))  # prints '[[[4 5 6]]]'
```
The Keras backend has two functions for shape. `shape` returns the symbolic shape of a tensor or variable. TensorFlow returns something like `<tf.Tensor 'Shape_8:0' shape=(2,) dtype=int32>`, which includes the name (`'Shape_8:0'`), the shape (which is the ndim before `eval` (`shape=(2,)`) and the actual shape after applying `eval`), and the dtype.
We should decide how we want to have this in the Keras backend.
```python
# TensorFlow example
>>> import numpy as np
>>> from keras import backend as K
>>> tf_session = K.get_session()
>>> val = np.array([[1, 2], [3, 4]])
>>> kvar = K.variable(value=val)
>>> inputs = K.placeholder(shape=(2, 4, 5))
>>> K.shape(kvar)
<tf.Tensor 'Shape_8:0' shape=(2,) dtype=int32>
>>> K.shape(inputs)
<tf.Tensor 'Shape_9:0' shape=(3,) dtype=int32>
# To get the integer shape, evaluate the tensor (or use K.int_shape(x) instead):
>>> K.shape(kvar).eval(session=tf_session)
array([2, 2], dtype=int32)
>>> K.shape(inputs).eval(session=tf_session)
array([2, 4, 5], dtype=int32)
```
`print_tensor` has similar functionality, but does not need `eval`:
```python
>>> K.print_tensor(kvar)
<tf.Tensor 'Print_3:0' shape=(2, 2) dtype=float32>
>>> K.print_tensor(inputs)
<tf.Tensor 'Print_4:0' shape=(2, 4, 5) dtype=float32>
```
I'm working on elu on my side; you can check the progress I've made so far on my fork. Most of the implementation is done; I need to test it now.
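For reference, elu can be sketched in NumPy as follows (a minimal sketch of the standard definition, not the implementation on the fork):

```python
import numpy as np

def elu(x, alpha=1.0):
    # Standard ELU: x for x > 0, alpha * (exp(x) - 1) otherwise.
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0, x, alpha * np.expm1(x))

print(elu(np.array([-2.0, 0.0, 3.0])))  # [-0.86466472  0.          3.        ]
```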
To enlarge images, Keras uses resize_images. Enlargement can be along the height, the width, or both, and is determined by `height_factor` and `width_factor`. `data_format` tells us which convention is used to store images (whether the channels dimension comes first or last).
There are two methods of filling in between points, `nearest` and `bilinear`: `nearest` copies the nearest data point's value (so it has the same functionality as `np.repeat`), while `bilinear` generates the new values by interpolating between two data points (similar to `np.linspace`).
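For instance, the `nearest` mode on a single 2D channel amounts to repeating rows and columns; a minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def resize_nearest_2d(img, height_factor, width_factor):
    # Nearest-neighbor upscaling: repeat each row and each column
    # factor times.
    out = np.repeat(img, height_factor, axis=0)
    return np.repeat(out, width_factor, axis=1)

img = np.array([[1, 2],
                [3, 4]])
print(resize_nearest_2d(img, 2, 2))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```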
In the following example, we want to enlarge the image with a `height_factor` of 2 and a `width_factor` of 2:
```python
import tensorflow as tf
from keras import backend as K

x = tf.constant([[[[  1,   2,   3,   4],
                   [  5,   6,   7,   8],
                   [  9,  10,  11,  12]],
                  [[ -1,  -2,  -3,  -4],
                   [ -5,  -6,  -7,  -8],
                   [ -9, -10, -11, -12]]]])
print(K.int_shape(x))

data_format = "channels_first"
height_factor, width_factor = 2, 2
interpolation = "bilinear"

a = K.resize_images(x, height_factor, width_factor, data_format, interpolation)
print(K.int_shape(a))
K.get_value(a)
```
```
(1, 2, 3, 4)
(1, 2, 6, 8)
array([[[[  1. ,   1.5,   2. ,   2.5,   3. ,   3.5,   4. ,   4. ],
         [  3. ,   3.5,   4. ,   4.5,   5. ,   5.5,   6. ,   6. ],
         [  5. ,   5.5,   6. ,   6.5,   7. ,   7.5,   8. ,   8. ],
         [  7. ,   7.5,   8. ,   8.5,   9. ,   9.5,  10. ,  10. ],
         [  9. ,   9.5,  10. ,  10.5,  11. ,  11.5,  12. ,  12. ],
         [  9. ,   9.5,  10. ,  10.5,  11. ,  11.5,  12. ,  12. ]],
        [[ -1. ,  -1.5,  -2. ,  -2.5,  -3. ,  -3.5,  -4. ,  -4. ],
         [ -3. ,  -3.5,  -4. ,  -4.5,  -5. ,  -5.5,  -6. ,  -6. ],
         [ -5. ,  -5.5,  -6. ,  -6.5,  -7. ,  -7.5,  -8. ,  -8. ],
         [ -7. ,  -7.5,  -8. ,  -8.5,  -9. ,  -9.5, -10. , -10. ],
         [ -9. ,  -9.5, -10. , -10.5, -11. , -11.5, -12. , -12. ],
         [ -9. ,  -9.5, -10. , -10.5, -11. , -11.5, -12. , -12. ]]]],
      dtype=float32)
```
`map_fn`, `foldl` and `foldr` have two common arguments, `fn` (a callable) and `elems`. Here is an example of how `foldl` is used:
```python
import numpy as np
from numpy.testing import assert_allclose
from phylanx import Phylanx, PhylanxSession, execution_tree

PhylanxSession.init(1)

def variable(value, dtype=None, name=None):
    return execution_tree.var(np.array(value, dtype=dtype))

def eval(func):
    return func.eval()

@Phylanx
def foldl_eager(fn, elems, initializer):
    # fold_left is the PhySL primitive
    return fold_left(fn, initializer, elems)

def foldl(fn, elems, initializer=None):
    return foldl_eager.lazy(fn, elems, initializer)

def test_foldl():
    x = np.random.rand(10, 3).astype(np.float32)
    kx = eval(foldl(lambda a, b: a + b, variable(x)))
    assert (3,) == kx.shape
    assert_allclose(x.sum(axis=0), kx, atol=1e-05)

test_foldl()
```
@taless474 The problem with this code is that it passes a pure Python function to the Phylanx decorator. We will have to think about how to support this kind of construct. I don't think we will be able to transform this into a PhySL expression at runtime (@rtohid ?).
What we perhaps could do is to allow for pure Python code to be executed from inside Phylanx, i.e. create a primitive that invokes some pre-compiled piece of Python (calls back into the Python interpreter).
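For comparison, the intended `foldl` semantics can be sketched in plain Python/NumPy (`foldl_reference` is an illustrative name; I assume `elems` is unpacked along its first axis):

```python
from functools import reduce
import numpy as np

def foldl_reference(fn, elems, initializer=None):
    # Fold fn over the first axis of elems, left to right; if no
    # initializer is given, start from the first slice.
    seq = list(elems)
    if initializer is None:
        initializer, seq = seq[0], seq[1:]
    return reduce(fn, seq, initializer)

x = np.random.rand(10, 3).astype(np.float32)
out = foldl_reference(lambda a, b: a + b, x)
print(out.shape)  # (3,) -- same result as x.sum(axis=0)
```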
Keras `switch` is basically an element-wise if-conditional. If the condition has fewer dimensions than `then` and `else`, the condition is broadcast (it is reshaped to `(oldshape, 1)`, `(oldshape, 1, 1)`, ... before broadcasting). Check the NumPy implementation here.
```python
from keras import backend as K

x = K.constant([1, -2])
then_expr = K.constant([[10, 20], [30, 40]])
else_expr = K.constant([[-10, -20], [-30, -40]])
a = K.switch(K.greater_equal(x, 0.5), then_expr, else_expr)
K.get_value(a)
```
results in:
```
array([[ 10.,  20.],
       [-30., -40.]], dtype=float32)
```
We have discussed this and concluded that `switch` is very similar to the already existing `where` primitive, except for the broadcasting rules applied to the conditional. Also, `switch` assumes that the two result arrays have the same shape.
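A minimal NumPy sketch of `switch` with this broadcasting rule (illustrative only, assuming the two result arrays share a shape):

```python
import numpy as np

def switch(condition, then_expr, else_expr):
    # Pad the condition with trailing axes of size 1 until its rank
    # matches the results, then let np.where broadcast element-wise.
    condition = np.asarray(condition, dtype=bool)
    then_expr = np.asarray(then_expr)
    while condition.ndim < then_expr.ndim:
        condition = np.expand_dims(condition, axis=-1)
    return np.where(condition, then_expr, else_expr)

x = np.array([1, -2])
print(switch(x >= 0.5, [[10, 20], [30, 40]], [[-10, -20], [-30, -40]]))
# [[ 10  20]
#  [-30 -40]]
```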