Request for implementing Keras activation functions

Keras has a number of activation functions in its backend that we might want to implement in Phylanx. Phylanx already supports softmax and tanh, and we can add:
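
For reference, here are plain-NumPy sketches of two Keras activations, softplus and softsign, following their standard definitions (illustrative reference code, not Phylanx primitives):

import numpy as np

def softplus(x):
    # softplus(x) = log(1 + exp(x))
    return np.log1p(np.exp(x))

def softsign(x):
    # softsign(x) = x / (1 + |x|)
    return x / (1.0 + np.abs(x))

x = np.array([-2.0, 0.0, 2.0])
print(softplus(x))  # approx. [0.1269 0.6931 2.1269]
print(softsign(x))  # approx. [-0.6667 0. 0.6667]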

taless474 avatar Mar 11 '19 15:03 taless474

@taless474 would you mind adding links describing the actual math to implement for the primitives you listed?

hkaiser avatar Mar 11 '19 16:03 hkaiser

@hkaiser, I added the link to their description in the Keras documentation which contains their numpy implementation. Do you need the actual math behind them from scholar articles, too?

taless474 avatar Mar 11 '19 17:03 taless474

@taless474 those links are fine, thank you!

hkaiser avatar Mar 11 '19 17:03 hkaiser

To support Keras, we can also add:

taless474 avatar Mar 12 '19 15:03 taless474

Keras slice has the same functionality as TensorFlow slice. Given an n-dimensional input array, start and size are vectors of length n.

To slice a matrix, we first locate the element whose indices are given by start (in the following example start is [0,2], i.e. the element in the 0th row and 2nd column, which is 3) and use it as the top-left element of the output submatrix; we then slice a submatrix of the given size (in the following example the size is [2,1], so the output is 2x1).

m = tf.constant([[1,2,3],[4,5,6]])
s = tf.slice(m,[0,2],[2,1])
k.get_value(s) # [[3], [6]]

Slicing a 2x1 submatrix starting at the element in the 0th row and 2nd column, we get [[3], [6]]. The following is another example, with a 2x1x3 tensor. The element at position (1,0,0) is 4, and we get a 1x1x3 tensor by slicing from it.

t = tf.constant([[[1,2,3]],[[4,5,6]]])
s = tf.slice(t,[1,0,0],[1,1,3])
k.get_value(s) # [[[4, 5, 6]]]

taless474 avatar Mar 13 '19 22:03 taless474

@taless474 thank you for your clarifications. IIUC, this is equivalent to 'normal' Python/NumPy slicing:

m = tf.constant([[1,2,3],[4,5,6]])
s = tf.slice(m,[0,2],[2,1])
# same as
m = np.array([[1,2,3],[4,5,6]])
s = m[0:2,2:3]

Would you see a way to map that (as this is supported by PhySL - at least partially)?

hkaiser avatar Mar 14 '19 00:03 hkaiser

@taless474 the same is true for:

t = tf.constant([[[1,2,3]],[[4,5,6]]])
s = tf.slice(t,[1,0,0],[1,1,3])

which is equivalent to

t = np.array([[[1,2,3]],[[4,5,6]]])
s = t[1:2,0:1,0:3]    # syntactic sugar for:  s = t[(slice(1,2),slice(0,1),slice(0,3))]

hkaiser avatar Mar 14 '19 00:03 hkaiser

To dynamically build this up we could do:

import numpy as np

def _slice(x, start, size):
    indices = [slice(i, i+j) for i, j in zip(start, size)]
    return x[tuple(indices)]

t = np.array([[[1,2,3]],[[4,5,6]]])
print(_slice(t, [1,0,0], [1,1,3]))  # prints '[[[4 5 6]]]'

hkaiser avatar Mar 14 '19 01:03 hkaiser

The Keras backend has two functions for shape. shape returns the symbolic shape of a tensor or variable. TensorFlow returns something like <tf.Tensor 'Shape_8:0' shape=(2,) dtype=int32>, which includes the name ('Shape_8:0'), the shape (before eval this only reflects the ndim, shape=(2,); after applying eval it is the actual shape), and the dtype. We should decide how we want to represent this in the Keras backend.

# TensorFlow example
>>> from keras import backend as K
>>> tf_session = K.get_session()
>>> val = np.array([[1, 2], [3, 4]])
>>> kvar = K.variable(value=val)
>>> inputs = keras.backend.placeholder(shape=(2, 4, 5))
>>> K.shape(kvar)
<tf.Tensor 'Shape_8:0' shape=(2,) dtype=int32>
>>> K.shape(inputs)
<tf.Tensor 'Shape_9:0' shape=(3,) dtype=int32>
# To get the integer shape, eval the tensor (or use K.int_shape(x) instead)
>>> K.shape(kvar).eval(session=tf_session)
array([2, 2], dtype=int32)
>>> K.shape(inputs).eval(session=tf_session)
array([2, 4, 5], dtype=int32)

print_tensor has similar functionality, but does not need eval:

>>> K.print_tensor(kvar)
<tf.Tensor 'Print_3:0' shape=(2, 2) dtype=float32>
>>> K.print_tensor(inputs)
<tf.Tensor 'Print_4:0' shape=(2, 4, 5) dtype=float32>
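
In an eager, NumPy-backed backend both of these collapse to reading the array's shape; a minimal sketch (the function names mirror the Keras backend API, the bodies are assumptions):

import numpy as np

def shape(x):
    # eager analogue of K.shape: the shape as an integer array
    return np.array(np.shape(x), dtype=np.int32)

def int_shape(x):
    # eager analogue of K.int_shape: the shape as a tuple of ints
    return tuple(np.shape(x))

a = np.zeros((2, 4, 5))
print(shape(a))      # [2 4 5]
print(int_shape(a))  # (2, 4, 5)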

taless474 avatar Mar 14 '19 19:03 taless474

I'm working on elu on my side; you can check the progress I've made so far on my fork. Most of the implementation is done, and I now need to test it.
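
For testing, a plain-NumPy reference (assuming the Keras default alpha=1.0) could serve as a baseline:

import numpy as np

def elu_ref(x, alpha=1.0):
    # elu(x) = x for x > 0, alpha * (exp(x) - 1) otherwise
    return np.where(x > 0, x, alpha * np.expm1(x))

print(elu_ref(np.array([-1.0, 0.0, 2.0])))  # approx. [-0.6321 0. 2.]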

JPenuchot avatar Mar 18 '19 20:03 JPenuchot

To enlarge images, Keras uses resize_images. Enlargement can be along the height, the width, or both, and is determined by height_factor and width_factor. data_format tells us which convention is used to store images (whether the channels come first or last). There are two methods of filling in between points, nearest and bilinear. nearest copies the nearest data point's value (so it has the same functionality as np.repeat); bilinear generates new values by interpolating between two data points (similar to np.linspace); a nearest sketch via np.repeat follows the example below. In the following example, we enlarge the image with a height_factor of 2 and a width_factor of 2:

x = tf.constant([[[[1,  2,  3,  4],
                   [5,  6,  7,  8],
                   [9, 10, 11, 12]],
                  [[-1, -2, -3, -4],
                   [-5, -6, -7, -8],
                   [-9,-10,-11,-12]]]])
print(K.int_shape(x))
data_format = "channels_first"
height_factor, width_factor = 2, 2
interpolation = "bilinear"
a = K.resize_images(x, height_factor, width_factor, data_format, interpolation)
print(K.int_shape(a))
K.get_value(a)
(1, 2, 3, 4)
(1, 2, 6, 8)
array([[[[  1. ,   1.5,   2. ,   2.5,   3. ,   3.5,   4. ,   4. ],
         [  3. ,   3.5,   4. ,   4.5,   5. ,   5.5,   6. ,   6. ],
         [  5. ,   5.5,   6. ,   6.5,   7. ,   7.5,   8. ,   8. ],
         [  7. ,   7.5,   8. ,   8.5,   9. ,   9.5,  10. ,  10. ],
         [  9. ,   9.5,  10. ,  10.5,  11. ,  11.5,  12. ,  12. ],
         [  9. ,   9.5,  10. ,  10.5,  11. ,  11.5,  12. ,  12. ]],

        [[ -1. ,  -1.5,  -2. ,  -2.5,  -3. ,  -3.5,  -4. ,  -4. ],
         [ -3. ,  -3.5,  -4. ,  -4.5,  -5. ,  -5.5,  -6. ,  -6. ],
         [ -5. ,  -5.5,  -6. ,  -6.5,  -7. ,  -7.5,  -8. ,  -8. ],
         [ -7. ,  -7.5,  -8. ,  -8.5,  -9. ,  -9.5, -10. , -10. ],
         [ -9. ,  -9.5, -10. , -10.5, -11. , -11.5, -12. , -12. ],
         [ -9. ,  -9.5, -10. , -10.5, -11. , -11.5, -12. , -12. ]]]], dtype=float32)
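
For comparison, the nearest mode on the same input can be reproduced with np.repeat (a minimal NumPy sketch, assuming the channels_first layout used above):

import numpy as np

x = np.array([[[[ 1,  2,  3,  4],
                [ 5,  6,  7,  8],
                [ 9, 10, 11, 12]],
               [[-1, -2, -3, -4],
                [-5, -6, -7, -8],
                [-9,-10,-11,-12]]]])

# nearest-neighbor upsampling: repeat rows (axis 2) and columns (axis 3)
a = np.repeat(np.repeat(x, 2, axis=2), 2, axis=3)
print(a.shape)  # (1, 2, 6, 8)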

taless474 avatar Mar 21 '19 17:03 taless474

map_fn, foldl and foldr share two arguments, fn (a callable) and elems. Here is an example of how foldl is used:

import numpy as np
from numpy.testing import assert_allclose
from phylanx import Phylanx, PhylanxSession, execution_tree

PhylanxSession.init(1)

def variable(value, dtype=None, name=None):
    return execution_tree.var(np.array(value, dtype=dtype))

# note: shadows the builtin eval, mirroring the Keras backend API
def eval(func):
    return func.eval()

@Phylanx
def foldl_eager(fn, elems, initializer):
    return fold_left(fn, initializer, elems)

def foldl(fn, elems, initializer=None):
    return foldl_eager.lazy(fn, elems, initializer)

def test_foldl():
    x = np.random.rand(10, 3).astype(np.float32)
    kx = eval(foldl(lambda a, b: a + b, variable(x)))

    assert (3,) == kx.shape
    assert_allclose(x.sum(axis=0), kx, atol=1e-05)

test_foldl()
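
For reference, the same reduction expressed in plain Python; functools.reduce plays the role of fold_left here (an illustrative sketch, not backend code):

import functools
import numpy as np

x = np.random.rand(10, 3).astype(np.float32)
# fold over axis 0: successively add the rows together
ref = functools.reduce(lambda a, b: a + b, list(x))
assert np.allclose(ref, x.sum(axis=0), atol=1e-05)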

taless474 avatar Mar 27 '19 22:03 taless474

@taless474 The problem with this code is that it passes a pure Python function to the Phylanx decorator. We will have to think about how to support this kind of construct. I don't think we will be able to transform this into a PhySL expression at runtime (@rtohid ?).

What we perhaps could do is to allow for pure Python code to be executed from inside Phylanx, i.e. create a primitive that invokes some pre-compiled piece of Python (calls back into the Python interpreter).

hkaiser avatar Mar 27 '19 23:03 hkaiser

Keras switch is basically an element-wise if-conditional. If the condition has fewer dimensions than then and else, the condition will be broadcast (it is reshaped to (oldshape, 1), (oldshape, 1, 1), ... before broadcasting). Check the NumPy implementation here.

from keras import backend as K

x = K.constant([1, -2])
then_expr = K.constant([[10, 20], [30, 40]])
else_expr = K.constant([[-10, -20], [-30, -40]])
a = K.switch(K.greater_equal(x, 0.5), then_expr, else_expr)
K.get_value(a)

results in:

array([[ 10.,  20.],
       [-30., -40.]], dtype=float32)
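
These semantics can be sketched with NumPy (switch_np is a hypothetical helper; the trailing singleton axes mirror the reshaping rule described above):

import numpy as np

def switch_np(cond, then_expr, else_expr):
    # pad the condition with trailing singleton axes until its ndim
    # matches the branches, then select element-wise via broadcasting
    cond = np.asarray(cond, dtype=bool)
    while cond.ndim < np.ndim(then_expr):
        cond = np.expand_dims(cond, -1)
    return np.where(cond, then_expr, else_expr)

x = np.array([1, -2])
print(switch_np(x >= 0.5, [[10, 20], [30, 40]], [[-10, -20], [-30, -40]]))
# [[ 10  20]
#  [-30 -40]]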

taless474 avatar Apr 02 '19 16:04 taless474

We have discussed this and come to the conclusion that switch is very similar to the already existing where primitive, except for the broadcasting rules applied to the condition. Also, switch assumes that the two result arrays have the same shape.

hkaiser avatar Apr 02 '19 21:04 hkaiser