Preprocessing example_lists with keras
Hi, thanks for the great library!
I was trying to build a Keras model using the example in https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/keras/keras_dnn_tfrecord.py.
But I'm stuck at a preprocessing step.
Let's say I have this feature spec for an example feature:
"FEAT_1": tf.io.FixedLenFeature(shape=(), dtype=tf.int64)
Then the corresponding InputLayer in Keras will have shape (None, None). This column contains integer data that I want to transform into a one-hot tensor of depth 10, so that the output shape is (None, None, 10).
My preprocessing dict looks like:
{"FEAT_1": tf.keras.layers.CategoryEncoding(num_tokens=num_tokens, output_mode="one_hot")}
However, when I try to build my model, it looks like CategoryEncoding only accepts inputs of rank 1 (see https://github.com/keras-team/keras/blob/master/keras/layers/preprocessing/preprocessing_utils.py#L114), so it complains that my input tensor has shape (None, None).
I've tried unstacking my input, or reshaping it into a tensor of rank 1, but since the second dimension (the 'list_size' dimension) is variable, I cannot rebuild the tensor correctly. When I try to unstack, I get errors like:
Cannot infer argument 'num' from shape (None, None)
My attempt looked something like this:
# t has shape (None, None); unstacking along axis 1 needs a known list_size
t_list = tf.unstack(t, axis=1)
onehots = []
for t_example in t_list:
    onehot = tf.keras.layers.CategoryEncoding(
        num_tokens=num_tokens,
        output_mode="one_hot",
    )(t_example)
    onehots.append(onehot)
# Re-assemble the per-example encodings along the list_size axis
result = tf.stack(onehots, axis=1)
Is there a solution to this? (Specifying a fixed list_size would probably work, but the whole point of tensorflow-ranking is to be able to use variable-size example_lists conveniently.) Should I somehow refer to the mask, with gather_nd?
I am facing the same problem you describe. Have you found a solution to this?
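One possible workaround, offered as a sketch rather than a confirmed fix: tf.one_hot works on index tensors of any rank and appends the depth as a new last axis, so it does not need list_size to be known. The snippet below wraps it in a Lambda layer and assumes the preprocessing dict accepts any callable per feature. The names NUM_TOKENS and one_hot_per_example, and the use of -1 as the padding value, are assumptions for illustration and are not taken from the example script.

```python
import tensorflow as tf

NUM_TOKENS = 10  # assumed one-hot depth, matching the "size 10" in the question


def one_hot_per_example(ids, depth=NUM_TOKENS):
    """One-hot encode integer ids of any rank: output shape is ids.shape + (depth,)."""
    # Cast so the function also tolerates ids arriving with another integer dtype.
    return tf.one_hot(tf.cast(ids, tf.int64), depth=depth, dtype=tf.float32)


# Hypothetical entry for the preprocessing dict, replacing CategoryEncoding.
preprocess_spec = {"FEAT_1": tf.keras.layers.Lambda(one_hot_per_example)}

# Two lists padded to list_size 3; -1 is an assumed padding value.
batch = tf.constant([[1, 3, -1], [0, 9, 2]], dtype=tf.int64)
encoded = preprocess_spec["FEAT_1"](batch)

# A batch with a different list_size goes through the same layer unchanged.
other = tf.constant([[4, 4, 4, 4, 4]], dtype=tf.int64)
encoded_other = preprocess_spec["FEAT_1"](other)

print(encoded.shape, encoded_other.shape)
```

Indices outside [0, depth), such as the assumed -1 padding, come out as all-zero rows, so padded positions carry no category signal. The mask would still be needed wherever padded examples must be excluded from the loss or metrics.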