[Question] How to create embeddings from one-hot-encoded features?

Open almirb opened this issue 4 years ago • 1 comments

Hello!

In my model I have user_id, user_gender, and user_age features (for the query) and a movie_id feature (for the candidate).

  1. How should I handle a feature with only a few possible values, like user_gender (M, F, -)? What would be the best way to create an embedding for it?
  2. Would a StringLookup with a 3-item vocabulary followed by an embedding of dimension 32 be too much?
  3. How do I include a one-hot encoding of this feature when creating the embedding?

Thanks!

almirb, Dec 22 '21 13:12

I don't think it is a good idea to use an embedding for a low-cardinality feature like this. Just build a one-hot vector with CategoryEncoding:

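# gender_vocab is an assumed name for the list of known values, e.g. ["M", "F", "-"]
self.make_categorical_gender = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=gender_vocab, mask_token=None),
    # StringLookup reserves index 0 for out-of-vocabulary values by default,
    # so the one-hot width is len(gender_vocab) + 1
    tf.keras.layers.CategoryEncoding(
        num_tokens=len(gender_vocab) + 1, output_mode='one_hot', dtype='float32'),
])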
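
To make the output concrete, here is a plain-Python sketch (deliberately not Keras) of what a string lookup followed by one-hot encoding produces. It assumes one reserved out-of-vocabulary slot at index 0, which is my understanding of the StringLookup default when mask_token=None; GENDER_VOCAB and the helper names are hypothetical.

```python
# Plain-Python illustration of a string lookup followed by one-hot encoding.
# GENDER_VOCAB and the helper names are hypothetical, chosen for this example.
GENDER_VOCAB = ["M", "F", "-"]

# Index 0 is reserved for out-of-vocabulary (OOV) values; known values start at 1.
TOKEN_TO_INDEX = {token: i + 1 for i, token in enumerate(GENDER_VOCAB)}
NUM_TOKENS = len(GENDER_VOCAB) + 1  # 3 known values + 1 OOV slot


def lookup(value):
    """Map a raw string to its integer index, falling back to the OOV index 0."""
    return TOKEN_TO_INDEX.get(value, 0)


def one_hot(index, depth=NUM_TOKENS):
    """Return a float vector of length `depth` with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def encode_gender(value):
    """Chain the two steps: string -> index -> one-hot vector."""
    return one_hot(lookup(value))


print(encode_gender("F"))        # [0.0, 0.0, 1.0, 0.0]
print(encode_gender("unknown"))  # [1.0, 0.0, 0.0, 0.0]
```

So a 3-item vocabulary yields a 4-wide vector per user, and any value missing from the vocabulary falls into the first slot.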

Then concatenate that vector with the user embedding vector in the UserModel call method:

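def call(self, inputs):
    user_vector = self.user_embedding(inputs["user_id"])
    if self.use_context:
        # Append the one-hot gender vector to the user embedding
        return tf.concat([
            user_vector,
            self.make_categorical_gender(inputs["user_gender"]),
        ], axis=1)
    # Without context features, return the user embedding alone
    return user_vector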
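
As a rough plain-Python sketch of why the one-hot route is usually enough here: concatenation just widens the user vector by the one-hot width, and an embedding lookup is the same thing as multiplying a one-hot vector by the embedding table. The sizes and names below (USER_DIM, NUM_GENDERS, the random tables) are assumptions for illustration, not values from the model above.

```python
import random

random.seed(0)

USER_DIM = 4      # assumed width of the user_id embedding (kept small for the demo)
NUM_GENDERS = 4   # 3 known values plus 1 out-of-vocabulary slot
GENDER_DIM = 32   # the embedding size asked about in the question


def one_hot(index, depth):
    """Return a float vector of length `depth` with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def concat_features(user_vector, gender_index):
    """Append the one-hot gender vector to one user's embedding vector."""
    return user_vector + one_hot(gender_index, NUM_GENDERS)


def embed_via_one_hot(index, table):
    """Multiply a one-hot row vector by an embedding table (rows = tokens)."""
    row = one_hot(index, len(table))
    dim = len(table[0])
    return [sum(row[i] * table[i][j] for i in range(len(table))) for j in range(dim)]


user_vector = [random.uniform(-1, 1) for _ in range(USER_DIM)]
features = concat_features(user_vector, gender_index=2)
print(len(features))  # 8: USER_DIM + NUM_GENDERS

# A (hypothetical) 32-dimensional gender embedding table with random weights.
gender_table = [[random.uniform(-1, 1) for _ in range(GENDER_DIM)]
                for _ in range(NUM_GENDERS)]

# One-hot times table selects exactly one row, i.e. it is the embedding lookup.
print(embed_via_one_hot(2, gender_table) == gender_table[2])  # True

# Weights the embedding would add, for only NUM_GENDERS distinct output vectors.
print(NUM_GENDERS * GENDER_DIM)  # 128
```

In other words, a 32-dimensional embedding over four indices adds 128 trainable weights but can only ever emit four distinct vectors, and any dense layer placed after the concatenated one-hot vector can learn an equivalent linear map.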

You can find more information here

AzizIlyosov, Dec 23 '21 00:12