index_from_dataset returns indices rather than movie names
Hello
In my code I added another feature to the candidate tower in addition to the movie title: for each movie I have a vector representation that is precalculated by another algorithm, and I feed it directly into the candidate tower.
interactions_dict = ...  # a dictionary of feature columns (user_id, movie_title, movie_vector, ...)
ratings = tf.data.Dataset.from_tensor_slices(interactions_dict)
movies = ratings.map(lambda x: {
    'movie_title': x['movie_title'],
    'movie_vector': x['movie_vector'],
})
index = tfrs.layers.factorized_top_k.BruteForce(model.query_model, k=CANDIDATE_POOL_SIZE)
index.index_from_dataset(movies.batch(100).map(lambda x: model.candidate_model(x)))
query_dict = {'user_id': tf.constant([user]),
              'user_vector': np.stack([user_vector])}
Any idea why, after running _, titles = index(query_dict), titles contains indices rather than the actual movie names?
Here is the call method in my candidate tower:
def call(self, titles):
    # Concatenate the learned title embedding with the precomputed movie vector.
    return tf.concat([
        self.title_embedding(titles["movie_title"]),
        titles["movie_vector"],
    ], axis=1)
Hi, @abdollahpouri.
For index_from_dataset to return the movie name instead of the index, you need to pass it a dataset of (candidate identifier, candidate embedding) tuples.
https://www.tensorflow.org/recommenders/api_docs/python/tfrs/layers/factorized_top_k/BruteForce
It should look like this:
index.index_from_dataset(
    movies.batch(100).map(lambda x: (x['movie_title'], model.candidate_model(x)))
)
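With the index rebuilt this way, querying it again should yield movie title strings instead of row indices. A minimal sketch, reusing the query_dict defined earlier in this thread:

# The second return value now holds movie title strings rather than indices.
_, titles = index(query_dict)
print(titles[0, :5])  # top-5 recommended movie titles for this query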
Hi, @abdollahpouri,
How can you add the movie vector to the dataset? I would like to add a vector to the user model as well. Please advise, thank you.
@zhifeng-huang You can feed it straight into your model as a feature. If the movie vector is already a pre-calculated feature that can be fed into the model as is, you can do the following:
class QueryModel(tf.keras.Model):

    def __init__(self, layer_sizes):
        super().__init__()
        # Learned embedding for the movie id.
        self.movie_embedding = tf.keras.Sequential([
            tf.keras.layers.StringLookup(
                vocabulary=unique_movie_ids, mask_token=None),
            tf.keras.layers.Embedding(len(unique_movie_ids) + 1, 32),
        ])

    def call(self, inputs):
        # Concatenate the precomputed movie vector with the learned id embedding.
        return tf.concat([
            inputs["movie_vector"],
            self.movie_embedding(inputs['movie_id']),
        ], axis=1)
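As for getting the precomputed vectors into the dataset in the first place, they can simply be included as another column when the tf.data.Dataset is built. A minimal sketch, assuming the vectors are available as a list or array aligned row-for-row with the other features (user_ids, movie_titles and movie_vectors are placeholder names):

import numpy as np
import tensorflow as tf

interactions_dict = {
    'user_id': user_ids,            # shape (num_rows,)
    'movie_title': movie_titles,    # shape (num_rows,)
    'movie_vector': np.stack(movie_vectors).astype('float32'),  # shape (num_rows, vector_dim)
}
ratings = tf.data.Dataset.from_tensor_slices(interactions_dict)

The same pattern works on the query side: add a user_vector column to the dataset and read it in the query model's call method, just like movie_vector above.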