EasyRec
How to output topk items to recommend from DSSM model
Can you please provide an example config on how to output the top-k (say, k = 5) items after training the DSSM model? I am unable to find this information in the documentation.
You need to use a vector recall engine such as Faiss: use the user embedding to query the item embeddings, then take the top-k items.
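A minimal sketch of that retrieval step with Faiss (the embedding dimension and the arrays below are placeholders; in practice `item_embs` comes from the item tower and `user_emb` from the user tower):

```python
import faiss
import numpy as np

dim = 32  # placeholder embedding dimension; must match your DSSM towers
# Stand-ins for the real embeddings: item_embs is (num_items, dim),
# user_emb is (1, dim); Faiss expects float32.
item_embs = np.random.rand(10000, dim).astype('float32')
user_emb = np.random.rand(1, dim).astype('float32')

# DSSM is usually scored by inner product (or cosine after L2-normalization),
# so an inner-product index is a natural choice.
index = faiss.IndexFlatIP(dim)
index.add(item_embs)

scores, topk_ids = index.search(user_emb, 5)  # top-5 item row ids and their scores
print(topk_ids[0], scores[0])
```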
Thanks for your quick answer, much appreciated @poson! Faiss requires the full set of vectors; can you please let me know how to get these from EasyRec?
You can get all the vectors via offline prediction (https://easyrec.readthedocs.io/en/latest/predict/MaxCompute%20%E7%A6%BB%E7%BA%BF%E9%A2%84%E6%B5%8B.html).
First, you should use this script (https://github.com/alibaba/EasyRec/blob/master/easy_rec/python/tools/split_model_pai.py) to split the model into two parts: the user part and the item part.
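In case it helps, here is a small sketch (TF 1.x style, via `tf.compat.v1`) for inspecting one of the two exported saved_models so you can see its input/output signatures before serving. The export path is a placeholder; you can also get the same information with `saved_model_cli show --dir <export_dir> --all`.

```python
import tensorflow as tf
if tf.__version__ >= '2.0':
  tf = tf.compat.v1

export_dir = 'export/user_model'  # placeholder path to the split user-part saved_model

with tf.Session(graph=tf.Graph()) as sess:
  # Load the saved_model with the standard 'serve' tag and print its
  # serving_default signature, i.e. the feature inputs and embedding output.
  meta_graph = tf.saved_model.loader.load(sess, ['serve'], export_dir)
  signature = meta_graph.signature_def['serving_default']
  print('inputs:', list(signature.inputs.keys()))
  print('outputs:', list(signature.outputs.keys()))
```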
Thanks for the help. But I get an error:
File "/home/shiv/Documents/DataScience/EasyRec/easy_rec/python/tools/split_model_pai.py", line 16, in
And when I force it to use TensorFlow v1 by adding the following two lines:
if tf.__version__ >= '2.0':
    tf = tf.compat.v1
I still don't get very far; note that I am using dssm_on_taobao.cfg.
Traceback (most recent call last):
File "/home/shiv/Documents/DataScience/EasyRec/easy_rec/python/tools/split_model_pai.py", line 271, in
Hi, I am also confused by this question. Could you please provide documentation about split_model_pai.py and how to deploy the retrieval model with only the user tower and Faiss?
What is your TF version, @ss-github-code?
@kinghuin Documentation is in preparation; here are the steps:
- split_model_pai.py generates two saved_models: one for generating user embeddings and one for generating item embeddings. The generated saved_models can be used for serving (using TensorRT or PAI-EAS).
- After getting the saved_models, you can run offline prediction to get the user embeddings (https://easyrec.readthedocs.io/en/latest/predict/MaxCompute%20%E7%A6%BB%E7%BA%BF%E9%A2%84%E6%B5%8B.html).
- For online prediction, deploy the user model as a PAI-EAS service (https://easyrec.readthedocs.io/en/latest/predict/%E5%9C%A8%E7%BA%BF%E9%A2%84%E6%B5%8B.html).
- Item embeddings are usually predicted offline, similar to step 2, and saved in MaxCompute tables, which can then be imported into Hologres (Proxima) for online KNN retrieval. You can also import the embedding vectors into other KNN engines (such as Faiss) for online retrieval; see the sketch after this list.
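As a rough illustration of the last step when using Faiss instead of Hologres: the sketch below loads item embeddings from a local text dump (one line per item, id plus a comma-separated embedding string; the file name and format are made up for illustration and should be adapted to however you export the MaxCompute table), builds an inner-product index, and saves it for online retrieval.

```python
import faiss
import numpy as np

# Assumed dump format (one item per line): "<item_id>\t<v1>,<v2>,...,<vD>".
item_ids, vectors = [], []
with open('item_embeddings.txt') as f:
    for line in f:
        item_id, emb_str = line.rstrip('\n').split('\t')
        item_ids.append(int(item_id))
        vectors.append([float(x) for x in emb_str.split(',')])

item_embs = np.asarray(vectors, dtype='float32')

# Inner-product index; IndexIDMap maps Faiss rows back to the real item ids.
index = faiss.IndexIDMap(faiss.IndexFlatIP(item_embs.shape[1]))
index.add_with_ids(item_embs, np.asarray(item_ids, dtype='int64'))

faiss.write_index(index, 'item_embs.faiss')  # reload later with faiss.read_index
```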
Sorry for the delay. My TensorFlow version is 2.8.
The script does not work under TF 2.x yet; we are fixing it. The reason is that the variables created under TF 2.x are different from those created under TF 1.x.
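For what it's worth, you can see the naming difference yourself by listing the variables stored in a checkpoint; `tf.train.list_variables` works under both TF 1.x and 2.x (the checkpoint path below is a placeholder):

```python
import tensorflow as tf

# Placeholder path; point this at your EasyRec model_dir or checkpoint prefix.
ckpt = 'experiments/dssm_taobao_ckpt'

# Prints (variable_name, shape) pairs. Checkpoints written under TF 2.x use a
# different variable naming/structure than those written under TF 1.x, which is
# why split_model_pai.py currently fails on them.
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)
```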
Could you provide a TF2 version of this split script?