
[Question] Custom Loss Functions and Storing embeddings in CPU

Open goru001 opened this issue 3 years ago • 2 comments

Thanks for doing great work with Nvidia Merlin. I had two questions as I wasn't able to find relevant pointers in docs.

  1. How should one go about using, say, BPR or other pairwise loss functions? From the docs it seems only cross entropy is supported?
  2. What if we can't afford multiple GPUs to store embeddings and need to offload the embeddings to CPU RAM? Is there a way to leverage the Embedding Cache / Parameter Server with a PyTorch model that offloads embeddings to CPU RAM?
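For reference, BPR (Bayesian Personalized Ranking) optimizes the log-sigmoid of the score gap between a positive item and a sampled negative. A minimal PyTorch sketch (not a HugeCTR API; the function name and signature here are illustrative):

```python
import torch
import torch.nn.functional as F

def bpr_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    """BPR pairwise loss: -mean(log sigmoid(s_pos - s_neg)).

    pos_scores / neg_scores: predicted scores for positive and
    sampled negative items, same shape.
    """
    return -F.logsigmoid(pos_scores - neg_scores).mean()

# Example: equal scores give loss log(2); a large positive gap drives it to ~0.
loss = bpr_loss(torch.zeros(4), torch.zeros(4))
```

When the positive item is scored well above the negative, the loss approaches zero; when the scores tie, it sits at log 2 ≈ 0.693, so the gradient pushes the model to widen the gap.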

goru001 avatar May 25 '21 06:05 goru001

Hi @goru001 , thank you for your questions! More losses and layers are definitely on our roadmap. Could you elaborate on the full model you are using together with the pairwise loss function, and which other pairwise loss functions you are interested in? A link to a paper or blog post would be very helpful.

We have an Embedding cache for both inference and training, but the mechanisms behind them are different. For inference, it is just as you said: embeddings are stored in CPU memory and the GPU only caches the hot part. link
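To illustrate the general pattern (this is a generic PyTorch sketch, not HugeCTR's actual cache implementation): the full embedding table lives in CPU RAM, and only the looked-up rows are moved to the GPU for the dense part of the model.

```python
import torch
import torch.nn as nn

class CPUOffloadedModel(nn.Module):
    """Toy sketch: embedding table kept in CPU RAM, dense layers on
    the GPU when one is available. Names here are illustrative."""

    def __init__(self, num_items: int = 1000, dim: int = 16):
        super().__init__()
        # The embedding table stays on the CPU; it is never moved wholesale.
        self.embedding = nn.Embedding(num_items, dim)
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.mlp = nn.Sequential(
            nn.Linear(dim, 8), nn.ReLU(), nn.Linear(8, 1)
        ).to(self.device)

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        rows = self.embedding(item_ids)        # lookup happens on the CPU
        return self.mlp(rows.to(self.device))  # only the looked-up rows cross the bus

model = CPUOffloadedModel()
scores = model(torch.tensor([1, 2, 3]))
```

Because autograd tracks the `.to(self.device)` transfer, gradients flow back to the CPU-resident table, so the embedding can still be trained; a real system like HugeCTR's cache adds hot-row caching on top of this basic split.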

For training, we have model oversubscription: instead of storing embeddings in CPU memory, we put them in storage: notebook

zehuanw avatar May 26 '21 12:05 zehuanw

Thanks @zehuanw for responding. It would be great to see support for pairwise loss functions like BPR (Paper) and hinge loss (Link).

Let me go through the links you've shared for the second part and get back with any questions!

goru001 avatar Jun 02 '21 05:06 goru001