entity embedding for categorical features with large cardinality

Open csetzkorn opened this issue 3 years ago • 3 comments

I think this would be useful. I know how to do this with TF + Keras, but I'm not sure whether that could be used here?

csetzkorn avatar Oct 25 '22 17:10 csetzkorn

Hi @csetzkorn

Thanks for creating the issue. Could you add some links or more details about what this functionality does?

solegalli avatar Oct 26 '22 06:10 solegalli

The technique is mostly used in NLP, where it is known as word embedding. Here are some links:

  • https://gitlab.com/praj88/deepembeddings/-/blob/master/Scripts/deepEmbeddings_Keras.ipynb
  • https://medium.com/@roeibahumi/keras-regression-with-categorical-variable-embeddings-dfc28616e7fe
  • https://mmuratarat.github.io/2019-06-12/embeddings-with-numeric-variables-Keras
  • https://www.youtube.com/watch?v=EATAM3BOD_E&list=RDLVOuNH5kT-aD0&index=4
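For readers who don't want to open the links, here is a rough sketch of the idea: each category gets a small dense vector, and the vectors are learned by gradient descent against the target. This is only an illustration in plain Python, not Keras and not anything that exists in feature_engine. The function names `fit_entity_embeddings` and `transform`, the linear model on top of the embedding, and all hyperparameters are assumptions made up for this example.

```python
import random


def fit_entity_embeddings(categories, targets, dim=2, epochs=200, lr=0.05, seed=0):
    """Learn one dense vector per category by SGD on a squared-error loss.

    Illustrative model (an assumption): y_hat = w . E[category] + b
    """
    rng = random.Random(seed)
    # Map each distinct category to a row of the embedding table.
    index = {c: i for i, c in enumerate(sorted(set(categories)))}
    E = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in index]
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    b = 0.0
    data = [(index[c], y) for c, y in zip(categories, targets)]
    for _ in range(epochs):
        rng.shuffle(data)
        for i, y in data:
            e = E[i]
            y_hat = sum(wj * ej for wj, ej in zip(w, e)) + b
            err = y_hat - y
            # Gradients of 0.5 * err**2 with respect to w[j] and e[j],
            # both computed before either value is updated.
            for j in range(dim):
                grad_w = err * e[j]
                grad_e = err * w[j]
                w[j] -= lr * grad_w
                e[j] -= lr * grad_e
            b -= lr * err
    return index, E, w, b


def transform(categories, index, E):
    """Replace each category with a copy of its learned embedding vector."""
    return [list(E[index[c]]) for c in categories]


if __name__ == "__main__":
    cats = ["a", "b", "c", "a", "b", "c"]
    ys = [1.0, 3.0, 5.0, 1.0, 3.0, 5.0]
    index, E, w, b = fit_entity_embeddings(cats, ys)
    print(transform(["a", "c"], index, E))
```

With large cardinality the appeal is that the encoded width is `dim` columns rather than one column per category, as one-hot encoding would give. A real implementation would presumably train the embedding layer inside a neural network, as the Keras notebooks above do.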

csetzkorn avatar Oct 26 '22 07:10 csetzkorn

Thank you @csetzkorn !

solegalli avatar Oct 26 '22 08:10 solegalli