Denis Vorotyntsev
Hello. I made a benchmark of categorical encoders for several datasets and I found that double validation (i.e. KFold within train data) is a must for target-based encoders. I provided...
Thank you for the feedback. The repo is still somewhat raw and I will add more info about versions and experiment settings later, but the results can already be used. About...
>It would also be interesting to explain why a single internal cross-validation improves AUC of BackwardDifferenceEncoder (surprising) when it does not improve AUC of HelmertEncoder and SumEncoder (not surprising). It...
> Empty braces

Corrected.

> What is n? Say in the article. And include reference.

It was introduced in the beginning of the section.

> Shouldn't the frequencies in FrequencyEncoder be...
You may use similarity encoding for categories with multiple values: [paper](https://arxiv.org/abs/1806.00979), [code](https://github.com/dirty-cat/dirty_cat)
An embedding is a lookup table that maps indices to tensors. It's commonly used in tabular tasks with categorical values. The torch and tf implementations are [torch](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html) and [tf](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding). Embedding tensors...
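The lookup-table view can be sketched in a few lines of plain Python. This is a toy stand-in, not the torch or tf implementation: the class name `EmbeddingTable`, the `vocab` mapping, and the Gaussian initialization are assumptions for illustration. In the linked `torch.nn.Embedding` and `tf.keras.layers.Embedding` layers, the table is a trainable weight matrix.

```python
import random


class EmbeddingTable:
    """Toy lookup table: row i holds the vector for category index i."""

    def __init__(self, num_embeddings, embedding_dim, seed=0):
        rng = random.Random(seed)
        # One randomly initialized vector per category index.
        self.weight = [
            [rng.gauss(0.0, 1.0) for _ in range(embedding_dim)]
            for _ in range(num_embeddings)
        ]

    def __call__(self, indices):
        # The "forward pass" is plain row selection by index.
        return [self.weight[i] for i in indices]


# Hypothetical categorical column: map each category to an integer index first.
vocab = {"red": 0, "green": 1, "blue": 2}
table = EmbeddingTable(num_embeddings=len(vocab), embedding_dim=4)

batch = ["blue", "red", "blue"]
vectors = table([vocab[c] for c in batch])
```

Both occurrences of `"blue"` select the same row, which is why updating that row during training changes the representation of every sample that has this category.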
Yes, I think that's the correct understanding. I don't know how to fix it in pytorch, but I'd first look at how it's implemented in the Adam optimizer (https://pytorch.org/docs/stable/_modules/torch/optim/adam.html#Adam). Torch Adam and...
+1 There are several models in https://github.com/robvanvolt/DALLE-models Some examples of how to fine-tune and use them would be much appreciated.