TranSirius

Results 2 comments of TranSirius

It seems that one hot input is not that appropriate. You can view each dimension as a channel and when you use one hot vector as the input feature, it...

Without Attention can be viewed as a average aggregation scheme. You may rewrite the code and substitude the attentiton scheme based aggregation matrix with an average aggregation matrix or a...