longlfuture

Results: 1 issue from longlfuture

Thanks for your work! Since you use a kernel to fit the softmax distribution, why did you choose MSE as the loss function instead of a KL loss? Can you give an explanation...
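For anyone comparing the two objectives mentioned in the question, here is a minimal sketch of both losses on a toy distribution. The function names and the example scores are hypothetical and not taken from the project; `approx` only stands in for whatever a kernel-based approximation of softmax would output.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mse_loss(target, approx):
    # Mean squared error between two probability vectors.
    return sum((t - a) ** 2 for t, a in zip(target, approx)) / len(target)

def kl_loss(target, approx, eps=1e-12):
    # KL(target || approx); eps guards against log(0) and division by zero.
    return sum(
        t * math.log((t + eps) / (a + eps))
        for t, a in zip(target, approx)
        if t > 0
    )

# Assumption: made-up scores, with `approx` playing the role of the
# kernel-based estimate of the softmax distribution.
target = softmax([2.0, 1.0, 0.1])
approx = softmax([1.8, 1.1, 0.3])

print("MSE:", mse_loss(target, approx))
print("KL: ", kl_loss(target, approx))
```

Both losses are zero when the two distributions match. MSE weights every absolute error equally, while KL weights the log-ratio by the target probability, so the two can rank the same approximation differently.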