Aria Ghora Prabono

2 comments by Aria Ghora Prabono

@JulienSiems I got the same issue as @Ryul0rd. I think this is due to PyTorch's implementation of `Linear`, which stores its weights in transposed form: ```python def __init__(self, ...)...
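To make the layout point concrete, here is a minimal, dependency-free sketch of the convention documented for `torch.nn.Linear`: the weight is stored with shape `(out_features, in_features)` and the forward pass computes `x @ W^T + b`. The class `TinyLinear` and its `weight_shape` helper are hypothetical stand-ins written for illustration; they are not PyTorch code and not the truncated snippet from the comment above.

```python
# Dependency-free sketch of the weight layout documented for torch.nn.Linear:
# weight has shape (out_features, in_features), forward computes x @ W^T + b.
# TinyLinear is a hypothetical stand-in, not PyTorch's implementation.


class TinyLinear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features
        # One row per output unit, one column per input feature.
        self.weight = [[0.0] * in_features for _ in range(out_features)]
        self.bias = [0.0] * out_features

    def weight_shape(self):
        # (rows, columns) of the stored weight matrix.
        return (len(self.weight), len(self.weight[0]))

    def forward(self, x):
        # y[j] = sum_i x[i] * weight[j][i] + bias[j], i.e. y = x @ W^T + b
        return [
            sum(x_i * w_ji for x_i, w_ji in zip(x, row)) + b
            for row, b in zip(self.weight, self.bias)
        ]


layer = TinyLinear(in_features=3, out_features=2)
layer.weight = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
layer.bias = [0.5, 0.0]
y = layer.forward([1.0, 2.0, 3.0])
```

Code that assumes an `(in_features, out_features)` layout would read the rows and columns the wrong way round, which may be the kind of mismatch described in the comment.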

Hi, you're correct, and that is expected. That is the part I could not really replicate because of incomplete details in the original paper. In fact, I used different...