Travis J

Results: 1 issue by Travis J

GELU has an optional algorithm that internally uses a tanh approximation. See more here: https://pytorch.org/docs/stable/generated/torch.nn.GELU.html#torch.nn.GELU

I was expecting this to just work:

var gelu = nn.GELU(approximate: "tanh");

When the approximate argument...
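
For reference, here is a rough sketch of what the tanh option computes, following the formula in the PyTorch documentation linked above. It is plain Python using only the standard library, not TorchSharp code, and the function names `gelu_exact` and `gelu_tanh` are made up for this illustration.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU, as given in the PyTorch docs."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Compare the two variants on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 0.5, 1.0, 3.0):
        exact = gelu_exact(x)
        approx = gelu_tanh(x)
        print(f"x={x:+.1f}  exact={exact:+.6f}  tanh={approx:+.6f}  "
              f"diff={abs(exact - approx):.2e}")
```

The two variants agree closely but not exactly, so whether a binding accepts the approximate argument matters when trying to reproduce results from a model trained with the tanh form.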