Questions regarding `softmax`. I was coding the [cross_entropy](https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html) examples to make sure the typing is correct. In the second example we need the `softmax` function in the link below. Looking...
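For reference, a minimal sketch of that second example (targets given as class probabilities) as I would expect it to look in Scala; this assumes Storch exposes `torch.randn`, `torch.softmax`, and `torch.nn.functional.crossEntropy` with signatures mirroring the Python API, so the exact names may differ:
```scala
// Sketch of the docs' second cross_entropy example: the target is a probability
// distribution over classes, produced here by applying softmax to random values.
val input  = torch.randn(Seq(3, 5), requiresGrad = true)
val target = torch.softmax(torch.randn(Seq(3, 5)), dim = 1)
val loss   = torch.nn.functional.crossEntropy(input, target)
```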
> That's fine but could you give the following variant a try? It's a solution we already use in other places and avoids both implicits and multiple parameter lists (at...
While trying to replicate the Colaboratory notebook to check that the code is working, I tried to do the following:
```scala
// We want x[b,t] = mean_{i<=t} x[b,i]
...
```
Found some compiler weirdness with the changes above. These do not compile:
```scala
xbow(Seq(b,t)) = torch.mean(input=xprev, dim=0)
xbow(Seq(b,t)) = torch.mean(xprev, dim=0)
```
The error is:
```shell
method mean in trait ReductionOps: ...
```
@sbrunk Changes work fine. Thanks.
I need to use [Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html?highlight=dropout#torch.nn.Dropout). In Python, `nn.Dropout` is a module class: constructing it returns a module that can then be applied to a `Tensor`. I see that...
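For what it's worth, here is a minimal sketch of how I would expect this to look, assuming Storch has (or gets) an `nn.Dropout` module with a probability parameter `p` mirroring the Python class; the names and signature are assumptions on my part:
```scala
// Construct the dropout module once, then apply it to tensors like a function.
val dropout = torch.nn.Dropout(p = 0.2)
val x       = torch.randn(Seq(4, 8))
val y       = dropout(x) // during training, elements are zeroed with probability p
```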
I would like to use [register_buffer](https://pytorch.org/docs/stable/generated/torch.jit.ScriptModule.html#torch.jit.ScriptModule.register_buffer). According to the Python API doc, we must pass in a name. Looking at `org.bytedeco.pytorch.Module`, we have:
```java
public Tensor register_buffer(BytePointer name, Tensor...
```
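In case it helps the discussion, a sketch of what calling the JavaCPP binding directly might look like; I am assuming the Storch module exposes its underlying native module (called `nativeModule` here), that the wrapped native tensor is reachable via `.native`, and that `torch.tril`/`torch.ones` are available, so treat all of these names as placeholders:
```scala
import org.bytedeco.javacpp.BytePointer

// Register a lower-triangular mask as a (non-trainable) buffer under the name "tril".
val blockSize = 8
val tril      = torch.tril(torch.ones(Seq(blockSize, blockSize)))
nativeModule.register_buffer(new BytePointer("tril"), tril.native)
```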
While trying to implement and debug the multi-head attention mechanism, I am seeing what seems to be unexpected behavior. For a model with the multi-head "only", the code:
```scala
val nuParams...
```
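As a point of reference, counting parameters might look roughly like this; it is a sketch that assumes Storch modules expose a `parameters` collection whose tensors have a `numel` method, and the `nn.Linear` stand-in model is only illustrative:
```scala
// Count all parameter elements of a model.
val model    = torch.nn.Linear(4, 4)
val nuParams = model.parameters.map(_.numel).sum
println(s"number of parameters: $nuParams")
```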
@sbrunk I have confirmed that I need to register the inner modules. As for the macro, maybe a single function that traverses the sub-modules and registers them would do. But...
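Until there is a macro or an automatic traversal, the explicit registration I have in mind looks roughly like the sketch below; it assumes a `register` helper on Storch modules and uses illustrative module/field names (`Head`, `key`, `query`, `value`), not the actual code from this branch:
```scala
// Explicitly register each inner module so its parameters show up in the parent's count.
class Head(nEmbd: Int, headSize: Int) extends torch.nn.Module:
  val key   = register(torch.nn.Linear(nEmbd, headSize))
  val query = register(torch.nn.Linear(nEmbd, headSize))
  val value = register(torch.nn.Linear(nEmbd, headSize))
```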
I would like to give an update on this *endeavor*. I have gone through most of the video and am now at the start of the "Block" implementation. I have...