Results 26 comments of anhinga

Right :-) On Sat, Jan 8, 2022 at 5:52 PM Brian Chen ***@***.***> wrote: > Just to be clear, this is a question to the maintainers and not an >...

(lesson for me: not to reply via e-mail interface :-) )

@MikeInnes Thanks for all your work! In JAX one can compute gradients with respect to nested dictionaries, a simple example is in the README here: https://github.com/anhinga/jax-pytree-example I wonder how difficult...
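A minimal sketch of the point above: JAX treats nested dictionaries as pytrees, so `jax.grad` differentiates through them leaf by leaf with no extra machinery. The parameter names and toy loss here are illustrative, not from the linked README.

```python
import jax
import jax.numpy as jnp

# parameters stored as a nested dictionary (a JAX pytree)
params = {"layer": {"w": jnp.array(2.0), "b": jnp.array(1.0)}}

def loss(p):
    # toy quadratic loss over the nested-dict parameters
    return (p["layer"]["w"] * 3.0 + p["layer"]["b"]) ** 2

# grads mirrors the structure of params: {"layer": {"w": ..., "b": ...}}
grads = jax.grad(loss)(params)
```

The returned `grads` has exactly the same nested-dict shape as `params`, which is what makes this style convenient.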

@ToucheSir @MikeInnes Thanks! (My mistake was trying to coerce a dictionary into Params.)

(I actually think it might be enough to provide `selectedNode` and `selectedEdge` in addition to `selectedNodeData` and `selectedEdgeData`, just like `tapNode` and `tapEdge` are provided in addition to `tapNodeData` and...

It looks like this is a real effect, not a rendering inversion. It would be interesting to understand the reason for it...

My preliminary conjecture is that the reason might be a difference in regularization. Here the loss function does not seem to include any regularization term: the "forward" function has "return F.log_softmax(x, dim=1)"....

In PyTorch the recommended way to add L2 regularization is not via the loss function, but via the weight decay parameter of an optimizer. So I am going to try to...

Yes, one needs a stronger regularization coefficient (1e-3, not 1e-5), and then it works... I'll be posting further details over the next several days...
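A minimal sketch of the weight-decay approach described above, assuming the stronger coefficient 1e-3; the model and data here are placeholders, not the actual experiment.

```python
import torch

# a toy model; the real experiment uses a network ending in log_softmax
model = torch.nn.Linear(4, 2)

# weight_decay applies the L2 penalty inside the optimizer update,
# instead of adding a regularization term to the loss function
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-3)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()

opt.zero_grad()
loss.backward()
opt.step()  # gradient step plus weight decay, all in one update
```

With `weight_decay` set, every `opt.step()` shrinks the weights toward zero in addition to following the gradient, which is equivalent to L2 regularization for plain SGD.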

I have started to accumulate notes and experimental Jupyter notebooks in a fork here: https://github.com/anhinga/synapses/blob/master/regularization.md