AlphaBetaGamma96

42 comments by AlphaBetaGamma96

@fuzihaofzh Is the Hessian with respect to the inputs of your model or the parameters of your model? There is a trick for the inputs, but none exists (to my...
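One way to get the Hessian with respect to the inputs (not necessarily the trick alluded to above) is `torch.autograd.functional.hessian`; a minimal sketch, where the model, shapes, and scalar output are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical network with a scalar output, so the Hessian w.r.t.
# the input is a well-defined (dim, dim) matrix.
net = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 1))

def scalar_output(x):
    # hessian() expects a function of the differentiated argument only
    return net(x).squeeze()

x = torch.randn(4)
H = torch.autograd.functional.hessian(scalar_output, x)
print(H.shape)  # torch.Size([4, 4])
```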

Hi @fuzihaofzh, I came across this repo (https://github.com/amirgholami/adahessian) and thought you might find it interesting, as it approximates the Hessian via Hutchinson's estimator.
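The core of Hutchinson's estimator is cheap Hessian-vector products against random Rademacher (+/-1) vectors; a minimal sketch of the trace estimate, where the loss and parameter are hypothetical and not AdaHessian's actual code:

```python
import torch

# Hypothetical loss over a single parameter vector; its exact Hessian
# is diagonal, diag(6w + 2), so the trace is easy to check.
w = torch.randn(10, requires_grad=True)
loss = (w ** 3).sum() + (w ** 2).sum()

# First backward pass, keeping the graph for a second differentiation
g, = torch.autograd.grad(loss, w, create_graph=True)

# Hutchinson: trace(H) ~ E[z^T H z] with z having +/-1 entries
n_samples, trace_est = 50, 0.0
for _ in range(n_samples):
    z = 2 * torch.randint(0, 2, w.shape).float() - 1
    # Hessian-vector product via double backward
    hz, = torch.autograd.grad(g, w, grad_outputs=z, retain_graph=True)
    trace_est += torch.dot(z, hz).item()
trace_est /= n_samples

print(trace_est, (6 * w + 2).sum().item())  # estimate vs exact trace
```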

I'm working on some gradient preconditioning techniques that require the forward activations and backward sensitivities (grad_output) of all nn.Module objects in a network. I am also calculating per-sample gradients...
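For context, the hooks-based way to capture both quantities looks roughly like this (the network, names, and storage dicts are illustrative):

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
activations, sensitivities = {}, {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = inputs[0].detach()  # forward activation fed into the layer
    return hook

def save_grad_output(name):
    def hook(module, grad_input, grad_output):
        sensitivities[name] = grad_output[0].detach()  # backward sensitivity
    return hook

for name, mod in net.named_modules():
    if isinstance(mod, nn.Linear):
        mod.register_forward_hook(save_activation(name))
        mod.register_full_backward_hook(save_grad_output(name))

x = torch.randn(16, 4)
net(x).sum().backward()
print({k: v.shape for k, v in activations.items()})
print({k: v.shape for k, v in sensitivities.items()})
```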

@zou3519 Do you know if it's possible to compute the `grad_output` of a layer via `vmap`, or is it only possible via hooks?
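For the per-sample-gradient half of this, `vmap` composes cleanly with `grad` in the `torch.func` API that functorch eventually became; whether `grad_output` itself is reachable without hooks is the open question above. A sketch with a hypothetical linear model:

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

net = nn.Linear(4, 1)
params = {k: v.detach() for k, v in net.named_parameters()}

def loss_fn(params, x, y):
    # x and y here are a single sample; vmap adds the batch dimension
    pred = functional_call(net, params, (x,)).squeeze()
    return (pred - y) ** 2

# Differentiate w.r.t. params (arg 0), then vmap over the batch of (x, y)
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))

x, y = torch.randn(16, 4), torch.randn(16)
grads = per_sample_grads(params, x, y)
print({k: v.shape for k, v in grads.items()})  # leading dim of 16 everywhere
```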

Hi @albanD, thanks for the insight on how to use `register_hook`! Just so I understand this correctly, I can register a hook on a Tensor and use my `forward_pre_hook` and...

Hi @albanD, I had a quick look at `register_hook` but it seems that the signature of that hook only takes the gradient of a Tensor rather than the `grad_output` values...

So, what I need is the gradient of the output of a module (which is what `full_backward_hook` returns, although it doesn't work with functorch atm), but after checking `register_hook` it seems...

That's a neat trick! So this would basically do what `full_backward_hook` does, but for an `nn.Module` object with only one output, and it works with functorch? That kinda looks like what...
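For posterity, a sketch of the trick as described: a forward hook that registers a Tensor hook on the module's output, which reproduces the single-entry `grad_output` of `full_backward_hook` (the module and dict names are illustrative):

```python
import torch
import torch.nn as nn

grad_outputs = {}

def capture_grad_output(name):
    def forward_hook(module, inputs, output):
        # Tensor hook: fires during backward with d(loss)/d(output),
        # i.e. the lone grad_output entry of full_backward_hook
        output.register_hook(lambda g: grad_outputs.__setitem__(name, g.detach()))
    return forward_hook

layer = nn.Linear(4, 3)
layer.register_forward_hook(capture_grad_output("fc"))

x = torch.randn(8, 4)
layer(x).sum().backward()
print(grad_outputs["fc"].shape)  # torch.Size([8, 3])
```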

Hi @samdow, thanks for fixing this issue! A bit of a silly question, but I remember reading somewhere that functorch is being merged directly into PyTorch (if that's the correct...