Integrate previous version of `asdfghjkl` directly into Laplace
First version integrating the old `asdfghjkl` into Laplace directly. This allows us to have it alongside the latest `asdl` and to further integrate existing extensions, such as end-to-end differentiability and support for other loss functions. This version does not change any behavior and only integrates the functions of `asdfghjkl` that are actually used. The only exception is `kernel.py`, which is not yet used but will be worth integrating.
Points to discuss:

- What about documentation for `asdfghjkl_src`?
- Should we first make sure `asdfghjkl` can be the default by enabling regression?
- If `asdfghjkl` becomes the default, what is a sensible way to integrate it? I think we could have `curvature/asd` for the core utilities and merge the interfaces/default backends in `curvature.py` with `asdfghjkl.py`, deprecate the `AsdfghjklXYZ` classes, and just go with `GGN`, `EF`, and `Hessian` for them.
Thanks for the progress!
- Documentation: Since we are going to integrate it into `laplace`'s core, we should follow `laplace`'s code standards, in particular tests, type hinting, and docstrings. I think the last two are enough for documentation.
- Minimum required features: I vote to make it the default since all the current backends have limitations (https://github.com/aleximmer/Laplace/issues/203). To do so, I believe these are the requirements:
  - [ ] Support for regression
  - [ ] Support for backprop through the Jacobian & GLM predictive (for continuous Bayesian optimization)
  - [ ] Support for LoRA + Huggingface models + reward modeling. The keys here are (I think these are automatically supported, at least this is the case for the new ASDL; in any case, we need to provide tests for these):
    - [ ] Support for arbitrary inputs (`x: Union[Tensor, MutableMapping[str, Any]]`)
    - [ ] Support for subset-of-weights (the Hessian & Jacobians are computed only for parameters with `requires_grad == True`)
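As a rough sketch of the arbitrary-inputs requirement above, a backend entry point could dispatch on the input type and unpack dict-like batches as keyword arguments the way Huggingface models expect. The helper name and the unpacking convention are assumptions for illustration, not the actual `laplace` API:

```python
from collections.abc import MutableMapping
from typing import Any, Union

# Stand-in for torch.Tensor so the sketch stays self-contained.
Tensor = Any


def forward_with_inputs(model, x: Union[Tensor, MutableMapping[str, Any]]):
    """Hypothetical helper: call `model` on either a plain tensor or a
    dict-like batch (e.g. a Huggingface tokenizer output)."""
    if isinstance(x, MutableMapping):
        # Dict-like inputs (input_ids, attention_mask, ...) are unpacked
        # as keyword arguments, as Huggingface models expect.
        return model(**x)
    # Plain tensor inputs are passed positionally.
    return model(x)
```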
- Integration: I agree with you. We can put the forked `asdfghjkl` code in something like `laplace.curvature.default_backend` and use it directly in `laplace.curvature.CurvatureInterface` to provide default implementations. So, if another backend doesn't implement some functionality, it will be covered by this default backend. Note that currently this default functionality is provided by `torch.func`, which is not efficient.
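The fallback idea could be structured roughly like this (a pure-Python sketch of the dispatch pattern only; the class and method names are made up for illustration and do not match the actual `laplace` interfaces):

```python
class DefaultBackend:
    """Stands in for the forked asdfghjkl code: assumed to provide a
    generic implementation of every curvature quantity."""

    def jacobians(self, x):
        return f"default jacobians for {x}"


class CurvatureInterface:
    """Base interface: any method a subclass does not override is
    covered by the default backend."""

    def __init__(self):
        self._default = DefaultBackend()

    def jacobians(self, x):
        # Fallback: delegate to the generic default backend.
        return self._default.jacobians(x)


class SomeOtherBackend(CurvatureInterface):
    """A backend that does not implement `jacobians` itself; calls
    fall back to the default implementation inherited above."""
```

The point of the pattern is that a specialized backend only overrides what it can do better, and everything else transparently routes through the default.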
Thanks, sounds good to me. I think `asdfghjkl` is well suited to be the universal backend we need and is much faster than `torch.func` for Jacobians etc. I will take care of the points you mentioned.