
Integrate previous version of `asdfghjkl` directly into Laplace

aleximmer opened this issue 1 year ago

First version integrating the old asdfghjkl into Laplace directly. This allows it to live alongside the latest ASDL and enables further integration of existing extensions, such as end-to-end differentiability and support for other loss functions. This version does not change any behavior and only integrates the functions of asdfghjkl that are actually used. The only exception is kernel.py, which is not yet used but will be worth integrating.

Points to discuss:

  • What about documentation for asdfghjkl_src?
  • Should we first make sure asdfghjkl can be the default by enabling regression?
  • If asdfghjkl becomes the default, what is a sensible way to integrate it? I think we could have curvature/asd for the core utilities, merge the interfaces/default backends in curvature.py with asdfghjkl.py, deprecate the AsdfghjklXYZ classes, and just go with GGN, EF, and Hessian.
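The deprecation step mentioned above could be sketched roughly as follows. This is a minimal, illustrative pattern, not the actual Laplace code: `GGN` stands in for the new canonical class, and `AsdfghjklGGN` for one of the legacy classes to be deprecated.

```python
import warnings


class GGN:
    """Stand-in for the new canonical curvature class."""


class AsdfghjklGGN(GGN):
    """Deprecated alias kept temporarily for backward compatibility.

    Instantiating it still works, but emits a DeprecationWarning
    pointing users to the new class.
    """

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "AsdfghjklGGN is deprecated; use GGN instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Because the alias subclasses the new class, existing `isinstance` checks keep working during the deprecation period.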

aleximmer avatar Sep 01 '24 18:09 aleximmer

Thanks for the progress!

  • Documentation: Since we are going to integrate it into Laplace's core, we should follow Laplace's code standards, in particular tests, type hints, and docstrings. I think the last two are enough for documentation.
  • Minimum required features: I vote to make it the default since all the current backends have one limitation or another (https://github.com/aleximmer/Laplace/issues/203). To do so, I believe these are the requirements:
    • [ ] Support for regression
    • [ ] Support for backprop through the Jacobian & GLM predictive (for continuous Bayesian optimization)
    • [ ] Support for LoRA + Huggingface models + reward modeling. (I think these are automatically supported, at least this is the case for the new ASDL. In any case, we need to provide tests for these.) The keys here are:
      • [ ] Support for arbitrary inputs (x: Union[Tensor, MutableMapping[str, Any]]).
      • [ ] Support for subset-of-weights (the Hessian & Jacobians are computed only for parameters with `requires_grad == True`).
  • Integration: I agree with you. We can put the forked asdfghjkl code in something like laplace.curvature.default_backend and use it directly in laplace.curvature.CurvatureInterface to provide default implementations. So, if another backend doesn't implement some functionality, it will be covered by this default backend. Note that this default functionality is currently provided by torch.func, which is not efficient.
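The fallback design described above could look something like this. It is a hypothetical sketch (class and method names are illustrative, and the "Hessian" is a dummy computation): backend classes override what they support, and anything they don't override falls through to a shared default implementation standing in for the forked asdfghjkl code.

```python
class DefaultBackend:
    """Stand-in for laplace.curvature.default_backend (the forked asdfghjkl code)."""

    def full_hessian(self, data):
        # Dummy placeholder computation; the real backend would compute
        # an actual curvature approximation here.
        return sum(x * x for x in data)


class CurvatureInterface:
    """Base interface: methods delegate to the default backend unless overridden."""

    def __init__(self):
        self._default = DefaultBackend()

    def full_hessian(self, data):
        # A specialized backend overrides this when it has a faster path;
        # otherwise the universal default backend covers it.
        return self._default.full_hessian(data)


class SomeBackend(CurvatureInterface):
    # No override: this backend transparently inherits the
    # default-backend implementation instead of a torch.func fallback.
    pass
```

The point of the pattern is that no backend ever raises "not implemented": missing functionality is always covered by the one universal implementation.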

wiseodd avatar Sep 02 '24 18:09 wiseodd

Thanks, sounds good to me. I think asdfghjkl is well suited to be the universal backend we need, and it is much faster than torch.func for Jacobians etc. I will take care of the points you mentioned.

aleximmer avatar Sep 04 '24 10:09 aleximmer