ad-delcont-primop

Suggestion: benchmark against domain dimension

Open • ocramz opened this issue 2 years ago • 1 comment

Very cool! Thank you for working on this :)

In ad-delcont I noticed that the performance dies off rapidly (exponentially?) as a function of domain dimension. As I have no intuition about the behavior of the new delcont primops, I would have to rely on benchmarks to measure this. What do you think?
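A dimension sweep of this kind could be sketched roughly as follows. This is only an illustration, assuming the `criterion` and `ad` packages are available: `objective` is a made-up test function, and `grad` from `Numeric.AD` stands in for whichever gradient implementation is being measured.

```haskell
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, nf)
import Numeric.AD (grad)

-- Hypothetical test function R^n -> R; any smooth scalar function would do.
objective :: Floating a => [a] -> a
objective xs =
  sum (zipWith (\i x -> sin (fromIntegral i * x) * x) [1 :: Int ..] xs)

-- Gradient via the `ad` package, used here only as a baseline.
-- Swap in the gradient function under test to compare implementations.
gradAd :: [Double] -> [Double]
gradAd = grad objective

main :: IO ()
main =
  defaultMain
    [ bgroup
        "grad vs. dimension"
        [ bench (show n) (nf gradAd (input n))
        | n <- dims
        ]
    ]
  where
    -- Input sizes 2, 4, ..., 1024, so growth in runtime is easy to read off.
    dims :: [Int]
    dims = [2 ^ k | k <- [1 .. 10 :: Int]]

    -- Deterministic input vector of length n.
    input :: Int -> [Double]
    input n = [fromIntegral i / fromIntegral n | i <- [1 .. n]]
```

Doubling the dimension at each step makes it easy to tell linear growth from something worse: the reported time should roughly double per step in the linear case.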

ocramz, Jan 26 '23 04:01

Hi, @ocramz! First of all, thank you for your implementation of ad-delcont and your inspiring blog post!

> In ad-delcont I noticed that the performance dies off rapidly (exponentially?) as a function of domain dimension. As I have no intuition about the behavior of the new delcont primops, I would have to rely on benchmarks to measure this. What do you think?

Yes, that is what this repository is currently missing. Recently, @Mikolaj, the author of horde-ad, also suggested on Twitter that I run large-scale benchmarks.

Although I don't have enough time to incorporate horde-ad into the benchmark suite, I have implemented some preliminary benchmarks in the konn/large-bench branch. Currently, it includes only two benchmarks: the gradient of a 128-ary function and a simple MNIST neural network. You can check the current results in the CI log (e.g. https://github.com/konn/ad-delcont-primop/actions/runs/4036210590):

  • a randomly chosen 256-ary function

  • a simple MNIST benchmark

The comparison with backprop is not included yet, as it needs slightly more glue code. The suite still lacks enough benchmarks, but from the current results we can tell:

  • On larger inputs, ad-delcont-primop suffers a huge performance decline, in both time and space.
  • Interestingly, the implementation that exploits multi-prompt delcont to avoid mutable references in intermediate layers asymptotically outperforms the ones that use mutable references, although it is still much slower than the ad package.
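
To make "mutable references in intermediate layers" concrete, here is a minimal sketch of reverse-mode AD in the shift/reset style, using `ContT` over `ST` from `transformers`. It only illustrates the general technique and is not the code of ad-delcont or ad-delcont-primop; the names `D`, `binop`, and `grad2` are made up for this example.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main (main) where

import Control.Monad.ST (ST, runST)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT, evalContT, shiftT)
import Data.STRef (STRef, modifySTRef', newSTRef, readSTRef, writeSTRef)

-- A value paired with a mutable cell that accumulates its adjoint.
data D s = D Double (STRef s Double)

-- Lift a binary operation, given its two partial derivatives.
binop
  :: (Double -> Double -> Double)            -- primal operation
  -> (Double -> Double -> (Double, Double))  -- partials w.r.t. each argument
  -> D s -> D s -> ContT r (ST s) (D s)
binop f df (D x dx) (D y dy) = shiftT $ \k -> lift $ do
  dz <- newSTRef 0            -- fresh reference for the intermediate result
  r <- k (D (f x y) dz)       -- forward pass: run the rest of the computation
  g <- readSTRef dz           -- backward pass: the result's adjoint is now known
  let (px, py) = df x y
  modifySTRef' dx (+ g * px)  -- propagate the adjoint to both arguments
  modifySTRef' dy (+ g * py)
  pure r

add, mul :: D s -> D s -> ContT r (ST s) (D s)
add = binop (+) (\_ _ -> (1, 1))
mul = binop (*) (\x y -> (y, x))

-- Gradient of a two-argument function built from the lifted operations.
grad2
  :: (forall s. D s -> D s -> ContT () (ST s) (D s))
  -> Double -> Double -> (Double, Double)
grad2 f x y = runST $ do
  dx <- newSTRef 0
  dy <- newSTRef 0
  evalContT $ do
    D _ dz <- f (D x dx) (D y dy)
    lift (writeSTRef dz 1)    -- seed the output adjoint
  (,) <$> readSTRef dx <*> readSTRef dy

main :: IO ()
main = print (grad2 (\x y -> mul x y >>= \p -> add p x) 3 4)
-- prints (5.0,3.0): the gradient of x*y + x at (3, 4)
```

Every call to `binop` allocates one `STRef` for its result. That per-operation allocation is the cost the multi-prompt variant tries to avoid.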

In short, making delcont-based AD asymptotically efficient is still a big challenge. However, avoiding mutable references looks like a promising direction. I want to explore it further once I have enough time!

konn, Jan 29 '23 10:01