sschoenholz

Results: 18 comments by sschoenholz

Great question! Unfortunately, at the moment we don't have a mechanism for weight sharing. Right now, the best you can do is, as you describe, use the kernel function compute...

Hi there! Sorry for the delay. I'm not totally familiar with transductive learning in the GP setting. I will note that after the `stax.Aggregate` layer the kernel will be of...

Thanks for raising this question and for the clear repro. I haven't yet looked into the [0,1] issue, but I have investigated the NaNs. Note that for deep Erf networks...
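As background on why deep Erf stacks are prone to NaNs: the Erf NNGP kernel has a closed-form layer-to-layer map (Williams, 1997), and `np.arcsin` returns NaN the moment rounding pushes its argument past 1. A minimal numpy sketch of that recursion — unit weight variance and zero bias are my assumptions here, not details taken from the thread:

```python
import numpy as np

def erf_kernel_step(K):
    # Closed-form NNGP kernel map for the Erf nonlinearity (Williams, 1997):
    #   K'(x, x') = (2/pi) * arcsin( 2 K(x,x')
    #                 / sqrt((1 + 2 K(x,x)) (1 + 2 K(x',x'))) )
    # Assumes unit weight variance and zero bias.
    diag = 1.0 + 2.0 * np.diag(K)
    denom = np.sqrt(np.outer(diag, diag))
    return (2.0 / np.pi) * np.arcsin(2.0 * K / denom)

K = np.array([[1.0, 0.5],
              [0.5, 1.0]])
for _ in range(100):          # a 100-layer Erf stack
    K = erf_kernel_step(K)    # converges to a finite fixed point here

# The NaN failure mode: in exact arithmetic the arcsin argument stays
# strictly below 1, but if floating-point rounding ever pushes it past 1,
# arcsin leaves its domain and returns NaN.
with np.errstate(invalid="ignore"):
    bad = np.arcsin(1.0 + 1e-12)   # nan
```

With these settings the recursion itself stays finite; the point is that the arcsin argument is the quantity to watch when hunting the NaNs.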

Hey Rylan, I'm not totally sure, but let's see if we can work something out. I wrote up a short note on my interpretation of your problem [here](https://drive.google.com/file/d/1Z7n11SLRzDDgShIj80pXmMRPnG65HWkN/view?usp=sharing), let me...

Just to add to Jaehoon's reply, one thing we are interested in testing out is integration with the excellent GPyTorch package (https://gpytorch.ai/), which can scale GP inference to 1M+ datapoints...
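For context on what such an integration would accelerate: exact GP regression reduces to a kernel solve, and the Cholesky factorization in it is the O(n^3) bottleneck that GPyTorch replaces with iterative, GPU-friendly solvers. A minimal numpy sketch of the exact posterior mean (the RBF kernel and helper names here are illustrative, not from either library):

```python
import numpy as np

def rbf(X1, X2, ell=0.5):
    # Squared-exponential (RBF) kernel matrix.
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_posterior_mean(X, y, X_star, noise=1e-8):
    # Exact GP regression mean: k_* (K + sigma^2 I)^{-1} y.
    # The Cholesky factorization below is the O(n^3) step that
    # scalable GP packages replace with iterative solvers.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf(X_star, X) @ alpha

X = np.array([[-1.0], [0.0], [1.0]])
y = np.array([0.0, 1.0, 0.0])
mean = gp_posterior_mean(X, y, X)  # near-noiseless GP interpolates the data
```

Swapping the NNGP/NTK kernel in for `rbf` is exactly the kind of plug-in such an integration would enable.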

Great question! A few points. 1. You were on the right track with setting `is_gaussian=True`. Notice that `post_half` doesn't have any dense layers and so if the inputs to it...

Thanks for taking the time to try out NT and raise this issue! I think it is likely that a layer-wise scheme for computing the NTK will be more memory...
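For context on what a layer-wise scheme buys: the empirical NTK is a sum over layers of per-layer parameter-Jacobian inner products, so each layer's contribution can be formed and discarded in turn instead of materializing the full Jacobian at once. A toy numpy sketch on a two-layer linear net — the network and helper names are illustrative, not NT's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 3, 4                      # input dim, hidden width
W1 = rng.normal(size=(h, d))
W2 = rng.normal(size=(1, h))

def jac_layers(x):
    # For the scalar linear net f(x) = W2 @ W1 @ x, the parameter
    # Jacobians have closed forms:
    #   df/dW1 = W2^T x^T   (shape h x d)
    #   df/dW2 = (W1 x)^T   (shape 1 x h)
    return [np.outer(W2.ravel(), x), (W1 @ x)[None, :]]

def ntk(x1, x2):
    # Layer-wise accumulation: sum over layers of <J_l(x1), J_l(x2)>.
    # Each layer's Jacobian can be freed after its contribution is added,
    # so peak memory is one layer's Jacobian, not all of them.
    return sum(np.sum(a * b) for a, b in zip(jac_layers(x1), jac_layers(x2)))

# Reference computation: concatenate every per-layer Jacobian into one
# flat vector (this is the memory-hungry path the layer-wise sum avoids).
def full_jac(x):
    return np.concatenate([j.ravel() for j in jac_layers(x)])

x1, x2 = rng.normal(size=d), rng.normal(size=d)
```

Both paths give the same kernel value; the layer-wise one just never holds the concatenated Jacobian in memory.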

Thanks for following up! I've been digging into the code and profiling. While I don't have a solution yet, here are some comments on your investigations: 1. I think this...

Ok! So I think I may have made some progress. I would like to understand why NT is slower than the sample you provided and then, separately, think about other...

Thanks for adding the check! You're clearly correct and you've come up with a super clever method! For my own sanity, I'll have to do some digging to figure out...