
handle deterministic input with SNLE

flo-schu opened this issue 2 years ago

Assume I have a model that takes deterministic input $X$ and unknown parameters $\theta$.

Is it possible to train an SNLE estimator on the parameters and consider the observations i.i.d. conditional on the input $X$? Is this a problem that can be tackled with SNLE?

flo-schu commented on Aug 06 '22 13:08

Hi @flo-schu, let's call your input $Y$. If $Y$ is fixed for all parameters, this should be no problem: you would learn an emulator for `simulator(theta, input=Y)`, i.e., train NLE with `theta` and single-trial observations `x_i`. During inference with MCMC, the observation can then consist of multiple i.i.d. trials $X = \{x_1, \ldots, x_n\}$. See e.g. https://www.mackelab.org/sbi/tutorial/14_multi-trial-data-and-mixed-data-types/
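For reference, a minimal sketch of that workflow with the sbi API (the toy simulator below is only a stand-in for `simulator(theta, input=Y)` with $Y$ baked in):

```python
import torch
from sbi.inference import SNLE
from sbi.utils import BoxUniform

# Toy stand-in for simulator(theta, input=Y) with Y fixed inside.
def simulator(theta):
    return theta + 0.1 * torch.randn_like(theta)

prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))
theta = prior.sample((1000,))
x = simulator(theta)  # single-trial observations x_i

inference = SNLE(prior=prior)
inference.append_simulations(theta, x).train()
posterior = inference.build_posterior()

# During MCMC inference, x_o may contain multiple i.i.d. trials.
x_o = simulator(torch.zeros(10, 2))  # 10 trials from one ground-truth theta
samples = posterior.sample((1000,), x=x_o)
```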

janfb commented on Aug 08 '22 09:08

Hi @janfb, I am not sure I understand the assumption correctly:

> If $Y$ is fixed for all parameters...

In my case, $Y$ is not fixed, and observation $x_i$ depends on input $y_i$. Something like:

| $X$ | $Y$ | trial |
| --- | --- | --- |
| 5 | 1 | 1 |
| 10 | 2 | 2 |
| 20 | 4 | 3 |
| ... | ... | ... |

I guess what I want to do is something like a regression, where the model relates input $y_i$ to observation $x_i$, but the set of parameters is common to all observations.
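Concretely, I am imagining something like this hypothetical simulator (the numbers match the table above):

```python
import torch

# Hypothetical simulator: parameters theta shared across trials,
# deterministic input y_i different for each trial.
def simulator(theta, y):
    # x_i = theta_0 * y_i + theta_1 + noise
    return theta[0] * y + theta[1] + 0.1 * torch.randn_like(y)

y = torch.tensor([1.0, 2.0, 4.0])      # inputs, one per trial
theta_true = torch.tensor([5.0, 0.0])  # common to all observations
x = simulator(theta_true, y)           # roughly [5, 10, 20], as in the table
```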

In such a scenario, I guess I could simulate different $Y$ during training, then do inference on the observations and try to recover the true $Y$, but that does not seem like the best way. Am I missing something?

I apologize if these questions are not well formulated; I'm still struggling to wrap my mind around this :)

flo-schu commented on Aug 08 '22 13:08

I see, your $Y$s are like experimental conditions: deterministic, but different for different trials?

I would then train the emulator on $Y$ as well, i.e., it takes inputs `(Y, parameters)` and learns to output the corresponding `x`. During inference, you would need to write your own potential function for MCMC. This function fixes `x` AND `Y` to the values of your actual observed data / experiments, and returns the likelihood value for a given parameter.
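A rough sketch of that training step, assuming the experimental conditions are drawn from some proposal during simulation (the simulator and ranges here are made up):

```python
import torch
from sbi.inference import SNLE
from sbi.utils import BoxUniform

# Hypothetical batched simulator: shared parameters theta, per-trial condition y.
def simulator(theta, y):
    return theta[:, :1] * y + theta[:, 1:] + 0.1 * torch.randn_like(y)

prior = BoxUniform(low=torch.zeros(2), high=10 * torch.ones(2))
y_proposal = BoxUniform(low=torch.zeros(1), high=5 * torch.ones(1))

theta = prior.sample((1000,))
y = y_proposal.sample((1000,))
x = simulator(theta, y)

# Train NLE on the augmented input (theta, y): the emulator learns p(x | theta, y).
inference = SNLE()
inference.append_simulations(torch.cat([theta, y], dim=1), x).train()
```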

janfb commented on Oct 07 '22 07:10

> I see, your $Y$s are like experimental conditions: deterministic, but different for different trials?

Exactly.

> I would then train the emulator on $Y$ as well, i.e., it takes inputs `(Y, parameters)` and learns to output the corresponding `x`.

I have also been thinking along these lines.

> During inference, you would need to write your own potential function for MCMC. This function fixes `x` AND `Y` to the values of your actual observed data / experiments, and returns the likelihood value for a given parameter.

This is the step I was missing. Thank you, this makes a lot of sense.

flo-schu commented on Oct 12 '22 14:10

Hi @janfb,

I need to pick this up again: I will implement this soon and have a question about this part:

> During inference, you would need to write your own potential function for MCMC. This function fixes `x` AND `Y` to the values of your actual observed data / experiments, and returns the likelihood value for a given parameter.

If I understand you correctly, I would substitute the theta that samples both parameters and experimental conditions with a theta that samples only the parameters and keeps the experimental conditions fixed, in this function call:

https://github.com/mackelab/sbi/blob/cd10570fba86ec801f2e2ea1b0511c56f27a01e6/sbi/inference/potentials/likelihood_based_potential.py#L91-L97

Is this what you meant?

flo-schu commented on Mar 08 '23 08:03

Hi @flo-schu,

Yes, that's how I would do it. During NLE training, you train your emulator with a proposal distribution over the parameters $\theta$ and the experimental conditions $c$, i.e., $\theta, c \sim q(\theta)\,q(c)$; during inference, you then have a prior only over the parameters, $\theta \sim p(\theta)$.

The potential function could have an additional attribute `c` that is fixed, just like `x_o`. During the `__call__` to the potential function, a new `theta` is passed, and here https://github.com/mackelab/sbi/blob/cd10570fba86ec801f2e2ea1b0511c56f27a01e6/sbi/inference/potentials/likelihood_based_potential.py#L140 you would probably need to stack the new `theta` together with the fixed `c` to evaluate the emulator net.
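Something like the following, just a sketch and untested; `likelihood_estimator` stands for the trained emulator net, `x_o` for the observed trials, and `c_o` for the matching fixed conditions:

```python
import torch

def make_conditioned_potential(likelihood_estimator, prior, x_o, c_o):
    """Potential over theta only; the experimental conditions c_o stay fixed."""

    def potential(theta):  # theta: shape (1, theta_dim)
        # Pair the proposed theta with each trial's fixed condition c_i.
        context = torch.cat([theta.repeat(c_o.shape[0], 1), c_o], dim=1)
        # Sum per-trial log-likelihoods (i.i.d. trials) and add the log-prior.
        log_liks = likelihood_estimator.log_prob(x_o, context=context)
        return log_liks.sum() + prior.log_prob(theta).squeeze()

    return potential
```

If that works, the function could presumably be plugged into an `MCMCPosterior` as `potential_fn`, with the prior as `proposal`.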

This is just off the top of my head; I'm not sure this will work just like that. But I would be happy to work this through together, e.g., eventually make a PR with a new potential function class that implements this feature?

Best, Jan

janfb commented on Mar 08 '23 08:03

> But I would be happy to work this through together, e.g., eventually make a PR with a new potential function class that implements this feature?

Hi @janfb, I just realized that I never answered your reply. I'm definitely interested in implementing this feature together. Starting in June, I should have time for it; I'll get back to you then.

flo-schu commented on Apr 29 '23 09:04

Closed by #829.

janfb commented on Feb 16 '24 10:02