Mamba.jl icon indicating copy to clipboard operation
Mamba.jl copied to clipboard

ABC sampler

Open gragusa opened this issue 7 years ago • 13 comments

I am using Mamba to experiment with the ABC sampler. I wondering whether there is a way to skip the simulation of a node because at a given value of the parameters the simulation cannot be carried out without error.

The real problem I am working is fairly complicated so I will use a toy example to explain my question better.

I have

type D <: ContinuousUnivariateDistribution  
   a::Int
   b::Int
end

for which I have implemented rand methods. I have also a prior type for a and b with insupport methods since the priors have bounded support. The prior are non-standard so I implemented link and invlink so that list and relist will work. And indeed they do.

The problem is that for values of a and b the prior is 0 (not in support of the prior --- which coincides with the support of D) but the rand(::D) is called anyway. Given how involuted is the problem this causes problems. Of course, I can create a sentinel value and do some ad hoc thing, but would it be better if the sampler simply will reject a draw for value of the parameters that are not in the support of the prior, or probably even better, implement a insupport method for D and reject these draws right the way when inuspport(D, a, b) == false.

gragusa avatar Jul 15 '16 15:07 gragusa

A few questions: D in your above example is the distribution of the observed data and a and b are the parameters for which you wish to obtain posterior samples is that correct?

Does this happen for initial values, or during the sampling process?

bdeonovic avatar Jul 16 '16 14:07 bdeonovic

Yes this is correct. On Sat, 16 Jul 2016 at 16:20, Benjamin Deonovic [email protected] wrote:

A few questions: D in your above example is the distribution of the observed data and a and b are the parameters for which you wish to obtain posterior samples is that correct?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/brian-j-smith/Mamba.jl/issues/97#issuecomment-233132505, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfRgvhNC-QxSy2Zetak7ZBMApmLcyuFks5qWOiTgaJpZM4JNhmu .

gragusa avatar Jul 16 '16 16:07 gragusa

The second method sounds easy to implement. I'll make a pull request tomorrow when I have some time (visiting family today)

bdeonovic avatar Jul 16 '16 18:07 bdeonovic

In the toy example, insupport(D, a, b) would not be a valid test of a and b, because their supports are defined by their prior distributions rather than by D.

If D is getting assigned values of a and b that are not in their supports, then that is the root cause of the problem. The ABC sampler generates candidate draws on the real line. The invlink functions should be defined so as to map those draws back to the non-zero-probability support of the corresponding parameters. It sounds like this is not happening. I would suggest looking into modifications of the invlink functions to fix the problem or other steps that could be taken to ensure that D does not get assigned out-of-support parameter values.

brian-j-smith avatar Jul 17 '16 01:07 brian-j-smith

I think instead of insupport(D,a,b) he means checking whether D is a valid distribution with values a and b. For example if you had a Poisson distribution and tried giving it a negative intensity. Ideally the prior on your parameters should not allow this (for Poisson you would just give intensity a prior that is only positive on positives). It sounds like the user is not able to specify these restrictions in the prior.

bdeonovic avatar Jul 17 '16 04:07 bdeonovic

I agree that the an link invlink methods could suffice. However, as I said, I work on non standard problems. A reduction would be the following. The prior on a and b are marginal uniform but joint prior is U(0,1) x U(0, 1) x f(a,b), for instance f(a,b) = 1(a+b<1). I then write a link invlink based on the marginal uniform priors while insupport is written for the actual prior. In this case rand(D(a, b)) would be called by the sampler even for values of a and b that have zero prior probability.

So, my problem stems from my inability to write link invlink methods that are general enough ( I probably could in some cases, but in general and for the class of problems I am dealing with, the mapping between the restricted space and R would be too computationally intensive. On the other hand, having a insupport(D, a, b) clause to simulate would be quite cheap.

On Sun, 17 Jul 2016 at 06:06, Benjamin Deonovic [email protected] wrote:

I think instead of insupport(D,a,b) he means checking whether D is a valid distribution with values a and b. For example if you had a Poisson distribution and tried giving it a negative intensity. Ideally the prior on your parameters should not allow this (for Poisson you would just give intensity a prior that is only positive on positives). It sounds like the user is not able to specify these restrictions in the prior.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/brian-j-smith/Mamba.jl/issues/97#issuecomment-233164023, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfRgqJ3fADaUtamCwhCYaham8btKPfvks5qWaoygaJpZM4JNhmu .

gragusa avatar Jul 17 '16 08:07 gragusa

Compatibility with the Distributions package must be maintained. insupport is imported from that package, where it is defined as a test of whether a value is in the distributional support, not the parameter support. Redefining insupport to mean something else (i.e. whether the distributional specification is valid) would break compatibility. Hence, that solution is not going to work.

A couple alternative suggestions:

  1. Implement the distribution D so as to return an ArgumentError if its constructor is called with invalid values of a and b. Modify the ABC sampler code to incorporate a try/catch statement to reject a candidate draw if it produces this error. ArgumentError is what the Distributions constructors return if called with invalid parameter values. I am not very fond of this solution because there are many other coding errors that can produce an ArgumentError and would be masked by the try/catch statement. If the Distributions package had implemented a custom error type for invalid distributional parameters, then this would be a good solution.
  2. Define a multivariate distribution for a and b on which the prior 1(a + b < 1) is specified. Modify the ABC sampler code to reject a candidate draw if the prior evaluation is zero. I don't know how feasible this is in your problem, since the toy example seems to not incorporate all the complexities of the problem; but this would be my preferred solution.

brian-j-smith avatar Jul 17 '16 21:07 brian-j-smith

  1. If the try/catch block were only around the distribution constructor do we really need to worry about other possible sources for an ArgumentError? This seems to be a reasonable solution.

bdeonovic avatar Jul 18 '16 13:07 bdeonovic

The distributional constructors are in the user-defined Stochastic nodes. A node is the smallest model component that the package could place a try/catch block around, and nodes can include more than just distributional constructors.

brian-j-smith avatar Jul 19 '16 02:07 brian-j-smith

@brian-j-smith I think your option 2) is what I was thinking of, but a try/catch statement would probably be neater provided an parameter exception would exist. I will try to raise the issue over to the Distributions.jl people let's whether they are willing to consider this.

On Tue, 19 Jul 2016 at 04:24, Brian J Smith [email protected] wrote:

The distributional constructors are in the user-defined Stochastic nodes. A node is the smallest model component that the package could place a try/catch block around, and nodes can include more than just distributional constructors.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/brian-j-smith/Mamba.jl/issues/97#issuecomment-233513464, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfRgmG_IRUWBhHhhfjmQym0c9x8qOFOks5qXDVkgaJpZM4JNhmu .

gragusa avatar Jul 19 '16 14:07 gragusa

@brian-j-smith Ah right, I was thinking this would be possible inside of the ABC sampler, but you are right this would have to be in the user specified Model, so there is not much we can do.

bdeonovic avatar Jul 19 '16 14:07 bdeonovic

Thanks for the input. There is a modification in the master branch now to first check that a candidate draw is in the prior distribution support before doing any further calculations with it. A custom exception would be cleaner, so I'll keep an eye out for any developments on that in the Distributions package. I think the exception below would be nice to have. It looks like they would only need to make a change to their check_args() macro.

type ImproperDistributionError <: Exception
  msg::AbstractString
end

brian-j-smith avatar Jul 20 '16 00:07 brian-j-smith

Thank you! I saw the extra three lines in master --- but I assume they were part of the new tagged version.

gragusa avatar Jul 20 '16 22:07 gragusa