
Crosscat hyperprior grid on variance parameter is broader than it needs to be

Open alxempirical opened this issue 8 years ago • 9 comments

Crosscat adopts a uniform hyperprior over the parameters of the Normal-Gamma prior distribution on the Normals (see footnote 1, p. 8). For the "variance" parameter, this is done by constructing a grid from roughly 0 to sum((x-x̄)^2) in [construct_continuous_specific_hyper_grid](https://github.com/probcomp/crosscat/blob/6dadb9b33f7111449d5daf5683a1eac6365431a4/cpp_code/src/utils.cpp#L434). The largest variance that makes sense for a sample {x}, though, is max((x-x̄)^2). Since this is a grid of only 31 elements, we're potentially losing a fair bit of precision, and may be able to tighten up convergence a bit by tightening this bound.
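To make the proposal concrete, here is a minimal Python sketch (not crosscat's actual C++ implementation; the function names and the placeholder lower bound in `variance_grid_tight` are hypothetical, while the 31-point grid and the 0.01 factor come from this thread):

```python
import numpy as np

def variance_grid_current(x, n_grid=31):
    # Current scheme as described in this thread: a log-spaced grid
    # from 0.01 * sum((x - x_bar)^2) to sum((x - x_bar)^2).
    sq = (np.asarray(x, dtype=float) - np.mean(x)) ** 2
    return np.geomspace(0.01 * sq.sum(), sq.sum(), n_grid)

def variance_grid_tight(x, n_grid=31, lower=None):
    # Proposed tighter upper bound: max((x - x_bar)^2), the largest
    # variance that makes sense for the sample.
    sq = (np.asarray(x, dtype=float) - np.mean(x)) ** 2
    upper = sq.max()
    if lower is None:
        # The thread debates what the lower bound should be; this
        # default is a placeholder, not taken from crosscat.
        lower = upper / 1e6
    return np.geomspace(lower, upper, n_grid)
```

With the same 31 grid points packed into a smaller interval, adjacent grid values are closer together, so the discretized hyperprior can resolve the plausible variances more finely.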

alxempirical avatar Mar 29 '16 21:03 alxempirical

Sounds reasonable to me. What about the lower bound? The 0.01*sum((x-x̄)^2) we currently use seems pretty arbitrary to me -- surely there could be clusters with much smaller variance than that, very far away from one another.

riastradh-probcomp avatar Mar 29 '16 21:03 riastradh-probcomp

You could specify 0 as a lower bound for log_linspace, and you would get a grid starting at the smallest positive normal floating-point number.
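A sketch of what that clamping behavior might look like (an illustration of the idea, not the actual `log_linspace` in crosscat):

```python
import sys
import numpy as np

def log_linspace(lo, hi, n):
    # Clamp a zero lower bound to the smallest positive normal double
    # (~2.225e-308), then space the grid evenly in log space.
    lo = max(lo, sys.float_info.min)
    return np.geomspace(lo, hi, n)
```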

riastradh-probcomp avatar Mar 29 '16 22:03 riastradh-probcomp

I suppose the smallest distance between any pair would be a better starting point, but that takes O(n^2) time to compute.

riastradh-probcomp avatar Mar 29 '16 22:03 riastradh-probcomp

I guess you can do it in O(n*log(n)) by sorting, since the closest pair will be adjacent in the sorted list.
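In one dimension that reduces to a sort plus a linear scan over adjacent elements, e.g. (a sketch, not crosscat code):

```python
import numpy as np

def min_pairwise_gap(x):
    # In 1D the closest pair of points must be adjacent after sorting,
    # so one sort + one scan gives the minimum gap in O(n log n).
    xs = np.sort(np.asarray(x, dtype=float))
    return np.diff(xs).min()
```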

alxempirical avatar Mar 30 '16 15:03 alxempirical

Right -- I was inexplicably thinking of >1-dimensional spaces. That would probably be a reasonable thing to do, then.

riastradh-probcomp avatar Mar 30 '16 15:03 riastradh-probcomp

I think you can probably even do closest points in high-dimensional spaces with a KD-tree.

alxempirical avatar Mar 30 '16 16:03 alxempirical

Oh, there's a Wikipedia page about this exact problem.

alxempirical avatar Mar 30 '16 16:03 alxempirical

Golly, my memory of computational geometry has rotted.

However, 0 may nevertheless be a reasonable starting point -- for isolated outlying clusters we don't have a reasonable lower bound on the variance.

riastradh-probcomp avatar Mar 30 '16 16:03 riastradh-probcomp

Yes, the current lower bound seems likely to cause a problem.

alxempirical avatar Mar 30 '16 17:03 alxempirical