
Even single-site enumeration gives (slightly) incorrect stationary distribution when detaching affects the result of enumerateValues

Open axch opened this issue 8 years ago • 3 comments

[This is a variant of #462]. Specifically, consider single-site Gibbs on a CRP choice that produced a singleton pre-Gibbs. The current Gibbs implementation will

  • Ask that CRP for its possible return values, via enumerateValues.
  • Receive from enumerateValues all atoms returned by all currently-incorporated CRP applications, including the one being resampled, plus one unique atom representing the "new table" possibility.
  • Detach the application.
  • Try all the above values.
  • Pick one according to its posterior probability.

The problem is that if that particular customer was sitting alone pre-Gibbs, the state "that customer sits alone" will appear in the list twice: once as the "new atom" and once as one of the existing atoms, namely the one that was already there. This behavior, in turn, leads to overdispersion relative to the true distribution.
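The double-counting can be checked numerically with a small sketch. This is not Venture's code: `enumerate_values`, `crp_weight`, `gibbs_conditional`, and the `detach_first` flag are hypothetical names for illustration, and the weights are the usual CRP predictive weights (table count for an occupied table, `alpha` for an empty one).

```python
from collections import Counter

def enumerate_values(counts, next_table):
    """Occupied tables plus one fresh atom for the 'new table' option."""
    return sorted(counts) + [next_table]

def crp_weight(counts, table, alpha):
    """Unnormalized CRP predictive weight of seating a customer at `table`."""
    n = counts.get(table, 0)
    return n if n > 0 else alpha

def gibbs_conditional(seating, customer, alpha, detach_first):
    """Normalized single-site conditional over the enumerated candidates."""
    counts = Counter(seating.values())
    next_table = max(counts) + 1
    if not detach_first:
        # Order described in the issue: enumerate while still attached.
        candidates = enumerate_values(counts, next_table)
    # Detach the customer and drop any table this leaves empty.
    counts[seating[customer]] -= 1
    counts = {t: n for t, n in counts.items() if n > 0}
    if detach_first:
        candidates = enumerate_values(counts, next_table)
    weights = {t: crp_weight(counts, t, alpha) for t in candidates}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

# Customers a and b share table 1; c sits alone at table 2.
seating = {"a": 1, "b": 1, "c": 2}
buggy = gibbs_conditional(seating, "c", alpha=1.0, detach_first=False)
fixed = gibbs_conditional(seating, "c", alpha=1.0, detach_first=True)

# Mass on "c sits alone" is every candidate other than table 1.
alone_buggy = sum(p for t, p in buggy.items() if t != 1)
alone_fixed = sum(p for t, p in fixed.items() if t != 1)
print(alone_buggy, alone_fixed)
```

With `alpha = 1`, enumerating before detaching lists both the vacated table 2 and the fresh table 3, so "c sits alone" gets weight `2 * alpha` instead of `alpha`.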

There is a (disabled) test case here https://github.com/probcomp/Venturecxx/commit/4d5e212ef62a1fec13334981ee748f023b4c7f70 .

[Edited 11/9/16 to make this description stand alone, rather than relying on you, dear reader, knowing #462]

I worry that just tweaking Gibbs to detach first will not necessarily work either. Consider three CRP applications A, B, and C. Suppose at some point they are all distinct: A = 1, B = 2, C = 3. Suppose inference moves B to the same table as A, producing A = B = 1, C = 3. Suppose we now want to enumerate C. The current implementation will propose 1, 3, 4, which double-counts the "alone at a table" state, and is wrong. Detaching C first will propose 1, 2, which is almost right, except that it won't be able to use regen-restore to evaluate the 2 state. This is because the newValues==currentValues check in enumerative Gibbs will not trigger (see #631). Failing to use regen-restore is a problem because inference will then fail to follow Algorithm 8 from Neal 2000 (see the comment https://github.com/probcomp/Venturecxx/blob/a68cbeb5589407968e35426cda341542ee370f47/backend/lite/infer/egibbs.py#L60-L67).

axch, Mar 20 '16 23:03

For CRP in particular, this can be solved by setting nextTable to the last table index that was vacated by detaching. The contract then is that if you detach an SP application, then ask it to enumerate values at the inputs, it should include the previous output.
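A minimal sketch of that contract, assuming a toy sufficient-statistics object: `ToyCRP` and its methods are hypothetical stand-ins, not the Venture CRP class. Unincorporating the last customer at a table makes the vacated index the next "new table" atom, so enumeration after detaching offers the previous output exactly once.

```python
class ToyCRP:
    """Hypothetical stand-in for a CRP's table statistics (not Venture's API)."""

    def __init__(self):
        self.counts = {}      # table index -> number of customers seated there
        self.next_table = 1   # atom offered as the "new table" option

    def incorporate(self, table):
        self.counts[table] = self.counts.get(table, 0) + 1
        if table == self.next_table:
            # The fresh atom is now occupied; move past every used index.
            self.next_table = max(self.counts) + 1

    def unincorporate(self, table):
        self.counts[table] -= 1
        if self.counts[table] == 0:
            del self.counts[table]
            # Proposed tweak: reuse the vacated index as the "new table" atom.
            self.next_table = table

    def enumerate_values(self):
        return sorted(self.counts) + [self.next_table]


crp = ToyCRP()
for table in (1, 1, 3):       # A = B = 1, C = 3
    crp.incorporate(table)

crp.unincorporate(3)          # detach C before enumerating
candidates = crp.enumerate_values()
print(candidates)
```

Because C's old value 3 reappears as the single "new table" candidate, an equality check against the current value can still match it.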

lenaqr, Mar 21 '16 18:03

Could work for single-site. Will not generalize to block without a stack: consider detaching the state A = B = C = 1, D = 3, E = 5, for block enumeration of D and E.
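One reading of this objection, sketched with hypothetical names (`detach_all`, `last_vacated`, `vacated_stack` are illustrative, not Venture identifiers): detaching both D and E vacates two tables, but a single nextTable slot can only remember the most recent one.

```python
def detach_all(counts, tables_to_detach):
    """Detach one customer from each listed table, tracking vacated indices
    two ways: a single slot (one nextTable) and a stack."""
    counts = dict(counts)
    last_vacated = None    # single-slot scheme
    vacated_stack = []     # stack scheme
    for table in tables_to_detach:
        counts[table] -= 1
        if counts[table] == 0:
            del counts[table]
            last_vacated = table
            vacated_stack.append(table)
    return counts, last_vacated, vacated_stack


# A = B = C = 1, D = 3, E = 5; detach D and E for block enumeration.
counts, last_vacated, vacated_stack = detach_all({1: 3, 3: 1, 5: 1}, [3, 5])

single_slot_candidates = sorted(counts) + [last_vacated]
print(single_slot_candidates)   # D's previous table 3 is absent
print(vacated_stack)            # both vacated indices are still known
```

Under the single-slot scheme the candidate list no longer contains D's previous value, so the "include the previous output" contract fails for one of the two block members.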

axch, Mar 21 '16 19:03

Should be fixed in Lite by #647.

axch, Nov 14 '16 23:11