ergm icon indicating copy to clipboard operation
ergm copied to clipboard

BD/Strat proposals are not CD-compatible

Open chad-klumb opened this issue 3 years ago • 8 comments

I am not a CD expert, but there appear to be at least two reasons for this:

  • unlike MCMC and SAN, CD may call the U_FN with a dyad that was not just proposed in the P_FN; as a result, anything stored from the P_FN may not be valid when the U_FN is called
  • even adjusting for the previous bullet point, it appears that CD applies toggles in a way that may violate (at least transiently) the degree bound, making various BD code unsuitable for use in CD

A further consideration is that

  • UnsrtELs will be slow if subjected to random deletions

All things considered I would like to declare that the BD/Strat proposals do not (for the time being, anyway) support use with CD. The easiest way to do this would be to append a CD constraint internally when initializing a proposal destined for CD. (We would need to initialize a separate CD proposal when using CD, if we aren't doing that already.) Proposals that do support CD can add |CD to their proposal table entries to indicate the compatibility; the BD/Strat proposals would not add such an indication.

chad-klumb avatar Mar 04 '21 00:03 chad-klumb

Interesting. In principle, u_ functions should be able to handle any valid toggle, though I haven't thought about temporary violations of constraints. A mechanism to signal (and fall back?) may be necessary. Maybe .CD flag in the constraints may do the trick.

krivit avatar Mar 04 '21 00:03 krivit

You can check that constraints are violated by adding

MH_U_FN(Mu_TNT) {
  Rprintf("TNT_U_FN: tail head: %d %d\n", tail, head);
  Rprintf("TNT_U_FN: tail deg: %d\n", IN_DEG[tail] + OUT_DEG[tail]);
  Rprintf("TNT_U_FN: head deg: %d\n", IN_DEG[head] + OUT_DEG[head]);
  Rprintf("TNT_U_FN: edgeflag: %d\n", edgeflag);    
}

to MHproposals.c, installing the package, sinking output to a text file, running

require(ergm)
nw <- network.initialize(100, dir = F)
ergm(nw ~ edges, target.stats = 30, constraints = ~bd(maxout=1, minout = 0), estimate="CD")

and searching for instances of deg: 2 in the output. (The minout = 0 gives you TNT instead of one of the BD proposals.)

This may not be a problem for some proposals but in general a specialized data structure may not even be capable of representing the network state if it doesn't satisfy the appropriate constraints.

chad-klumb avatar Mar 04 '21 20:03 chad-klumb

One simple way to avert invalid states is to undo them in reverse order to the way they were done. I think the only line where that's not already being done is the following, and it should be straightforward to fix, though care should be taken about fence-post errors. https://github.com/statnet/ergm-private/blob/0e259e04c1c9d0c7691eae4098a9860ca10712fc/src/CD.c.template.do_not_include_directly.h#L264-L266

I wonder if the DyadGen API might also be affected. And, how much overhead would it involve to extend the data structures we use to maintain a stack of past indices and a way to signal to the proposal being initialised how many undos beyond its control it may expect.

krivit avatar Mar 04 '21 21:03 krivit

Other than UnsrtEL, what data structures rely on knowing the "last index"?

For UnsrtEL specifically I would just use HashELs if CD is being run (rather than start maintaining a longer history).

chad-klumb avatar Mar 04 '21 21:03 chad-klumb

@krivit this seems like another case of chasing things down that will delay the merge. for now, can we restrict CD to the proposals we know work? and chase down the solution later (assuming there is one)?

martinamorris avatar Mar 04 '21 22:03 martinamorris

@martinamorris , no, this is a very annoying issue, but we can fix it post-merge.

krivit avatar Mar 04 '21 22:03 krivit

I've added a mitigation that at least the invariants are not violated: the proposal should not have to deal with a state that violates its constraints (unless the constraint requires a multiplicitous proposals, such as edges).

The rest of it will probably have to wait until after the Great Merge, though @chad-klumb could chat about it in a few minutes.

krivit avatar Mar 04 '21 23:03 krivit

The immediate issue should be resolved by this block of code in the U_FN for BDStratTNT:

https://github.com/statnet/ergm/blob/816b1a4f9306d2ae777fd72938232b43773a7ab8/src/MHproposals.c#L313-L322

It effectively redoes the necessary work of the P_FN when running CD. (This is not done with MCMC or SAN because it would reduce efficiency.) It wouldn't hurt to add some tests.

chad-klumb avatar Jun 01 '21 02:06 chad-klumb

This was addressed awhile ago through the use of HashEL and the CD flag.

chad-klumb avatar Aug 16 '23 17:08 chad-klumb