
formation/dissolution formula extraction

Open chad-klumb opened this issue 3 years ago • 0 comments

Related to but distinct from #33

Currently, in the EDA and when targets/monitor is specified as a character string, the function .extract.fd.formulae is used to try to identify the formation, dissolution, and non-separable parts of the formula.

This is an imperfect process. It behaves as you would expect for formulas constructed in the separable wrappers. However, if you specify a formula in the general, non-separable manner and wrap Form or Diss in anything else (including offsets), the wrapped terms will be lumped in with the non-separable part of the formula.

This behavior is documented in ?.extract.fd.formulae, and I will be adding links to that documentation from the EDA and targets/monitor documentation.

It would be nice to have a better way to do this, but it is not clear to me that it can be done in general. To begin with, one would need to define precisely what "the formation part of the formula" is. If one has an input formula like ~Form(a) + Diss(b) + c, where a, b, and c are cross-sectional ergm formulas, then it seems reasonable to say that a is the formation part of the formula, b is the dissolution part, and c is the non-separable part.
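To make the "obvious" case concrete, here is a toy sketch of a classifier that looks only at top-level terms. It is not the implementation of .extract.fd.formulae; the function names split_top_level and classify are hypothetical, and formulas are handled as plain strings purely for illustration.

```python
def split_top_level(rhs):
    """Split a formula's right-hand side on '+' signs that sit outside parentheses."""
    terms, depth, current = [], 0, []
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "+" and depth == 0:
            terms.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append("".join(current).strip())
    return [t for t in terms if t]


def classify(formula):
    """Sort top-level terms into formation, dissolution, and non-separable parts.

    Hypothetical helper: only a bare Form(...) or Diss(...) at the top level
    is recognized; anything else is treated as non-separable.
    """
    rhs = formula.strip().lstrip("~")
    parts = {"form": [], "diss": [], "nonsep": []}
    for term in split_top_level(rhs):
        if term.startswith("Form(") and term.endswith(")"):
            parts["form"].append(term[len("Form("):-1])
        elif term.startswith("Diss(") and term.endswith(")"):
            parts["diss"].append(term[len("Diss("):-1])
        else:
            parts["nonsep"].append(term)
    return parts


simple = classify("~Form(a) + Diss(b) + c")
wrapped = classify("~X(~Form(a) + Diss(b) + c)")
```

Under these assumptions, simple assigns a, b, and c to the formation, dissolution, and non-separable parts respectively, while wrapped puts the entire X(...) term in the non-separable part, since a purely syntactic pass has no way to look through X.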

Such definitions become less obvious if one has an arbitrary cross-sectional operator X and a formula such as ~X(~Form(a) + Diss(b) + c). What then is the formation part of the formula? For consistency with the previous definition, if we could rewrite ~X(~Form(a) + Diss(b) + c) as ~Form(u) + Diss(v) + w for some cross-sectional ergm formulas u, v, and w, then we might declare u to be the formation part of the formula. But why should we expect that the formula can always be rewritten this way? And even if it can, how do we compute u, v, and w in terms of the inputs, and express them as formulas involving only terms that have actually been implemented?

To give a simpler example, let N denote the "negation" operator, which takes a network on a given node set to the network on the same node set with edges and non-edges interchanged. One could then ask what the formation part of ~N(~Form(~edges)) is. It may be tempting to say it is ~N(~edges), but this is inconsistent with the "obvious" case first mentioned above, because we can equivalently write ~N(~Form(~edges)) as ~Diss(~N(~edges)). From one perspective, our input formula looks like pure formation, and from the other it looks like pure dissolution. If we could always reduce the formula to a "normal form", as suggested in the previous paragraph, we might avoid this ambiguity, but I doubt that is doable in practice.
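The equivalence in the negation example comes down to De Morgan's law, and a small brute-force check can make that visible. The sketch below assumes that Form evaluates its formula on the union of the previous and current networks, that Diss evaluates on their intersection (ignoring any sign convention applied to the statistics), and that N is applied to both networks before the inner term is evaluated. N is the hypothetical operator from the text, and all function names here are illustrative only.

```python
from itertools import combinations


def all_dyads(n):
    """All undirected dyads on n nodes."""
    return set(combinations(range(n), 2))


def negate(edges, n):
    """Hypothetical N operator: interchange edges and non-edges."""
    return all_dyads(n) - set(edges)


def form_network(y0, y1):
    """Network assumed to be seen by Form: union of previous and current."""
    return set(y0) | set(y1)


def diss_network(y0, y1):
    """Network assumed to be seen by Diss: intersection of previous and current."""
    return set(y0) & set(y1)


def identity_holds(n):
    """Check N-then-Form against Diss-then-N for every pair of networks on n nodes."""
    dyads = sorted(all_dyads(n))
    networks = [
        {d for i, d in enumerate(dyads) if mask >> i & 1}
        for mask in range(2 ** len(dyads))
    ]
    for y0 in networks:
        for y1 in networks:
            # ~N(~Form(~edges)): negate both networks, then take the union
            lhs = form_network(negate(y0, n), negate(y1, n))
            # ~Diss(~N(~edges)): take the intersection, then negate it
            rhs = negate(diss_network(y0, y1), n)
            if lhs != rhs:
                return False
    return True


result = identity_holds(4)
```

Because the two constructions yield the same network for every pair (y0, y1), any statistic evaluated on them, including the edge count, agrees as well, which is why the same model can be read as pure formation or as pure dissolution.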

chad-klumb, Jun 10 '21 01:06