ergm
Change MPLE C code and ergm.pl() to return (#edges, #nonedges, covariate vector) instead?
At this time, we represent the information needed to obtain the MPLE in three objects:
- Covariates: u*p-matrix of doubles (change statistic vector for the dyad)
- Response: u-vector of integers (1 = dyad present, 0 = dyad absent)
- Weights: u-vector of integers (number of dyads with that particular Covariates and Response combination)
Here, u = the number of distinct Covariates-Response configurations, and p = the number of parameters in the model.
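As a rough illustration of the current three-object representation, here is a minimal R sketch that compresses dyad-level rows into distinct Covariates-Response configurations with counts. The data and the names `X`, `y`, and `agg` are made up for the example; this is not how ergm.pl() or the MPLE C code does it.

```r
# Hypothetical dyad-level data: 6 dyads, p = 2 change statistics each.
X <- rbind(c(1, 0), c(1, 0), c(1, 0), c(1, 2), c(1, 2), c(1, 2))
y <- c(1L, 1L, 0L, 0L, 1L, 0L)

# Count how often each distinct (Covariates, Response) row occurs.
d <- data.frame(X, y = y, w = 1L)
agg <- aggregate(w ~ ., data = d, FUN = sum)

Covariates <- as.matrix(agg[, seq_len(ncol(X)), drop = FALSE])  # u*p matrix
Response   <- agg$y                                             # u-vector
Weights    <- agg$w                                             # u-vector
```

Here the 6 dyads collapse to u = nrow(Covariates) rows, and sum(Weights) recovers the number of dyads.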
An alternative representation, incidentally accepted by glm(), is
- Covariates: u'*p-matrix of doubles (change statistic vector for the dyad)
- Failures/Successes: u'*2-matrix of integers (numbers of dyads with edges absent and present)
Here, u' is the number of distinct configurations of Covariates only. This could as much as halve the number of rows that have to be stored by the MPLE algorithm.
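To make the proposed layout concrete, the following R sketch converts a small, made-up three-object representation into the Covariates-only form with per-row counts. All object names and values are hypothetical. Note that when such counts are passed to glm() as a two-column response, glm() expects successes in the first column and failures in the second.

```r
# Hypothetical three-object representation (u = 4 rows, p = 2).
Covariates <- rbind(c(1, 0), c(1, 0), c(1, 2), c(1, 2))
Response   <- c(0L, 1L, 0L, 1L)
Weights    <- c(1L, 2L, 2L, 1L)

# One key per covariate row; keep the first occurrence of each distinct key.
key   <- apply(Covariates, 1, paste, collapse = " ")
first <- !duplicated(key)
grp   <- factor(key, levels = key[first])

Covariates2 <- Covariates[first, , drop = FALSE]                     # u'*p matrix
Successes <- as.vector(tapply(Weights * Response, grp, sum))         # dyads with edge present
Failures  <- as.vector(tapply(Weights * (1L - Response), grp, sum))  # dyads with edge absent
```

In this example u = 4 rows shrink to u' = 2, the best case mentioned above; a fit could then use something like glm(cbind(Successes, Failures) ~ Covariates2 - 1, family = "binomial").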
The downside of this approach is that the deviance returned by glm() would not be correct, because it would include the binomial coefficient. For example, here are the three representations:
# 0-1 LHS, no "compression":
yI <- c(0,0,1,1)
glm(yI~1, family="binomial")
# 0-1 LHS, "compressed":
yW <- c(0,1)
wW <- c(2,2)
glm(yW~1, family="binomial", weights=wW)
# Two-column LHS, compressed:
yC <- rbind(c(2,2))
glm(yC~1, family="binomial")
Observe that the MLEs are the same for all three (0), but the deviances of the first two differ from that of the third. Measured as -2*logLik(), the difference is 2*log(choose(4,2)).
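A quick way to check the size of the discrepancy is to refit the three models and compare them on the -2*logLik() scale. This is only a sketch; the helper `m2ll` is introduced here for illustration. The residual deviance() values can also be compared, but each fit measures that against its own saturated model, so the gap there need not equal the binomial-coefficient term.

```r
# Refit the three representations from the example.
yI <- c(0, 0, 1, 1)
fitI <- glm(yI ~ 1, family = "binomial")

yW <- c(0, 1)
wW <- c(2, 2)
fitW <- glm(yW ~ 1, family = "binomial", weights = wW)

yC <- rbind(c(2, 2))
fitC <- glm(yC ~ 1, family = "binomial")

# Hypothetical helper: minus twice the maximized log-likelihood.
m2ll <- function(fit) -2 * as.numeric(logLik(fit))

# Gap between the 0-1 and two-column fits, next to the binomial-coefficient term.
gap  <- m2ll(fitI) - m2ll(fitC)
term <- 2 * log(choose(4, 2))
print(c(gap = gap, term = term))

# Residual deviances, for comparison.
print(c(I = deviance(fitI), W = deviance(fitW), C = deviance(fitC)))
```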