ergm
Reduce dependencies
As a fan of your work, I am concerned about the number of dependencies (direct and indirect) that `ergm` has. As of July 13th, 2020, the tinyverse dependencies checker lists 10 direct and 27 indirect dependencies. As you may know, having too many dependencies can be bad for various reasons, but mostly because things tend to break more often (e.g. an update to package `xyz` breaks on CRAN, thus breaking `ergm`).
From what I've checked in the code, there are at least a couple of packages that `ergm` could avoid, but before I start working on a PR to reduce the number of dependencies, I wanted to check with you whether you think this is worth it.
More info about dependencies here.
PS: Dependencies on the other statnet packages:
- statnet
- network
- networkDynamic
- ergm.userterms
- EpiModel
- tergm
Hey! Thanks for the offer to help. My philosophy has generally been to use library packages where practical, but I do appreciate the argument for minimalism. Which dependencies did you have in mind?
From the current version, I see:
- robustbase 1/1
- coda 0/0
- trust 0/0
- Matrix 0/0
- lpSolve 0/0
- parallel na
- methods na
- MASS 0/0
- statnet.common 1/1
- rle na
- purrr 2/2
- rlang 0/0
- tibble 10/14
So I would try to see if we could remove tibble from the dependencies. Furthermore, a quick check suggests that you are only using the `tibble::lst` function in a few places here. Perhaps that could be replaced with a `do.call()`. I would need to check more details, but if that's the case, the number of direct and indirect dependencies could decrease drastically.
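To make the idea concrete, here is a minimal sketch of a base-R stand-in for the auto-naming that `tibble::lst()` provides. The helper name `named_list()` is hypothetical (it is not part of ergm or tibble), and it takes a different route than `do.call()`. It also does not reproduce the sequential evaluation of `lst()`, where later arguments can refer to earlier ones, so it would only fit call sites that rely on auto-naming alone.

```r
# Hypothetical helper, not part of ergm or tibble: builds a list and names
# any unnamed element after the expression that produced it.
named_list <- function(...) {
  values <- list(...)
  # Text of each argument expression, used as the default name.
  exprs <- vapply(
    as.list(substitute(list(...)))[-1L],
    function(e) paste(deparse(e), collapse = " "),
    character(1)
  )
  nms <- names(values)
  if (is.null(nms)) nms <- rep("", length(values))
  unnamed <- is.na(nms) | nms == ""
  nms[unnamed] <- exprs[unnamed]
  names(values) <- nms
  values
}

a <- 1
b <- "x"
str(named_list(a, b, total = a + 1))
```

Whether this is enough depends on how the existing `tibble::lst` calls are actually used, which I have not verified.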
Thanks for the suggestion. My sense is that at the moment, there is no point removing dependence on `tibble`, because `network`, on which `ergm` necessarily depends, depends on it as well. (The reason it uses `tibble` is that `tibble` doesn't mangle column names or add row names unexpectedly, as well as handling list columns a bit better.) It might be worth looking into `data.table` or some other replacement for that.
Not sure what the value would be in replacing one package (tibble) with another (data.table) if the goal is to reduce dependencies.
@martinamorris, `data.table` doesn't have any major dependencies of its own, whereas `tibble` does.
> Thanks for the suggestion. My sense is that at the moment, there is no point removing dependence on `tibble`, because `network`, on which ergm necessarily depends, depends on it as well. (The reason it uses `tibble` is that `tibble` doesn't mangle column names or add row names unexpectedly, as well as handling list columns a bit better.) It might be worth looking into `data.table` or some other replacement for that.
`data.table` could be a better alternative as it has no dependencies whatsoever. IMHO, tidy-tools are great for data wrangling and fast prototyping, as there is less chance of error. Software development is a bit different, though: the coding is done more carefully, so I think a few extra lines of code should be OK.
Parts of network affected by tibble are:
Essentially, these turn networks into tibble objects. On the bright side, and as suggested in network/Issue 9 (which I was not aware of before submitting this issue), tibble can instead be added to the `Suggests` field, so that if anything happens to it, `network` and friends can still work OK and, more importantly, stay on CRAN (see CRAN policies and Writing R Extensions).
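For reference, a package listed under `Suggests` is usually guarded at the point of use with `requireNamespace()`. The sketch below shows that general pattern only; the function name `as_tibble_if_available()` is made up for illustration and is not how `network` is actually written.

```r
# Hypothetical function: returns a tibble when the optional 'tibble' package
# is installed, and otherwise falls back to the plain data.frame.
# Assumes DESCRIPTION lists tibble under Suggests rather than Imports.
as_tibble_if_available <- function(df) {
  if (requireNamespace("tibble", quietly = TRUE)) {
    tibble::as_tibble(df)
  } else {
    # tibble is missing: keep working with base R structures.
    df
  }
}

edges <- data.frame(tail = c(1L, 2L), head = c(2L, 3L))
res <- as_tibble_if_available(edges)
```

Either branch returns something that inherits from `data.frame`, so downstream code that only needs a data frame keeps working without tibble installed.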
I am still not sure how big a priority this is, but https://github.com/pharmaR/riskmetric might provide some guidance about which packages are actually worth worrying about depending on.
Interesting tool! I'll definitely try it out.
Perhaps the priority is somewhat lessened by the idea to split ergm into component packages? Worth revisiting during the process.
I think @gvegayon's point about moving some of the tidyverse deps to Suggests is quite sensible.