egor
egor copied to clipboard
Minimise hard dependencies of egor
Since I am primarily using egor
as a data storage and manipulation tool, having it also depend on igraph
, tidygraph
, as well as network
means that a lot of unnecessarily functionality gets pulled in every time I install it, and it's now interfering with testing the package, since some of those have odd system requirements.
I would therefore like us to try to minimise its hard dependencies (Depends
and Imports
) by moving them to Suggests
and Enhances
. In particular,
-
tidygraph
is only used in one place: theactivate()
generic. If there are no objections, I am going to ask them to contribute that function to thegenerics
package. -
igraph
is only used for some data conversion and summary functions. I think those could be wrapped inif(require("igraph"))
. -
network
is only used in conversions and so can be similarly wrapped.
The other packages are either the ubiquitous Tidyverse packages or critical to the core functionality (e.g., srvyr
).
An alternative is to split egor
into two packages, e.g., a a minimalist egorCore
and egor
which depends on egorCore
.
It sounds reasonable to me, to make the dependencies as slim as possible. I also noticed, that testing on non-windows systems is very slow. But I would also like to have some time to take a look at it. My spontaneous reaction is that tidygraph
(if activate()
would be in generics
) and network are not that essential. igraph
on the other hand is a big part of the visualization functions.
Do you have an idea of which of the three packages are causing most of your issues and removing them from Depends
would benefit you most? And, just out of interest, what issues do you have and on which systems?
tidygraph
is only used in one place: theactivate()
generic. If there are no objections, I am going to ask them to contribute that function to thegenerics
package.
Yes, please go ahead!
Another option I see could be to have egor
and egorVis
, I considered that before. But I do prefer a solution where it is just one package.
Since I am primarily using
egor
as a data storage and manipulation tool, having it also depend onigraph
,tidygraph
, as well asnetwork
means that a lot of unnecessarily functionality gets pulled in every time I install it, and it's now interfering with testing the package, since some of those have odd system requirements.I would therefore like us to try to minimise its hard dependencies (
Depends
andImports
) by moving them toSuggests
andEnhances
. In particular,
tidygraph
is only used in one place: theactivate()
generic. If there are no objections, I am going to ask them to contribute that function to thegenerics
package.
igraph
is only used for some data conversion and summary functions. I think those could be wrapped inif(require("igraph"))
.
network
is only used in conversions and so can be similarly wrapped.
I'd add
- Shiny and it's dependencies:
Imports: utils, grDevices, httpuv (>= 1.5.2), mime (>= 0.3), jsonlite (>= 0.9.16), xtable, digest
(>= 0.6.25), htmltools (>= 0.5.0.9001), R6 (>= 2.0), sourcetools, later (>= 1.0.0),
promises (>= 1.1.0), tools, crayon, rlang (>= 0.4.9), fastmap (>= 1.0.0), withr,
commonmark (>= 1.7), glue (>= 1.3.2), bslib (>= 0.2.2.9002), cachem, ellipsis, lifecycle
(>= 0.2.0)
An alternative is to split
egor
into two packages, e.g., a a minimalistegorCore
andegor
which depends onegorCore
.
I am sympathetic to carving-out the storage and it's immediate facilities as a separate package and have egor
with all it's ego-network analysis capabilities to sit on top of it. That would be similar to the role of network
within Statnet.
As an experiment, I tried eliminating shiny
as a hard dependency in the suggest_shiny
branch. It required a lot of ::
s, but it does not raise any check warnings.
Added some namespace fixes. @tilltnet , @mbojan , mind if I merge this into master
?
Also, a question: why did we put in the as_igraph()
and as_network()
generics, as opposed to just implementing the as.igraph()
and as.network()
methods?
Added some namespace fixes. @tilltnet , @mbojan , mind if I merge this into
master
?
Not at all.
Also, a question: why did we put in the
as_igraph()
andas_network()
generics, as opposed to just implementing theas.igraph()
andas.network()
methods?
There are "aliases":
- https://github.com/tilltnet/egor/blob/a07f68c3a3a9881fc914e92155ee5dead278bc57/R/conversions.R#L138
- https://github.com/tilltnet/egor/blob/a07f68c3a3a9881fc914e92155ee5dead278bc57/R/conversions.R#L264
I'm not exactly sure anymore why there are as_*
generics in the first place...
Let's go ahead with the merge for now. I'm not sure about the removal of igraph
though. @krivit does igraph on its own have a big impact on your testing process?
Also, a question: why did we put in the
as_igraph()
andas_network()
generics, as opposed to just implementing theas.igraph()
andas.network()
methods?
At the time I mainly did it this way, as it makes the documentation more easy to find. The documentation for as.igraph.egor()
would basically be impossible to find for most casual R users, as they only call as.igraph()
and most certainly don't know about the existence of as.igraph.egor()
. But this is a general issue I see with methods which deviate from the arguments of the generic.
@tilltnet , merged.
igraph
does have an impact, since it has some non-standard build dependencies and build system.
Regarding the as_igraph
and as_network
, why not just alias as.igraph
and as.network
in the help files for the methods? That way, a user typing ?as.igraph
will get a choice of igraph
's help and egor
's help.
Anyway, I've filed the ticket against tidygraph
, though I don't know when the maintainer will address it.
In case tidygraph
maintainers do not prove amenable, it seems feasible to make it Suggest
-ed as well (since functions internal to egor
can call activate.egor()
directly).
I've prototyped it in a branch (suggest_tidygraph
). No functionality is lost, but to use activate()
, the end-user would need to run library(tidygraph)
first.
Any thoughts?
I really hope that the tidygraph
maintainers are willing to donate the activate()
generic to generics
, otherwise I'd probably prefer to split egor
up into two packages. The activate()
strategy for data manipulation is an essential part of egor
and I expect it to be preferred by most new users. It should be available without loading any additional packages.
I am not particularly happy with not reexporting activate()
either. I'll give it another week, and then ping the author.
The problem, also, is that given how integral activate()
is to data manipulation, it ought to go into the class part of the package, not the summaries part of the package.
Good news and bad news.
- Good news: the author of
tidygraph
is happy to contributeactivate()
togenerics
. - Bad news: the author of
generics
isn't interested in taking it.
I wonder if we should fork generics
to create, e.g., graph.generics
or, actually, create a package more.generics
, which does what generics
does but has a more liberal policy towards what generics get included.
I saw that generics
mostly consists of tidyverse generics. As far as I know tidygraph
is not an official tidyverse package, so it makes sense, that they won't take it.
I do like the idea of a very liberal generics package a lot! libre.generics
would be one suggestion for a name.
* Bad news: the author of `generics` isn't interested in taking it.
Wow... Just wow... I thought that (quoting the README) "generics
is designed to help package authors reduce dependencies by providing a set of generic methods that can be imported". Apparently not.
I think let's just define activate()
as a generic within egor
itself with the same signature as it is in tidygraph
, i.e.:
tidygraph::activate
function (.data, what)
{
UseMethod("activate")
}
Nevermind a benign conflict with a package that is not needed otherwise.
@mbojan , it doesn't work that way. See https://github.com/lme4/lme4/issues/566#issuecomment-658431987 .
But, yes, it does seem inconsistent with the mission statement of the package.
@mbojan , it doesn't work that way. See lme4/lme4#566 (comment) .
Right... Although here the conflict will be in generics not in methods so I think not really identical to that simulate.formula
problem. The dispatch gets distorted anyway though.
How about these steps:
- Suggest
tidygraph
- Implement and export
egor::activate()
generic with as it is intidygraph
:activate <- function(.data, what) UseMethod("activate")
- Implement and register
egor::activate.default()
callingtidygraph::activate()
like this:
activate.default <- function(.data, what) {
if(requireNamespace("tidygraph")) {
w <- rlang::enquo(what)
tidygraph::activate(.data, what = !!w)
}
}
Thanks to that (I think):
-
activate()
can be called withouttidygraph
installed - if
tidygraph
is loaded, all methods should be found whichever package, egor or tidygraph is loaded first.
did not test though...
It looks very feasible. Question: Will it still work if tidygraph
gets library
ed after egor
?
I think what @mbojan suggests could be a temporary fix, if it works well enough. If this is urgent to you @krivit, it would be the quickest fix I think.
For mid- to long-term:
I am going to start a package called tbllst
, that makes the activate()
logic available for any list
of tibbles
. If we can convince Thomas from tidygraph
to make his package get the activate()
generic from that package, there won't be any namespace issues. I imagine he'd be open to do so, since he mentioned that he planned for activate()
to get a more general purpose.
I created a repository here and will fill it with notes and some code in the next few days. Let me know if you want to join in.
It looks very feasible. Question: Will it still work if
tidygraph
getslibrary
ed afteregor
?
I think so... I'll experiment on/off the suggest_tidygraph
branch.
Yes. It's sort of hacky, e.g. I think there is a risk of getting caught in an infinite dispatch loop... Still, if one knows what one's doing...
I think what @mbojan suggests could be a temporary fix, if it works well enough. If this is urgent to you @krivit, it would be the quickest fix I think.
For mid- to long-term: I am going to start a package called
tbllst
, that makes theactivate()
logic available for anylist
oftibbles
. If we can convince Thomas fromtidygraph
to make his package get theactivate()
generic from that package, there won't be any namespace issues. I imagine he'd be open to do so, since he mentioned that he planned foractivate()
to get a more general purpose.I created a repository here and will fill it with notes and some code in the next few days. Let me know if you want to join in.
That sounds good, if only we can get Thomas on board.
BTW do we really need to Depends
on dplyr
and tibble
? Even dplyr
does not Depends
on tibble
.
BTW do we really need to
Depends
ondplyr
andtibble
? Evendplyr
does notDepends
ontibble
.
Indeed. I think that the package should Suggests
both tibble
and dplyr
and export methods conditionally. (It turns out to be straightforward: take a look at how egor
handles igraph
and network
at the moment.)
Generally, almost everyone has Tidyverse installed these days, so I am not too concerned about tibble
, but it seems like the right thing to do. Something as enormous as igraph
should certainly be avoided unless absolutely necessary.
@tilltnet , it seems like we can recycle a lot of the egor
code for this new package.
@tilltnet @krivit try-out the default_method_trick
branch with the fix.
@mbojan , coming back to this, it seems to work, and I've folded it into the suggest_tidygraph
branch, but I get the following R check warning:
* checking S3 generic/method consistency ... WARNING
Warning: declared S3 method 'activate.egor' not found
See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.
It looks very feasible. Question: Will it still work if
tidygraph
getslibrary
ed afteregor
?
This ^^^ is to blame, so your suspicions were correct @krivit . In the vignettes we have library(tidygraph)
so tidygraph::activate()
gets loaded and masks egor::activate()
, and tidygraph::activate()
is not finding egor::activate.egor()
because of, as far as I can tell, how S3 dispatch works.
I'm not sure at this moment if there is a workaround to the workaround I proposed earlier.
If the tidygraph
folks are willing to coordinate, they could put mirroring code into their package and dispatch to us when appropriate.
A more general solution that doesn't require everybody to coordinate with everybody who wants a particular generic (because n(n-1)/2) would be for everybody to implement a GEN.default()
which uses findFunction()
to locate other generics with the same name and try one of those (with some provisions for not getting into an infinite loop).
This functionality could even be implemented in a package, i.e., polite.generics
, though at that point, it might be easier to just create a more.generics
package and get everyone to use it.