egor icon indicating copy to clipboard operation
egor copied to clipboard

Minimise hard dependencies of egor

Open krivit opened this issue 4 years ago • 26 comments

Since I am primarily using egor as a data storage and manipulation tool, having it also depend on igraph, tidygraph, as well as network means that a lot of unnecessarily functionality gets pulled in every time I install it, and it's now interfering with testing the package, since some of those have odd system requirements.

I would therefore like us to try to minimise its hard dependencies (Depends and Imports) by moving them to Suggests and Enhances. In particular,

  1. tidygraph is only used in one place: the activate() generic. If there are no objections, I am going to ask them to contribute that function to the generics package.
  2. igraph is only used for some data conversion and summary functions. I think those could be wrapped in if(require("igraph")).
  3. network is only used in conversions and so can be similarly wrapped.

The other packages are either the ubiquitous Tidyverse packages or critical to the core functionality (e.g., srvyr).

An alternative is to split egor into two packages, e.g., a a minimalist egorCore and egor which depends on egorCore.

krivit avatar Jan 30 '21 11:01 krivit

It sounds reasonable to me, to make the dependencies as slim as possible. I also noticed, that testing on non-windows systems is very slow. But I would also like to have some time to take a look at it. My spontaneous reaction is that tidygraph (if activate() would be in generics) and network are not that essential. igraph on the other hand is a big part of the visualization functions.

Do you have an idea of which of the three packages are causing most of your issues and removing them from Depends would benefit you most? And, just out of interest, what issues do you have and on which systems?

tidygraph is only used in one place: the activate() generic. If there are no objections, I am going to ask them to contribute that function to the generics package.

Yes, please go ahead!

Another option I see could be to have egor and egorVis, I considered that before. But I do prefer a solution where it is just one package.

tilltnet avatar Jan 30 '21 19:01 tilltnet

Since I am primarily using egor as a data storage and manipulation tool, having it also depend on igraph, tidygraph, as well as network means that a lot of unnecessarily functionality gets pulled in every time I install it, and it's now interfering with testing the package, since some of those have odd system requirements.

I would therefore like us to try to minimise its hard dependencies (Depends and Imports) by moving them to Suggests and Enhances. In particular,

  1. tidygraph is only used in one place: the activate() generic. If there are no objections, I am going to ask them to contribute that function to the generics package.

  2. igraph is only used for some data conversion and summary functions. I think those could be wrapped in if(require("igraph")).

  3. network is only used in conversions and so can be similarly wrapped.

I'd add

  1. Shiny and it's dependencies:
Imports: utils, grDevices, httpuv (>= 1.5.2), mime (>= 0.3), jsonlite (>= 0.9.16), xtable, digest
           (>= 0.6.25), htmltools (>= 0.5.0.9001), R6 (>= 2.0), sourcetools, later (>= 1.0.0),
           promises (>= 1.1.0), tools, crayon, rlang (>= 0.4.9), fastmap (>= 1.0.0), withr,
           commonmark (>= 1.7), glue (>= 1.3.2), bslib (>= 0.2.2.9002), cachem, ellipsis, lifecycle
           (>= 0.2.0)

An alternative is to split egor into two packages, e.g., a a minimalist egorCore and egor which depends on egorCore.

I am sympathetic to carving-out the storage and it's immediate facilities as a separate package and have egor with all it's ego-network analysis capabilities to sit on top of it. That would be similar to the role of network within Statnet.

mbojan avatar Jan 31 '21 02:01 mbojan

As an experiment, I tried eliminating shiny as a hard dependency in the suggest_shiny branch. It required a lot of ::s, but it does not raise any check warnings.

krivit avatar Jan 31 '21 09:01 krivit

Added some namespace fixes. @tilltnet , @mbojan , mind if I merge this into master?

Also, a question: why did we put in the as_igraph() and as_network() generics, as opposed to just implementing the as.igraph() and as.network() methods?

krivit avatar Feb 01 '21 22:02 krivit

Added some namespace fixes. @tilltnet , @mbojan , mind if I merge this into master?

Not at all.

Also, a question: why did we put in the as_igraph() and as_network() generics, as opposed to just implementing the as.igraph() and as.network() methods?

There are "aliases":

  • https://github.com/tilltnet/egor/blob/a07f68c3a3a9881fc914e92155ee5dead278bc57/R/conversions.R#L138
  • https://github.com/tilltnet/egor/blob/a07f68c3a3a9881fc914e92155ee5dead278bc57/R/conversions.R#L264

I'm not exactly sure anymore why there are as_* generics in the first place...

mbojan avatar Feb 02 '21 01:02 mbojan

Let's go ahead with the merge for now. I'm not sure about the removal of igraph though. @krivit does igraph on its own have a big impact on your testing process?

Also, a question: why did we put in the as_igraph() and as_network() generics, as opposed to just implementing the as.igraph() and as.network() methods?

At the time I mainly did it this way, as it makes the documentation more easy to find. The documentation for as.igraph.egor() would basically be impossible to find for most casual R users, as they only call as.igraph() and most certainly don't know about the existence of as.igraph.egor(). But this is a general issue I see with methods which deviate from the arguments of the generic.

tilltnet avatar Feb 02 '21 14:02 tilltnet

@tilltnet , merged.

igraph does have an impact, since it has some non-standard build dependencies and build system.

Regarding the as_igraph and as_network, why not just alias as.igraph and as.network in the help files for the methods? That way, a user typing ?as.igraph will get a choice of igraph's help and egor's help.

krivit avatar Feb 02 '21 21:02 krivit

Anyway, I've filed the ticket against tidygraph, though I don't know when the maintainer will address it.

krivit avatar Feb 03 '21 06:02 krivit

In case tidygraph maintainers do not prove amenable, it seems feasible to make it Suggest-ed as well (since functions internal to egor can call activate.egor() directly).

I've prototyped it in a branch (suggest_tidygraph). No functionality is lost, but to use activate(), the end-user would need to run library(tidygraph) first.

Any thoughts?

krivit avatar Feb 08 '21 09:02 krivit

I really hope that the tidygraph maintainers are willing to donate the activate() generic to generics, otherwise I'd probably prefer to split egor up into two packages. The activate() strategy for data manipulation is an essential part of egor and I expect it to be preferred by most new users. It should be available without loading any additional packages.

tilltnet avatar Feb 08 '21 23:02 tilltnet

I am not particularly happy with not reexporting activate() either. I'll give it another week, and then ping the author.

krivit avatar Feb 08 '21 23:02 krivit

The problem, also, is that given how integral activate() is to data manipulation, it ought to go into the class part of the package, not the summaries part of the package.

krivit avatar Feb 08 '21 23:02 krivit

Good news and bad news.

  • Good news: the author of tidygraph is happy to contribute activate() to generics.
  • Bad news: the author of generics isn't interested in taking it.

I wonder if we should fork generics to create, e.g., graph.generics or, actually, create a package more.generics, which does what generics does but has a more liberal policy towards what generics get included.

krivit avatar Feb 09 '21 23:02 krivit

I saw that generics mostly consists of tidyverse generics. As far as I know tidygraph is not an official tidyverse package, so it makes sense, that they won't take it.

I do like the idea of a very liberal generics package a lot! libre.generics would be one suggestion for a name.

tilltnet avatar Feb 09 '21 23:02 tilltnet

* Bad news: the author of `generics` isn't interested in taking it.

Wow... Just wow... I thought that (quoting the README) "generics is designed to help package authors reduce dependencies by providing a set of generic methods that can be imported". Apparently not.

I think let's just define activate() as a generic within egor itself with the same signature as it is in tidygraph, i.e.:

tidygraph::activate
function (.data, what) 
{
    UseMethod("activate")
}

Nevermind a benign conflict with a package that is not needed otherwise.

mbojan avatar Feb 10 '21 12:02 mbojan

@mbojan , it doesn't work that way. See https://github.com/lme4/lme4/issues/566#issuecomment-658431987 .

But, yes, it does seem inconsistent with the mission statement of the package.

krivit avatar Feb 10 '21 20:02 krivit

@mbojan , it doesn't work that way. See lme4/lme4#566 (comment) .

Right... Although here the conflict will be in generics not in methods so I think not really identical to that simulate.formula problem. The dispatch gets distorted anyway though.

How about these steps:

  1. Suggest tidygraph
  2. Implement and export egor::activate() generic with as it is in tidygraph: activate <- function(.data, what) UseMethod("activate")
  3. Implement and register egor::activate.default() calling tidygraph::activate()like this:
activate.default <- function(.data, what) {
  if(requireNamespace("tidygraph")) {
    w <- rlang::enquo(what)
    tidygraph::activate(.data, what = !!w)
  }
}

Thanks to that (I think):

  • activate() can be called without tidygraph installed
  • if tidygraph is loaded, all methods should be found whichever package, egor or tidygraph is loaded first.

did not test though...

mbojan avatar Feb 10 '21 22:02 mbojan

It looks very feasible. Question: Will it still work if tidygraph gets libraryed after egor?

krivit avatar Feb 10 '21 23:02 krivit

I think what @mbojan suggests could be a temporary fix, if it works well enough. If this is urgent to you @krivit, it would be the quickest fix I think.

For mid- to long-term: I am going to start a package called tbllst, that makes the activate() logic available for any list of tibbles. If we can convince Thomas from tidygraph to make his package get the activate() generic from that package, there won't be any namespace issues. I imagine he'd be open to do so, since he mentioned that he planned for activate() to get a more general purpose.

I created a repository here and will fill it with notes and some code in the next few days. Let me know if you want to join in.

tilltnet avatar Feb 11 '21 00:02 tilltnet

It looks very feasible. Question: Will it still work if tidygraph gets libraryed after egor?

I think so... I'll experiment on/off the suggest_tidygraph branch.

Yes. It's sort of hacky, e.g. I think there is a risk of getting caught in an infinite dispatch loop... Still, if one knows what one's doing...

mbojan avatar Feb 11 '21 01:02 mbojan

I think what @mbojan suggests could be a temporary fix, if it works well enough. If this is urgent to you @krivit, it would be the quickest fix I think.

For mid- to long-term: I am going to start a package called tbllst, that makes the activate() logic available for any list of tibbles. If we can convince Thomas from tidygraph to make his package get the activate() generic from that package, there won't be any namespace issues. I imagine he'd be open to do so, since he mentioned that he planned for activate() to get a more general purpose.

I created a repository here and will fill it with notes and some code in the next few days. Let me know if you want to join in.

That sounds good, if only we can get Thomas on board.

BTW do we really need to Depends on dplyr and tibble? Even dplyr does not Depends on tibble.

mbojan avatar Feb 11 '21 01:02 mbojan

BTW do we really need to Depends on dplyr and tibble? Even dplyr does not Depends on tibble.

Indeed. I think that the package should Suggests both tibble and dplyr and export methods conditionally. (It turns out to be straightforward: take a look at how egor handles igraph and network at the moment.)

Generally, almost everyone has Tidyverse installed these days, so I am not too concerned about tibble, but it seems like the right thing to do. Something as enormous as igraph should certainly be avoided unless absolutely necessary.

@tilltnet , it seems like we can recycle a lot of the egor code for this new package.

krivit avatar Feb 11 '21 03:02 krivit

@tilltnet @krivit try-out the default_method_trick branch with the fix.

mbojan avatar Feb 11 '21 08:02 mbojan

@mbojan , coming back to this, it seems to work, and I've folded it into the suggest_tidygraph branch, but I get the following R check warning:

* checking S3 generic/method consistency ... WARNING
Warning: declared S3 method 'activate.egor' not found
See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.

krivit avatar May 16 '21 10:05 krivit

It looks very feasible. Question: Will it still work if tidygraph gets libraryed after egor?

This ^^^ is to blame, so your suspicions were correct @krivit . In the vignettes we have library(tidygraph) so tidygraph::activate() gets loaded and masks egor::activate(), and tidygraph::activate() is not finding egor::activate.egor() because of, as far as I can tell, how S3 dispatch works.

I'm not sure at this moment if there is a workaround to the workaround I proposed earlier.

mbojan avatar May 18 '21 01:05 mbojan

If the tidygraph folks are willing to coordinate, they could put mirroring code into their package and dispatch to us when appropriate.

A more general solution that doesn't require everybody to coordinate with everybody who wants a particular generic (because n(n-1)/2) would be for everybody to implement a GEN.default() which uses findFunction() to locate other generics with the same name and try one of those (with some provisions for not getting into an infinite loop).

This functionality could even be implemented in a package, i.e., polite.generics, though at that point, it might be easier to just create a more.generics package and get everyone to use it.

krivit avatar May 19 '21 04:05 krivit