motoko-base icon indicating copy to clipboard operation
motoko-base copied to clipboard

Feat: Binary relation collections (from CanCan).

Open matthewhammer opened this issue 4 years ago • 14 comments

Binary relation representation for base library, as:

  • Module with purely-functional representation (Rel)
  • Module with OO wrapper and useful idioms for a canister state's relations (RelObj), as in CanCan.

See comments in committed code for more details.

Before merging:

  • [ ] some unit tests for each new module
  • [ ] complete doc for each module.

Opening this PR now so that interested parties can see/use it while it's WIP. cc @gobengo

matthewhammer avatar Feb 10 '21 18:02 matthewhammer

What do you think about including example usage of these or, better, a test? (I don't know if it's even possible to add a test in motoko)

gobengo avatar Feb 11 '21 05:02 gobengo

Might be worth putting on our weekly meeting agenda…pingng @stanleygjones on this.

nomeata avatar Feb 12 '21 08:02 nomeata

I'd expect this kind of a data structure to live in a library instead so I can pull it in when I need it.

Fair points, @kritzcreek. Parts of me agree with you.

If you made this into a package, what would you call this library (of two modules)?

What would the intended scope of this library be, precisely? These two modules, only?

While parts of me agree, parts of me also disagree, especially as I try to answer such questions.

Here are my arguments for inclusion in base--- Let's discuss in a meeting, as @nomeata suggests.

I needed this structure for CanCan, which is a tiny, simple app. I also needed them for the Produce Exchange. So, in 2/2 cases, I used these structures (or similar ones) in the same way. That seems to argue in favor of inclusion into base.

Perhaps we can separate the issue of "binary relations in base" versus these particular modules in base?

I admit that these modules are not strictly necessary to implement these apps; they are required for me to implement them. I like to speak about problems in terms of their mathematical structure, and most apps will have some domain-specific relational structure, whether its modeled explicitly as a collection of binary relations using this API or not.

But if we want to represent binary relations (I argue that these belong in base, in some form), why not choose this implementation to start?

matthewhammer avatar Feb 12 '21 19:02 matthewhammer

@gobengo asks

What do you think about including example usage of these or, better, a test? (I don't know if it's even possible to add a test in motoko)

See the PR's unchecked to dos: One of them is (and has been) "some unit tests for each new module". That's our standard for each module in base, at this point.

As far as example usage, the other unchecked to do item above is "docs...".

matthewhammer avatar Feb 12 '21 19:02 matthewhammer

If you made this into a package, what would you call this library (of two modules)? What would the intended scope of this library be, precisely? These two modules, only? While parts of me agree, parts of me also disagree, especially as I try to answer such questions.

I'll prefix this by saying that not knowing a good library name for a piece of code is not an argument for its inclusion into base.

Now to be a little more constructive 😄

If you made this into a package, what would you call this library (of two modules)?

The rest of the world seems to call it a bimap:

  • https://hackage.haskell.org/package/bimap-0.4.0
  • https://docs.rs/bimap/0.6.0/bimap/
  • https://www.npmjs.com/package/bimap
  • https://guava.dev/releases/19.0/api/docs/com/google/common/collect/BiMap.html
  • https://www.boost.org/doc/libs/1_75_0/libs/bimap/doc/html/index.html

What would the intended scope of this library be, precisely? These two modules, only?

The Haskell, Rust, and JS libraries just contain the bimap data structure, the Java and C++ libraries are both examples of giant stdlib extensions.

I needed this structure for CanCan, which is a tiny, simple app. I also needed them for the Produce Exchange. So, in 2/2 cases, I used these structures (or similar ones) in the same way. That seems to argue in favor of inclusion into base.

I think the reason we need these relational structures more often for canisters than other ecosystems is to make good on our "no database" promise. Maybe there's a relational-algebra package waiting to be written that would include this bimap?

kritzcreek avatar Feb 12 '21 22:02 kritzcreek

The rest of the world seems to call it a bimap.

No.

A bidirectional map is not the same thing as a binary relation, though every bidirectional map is a (restricted kind of) binary relation.

Critically, the interesting relations on social graphs (follows, followers, etc) are not bimaps.

Your links are about the former. This PR is about the latter.

matthewhammer avatar Feb 12 '21 23:02 matthewhammer

Just to be clear: "bimaps" are for creating efficient bidirectional associations for the (very particular) case where there is a bijection between data in two datasets.

This PR is about a distinct case where directional (and thus bidirectional) associations are not unique. I can have several followers. I can follow several people.

This PR is about representing binary relations, generally (not invertable functions, specifically).

matthewhammer avatar Feb 12 '21 23:02 matthewhammer

Critically, the interesting relations on social graphs (follows, followers, etc) are not bimaps. Your links are about the former. This PR is about the latter.

Interesting! So this is even closer to a typical "n to m" relation in database schema design than I had thought.

Allright let's talk about it on Tuesday. I'm even more convinced now that this might be the start of a relational-algebra package now.

kritzcreek avatar Feb 12 '21 23:02 kritzcreek

You say "relational algebra" (which makes perfect sense), and I anticipate hearing "graph queries" and GraphQL once we start unpacking the problem.

I guess the value of Rel here is that it's just very modest. It's just for binary relations (directed graphs) without any edge data.

I agree that as we consider graphs (or "hyper graphs", for non-binary relations) and edge data, this all starts to become something like a graph database pretty quickly.

matthewhammer avatar Feb 12 '21 23:02 matthewhammer

Once we start to think about algebra, we can talk about differential operators, the change of the graph, the high order change of the graph, the incremental/reactive graph database :)

chenyan-dfinity avatar Feb 13 '21 06:02 chenyan-dfinity

Here's but one datapoint in the graphql tech space: https://hasura.io/ and their open source project: https://hasura.io/opensource/

matthewhammer avatar Feb 13 '21 14:02 matthewhammer

From https://graphql.org/

Describe what’s possible with a type system

Yeah, I think the GraphQL folks and I may have different definitions of that phrase means, and in particular, what actually qualifies (mathematically) as a "type system". But I appreciate the attempt at PL theory lip service, nevertheless.

matthewhammer avatar Feb 13 '21 14:02 matthewhammer

and I anticipate hearing "graph queries" and GraphQL once we start unpacking the problem.

Certainly not from me :D But even if we were to look at GraphQL, you'll see that Hasura implements it by compiling it to a bunch of SQL, so we'd probably want to start with that either way.

the incremental/reactive graph database :)

Graph databases have the nichest of use-cases in the real world (at least for web-apps) and are overused by people that are bored of Postgres working too well for them. GraphQL (despite its name) has nothing to do with graph databases... So I'd say a graph database project sounds interesting, but not related to this PR.

kritzcreek avatar Feb 15 '21 09:02 kritzcreek

Graph databases have the nichest of use-cases in the real world (at least for web-apps) and are overused by people that are bored of Postgres working too well for them. GraphQL (despite its name) has nothing to do with graph databases... So I'd say a graph database project sounds interesting, but not related to this PR.

Allright that was way too strongly worded... my apologies (it was a knee-jerk reaction after having to salvage people's data from MongoDBs in a previous job).

I think a Graph Database is really interesting, but I'd like us to start of by exploring a "simple" version of tables and Sql queries. Now I understand I can't just allocate your time however I want, so I started an experiment of my own. This way we can evaluate our approaches against different use-cases and maybe improve both of them :)

I'm starting to realize that maybe your approach is the more Motoko-ish way of going about it, because I constantly find myself wanting to access raw memory and casting data around :D

kritzcreek avatar Feb 16 '21 11:02 kritzcreek