clojisr stack

Open awb99 opened this issue 4 years ago • 4 comments

I have reflected a bit on the "clojisr stack"

So you have the following layers:

  • engine layer (Renjin or Rserve)
  • r layer (executes R expressions in the R language and returns RObjects, including RObject-to-clj conversion); depends on the engine layer
  • codegen layer (generates clj functions); depends on the r layer

Ideally, users want to work only with the codegen layer, so that their code is as Clojure-like as possible. However, there might be scenarios where you want to operate on the r-interop layer, say when you want to reuse existing R scripts.

I think that therefore perhaps you want to expose to the user the following namespaces:

r.engine  ; r.engine.rserv r.engine.renjin
r.clj      ; RObject r->clj  clj->r
r.core    ; codegen
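
A minimal sketch of how these three layers could stack, assuming the proposed names. None of these namespaces or functions exist in clojisr in this form, and the "engine" here is a stand-in that just records the R source instead of talking to Renjin or Rserve:

```clojure
;; Hypothetical sketch only: r.engine, r.clj and r.core are the names
;; proposed above, not existing clojisr namespaces.

(ns r.engine)

(defn eval-string
  "Stand-in for a real backend: returns a fake R object holding the source."
  [code]
  {:r-code code})

(ns r.clj)

(defn r
  "R layer: run R source through the engine layer, get an R object back."
  [code]
  (r.engine/eval-string code))

(defn r->clj
  "R layer: convert the fake R object to Clojure data (here, its source)."
  [r-object]
  (:r-code r-object))

(ns r.core
  (:require [clojure.string :as string]))

(defn r-fn
  "Codegen layer: build a Clojure function that calls the named R function."
  [fn-name]
  (fn [& args]
    (r.clj/r (str fn-name "(" (string/join ", " args) ")"))))
```

Each namespace only refers to the one directly below it, so a user calling `((r.core/r-fn "sum") 1 2)` never has to touch `r.engine`.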

I think the name clojisr is great, but since clojisr will be used in notebooks and for data science, the namespaces should be short. It is pretty clear that an r namespace in clj involves some interop, so making the namespace longer does not add value. clojisr should stay as the name of the project on GitHub and on Clojars; I just think that the namespaces should be r. The r namespace question is the minor point.

The major point is that users should understand which level of the stack they are operating at.

Perhaps to you it is already obvious; but to me as a user it currently is not.

awb99, Jun 06 '20 20:06

This separation might also make sense for development: you will not accidentally add a dependency from a lower element in the stack on a higher one. Say the r.engine namespace cannot refer to the r.core namespace, because r.core depends on r.engine.
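
As an illustration of that rule, here is a small sketch that checks a declared dependency map against a layer ranking. The namespace names are the proposed ones, and the dependency data is made up for the example. (Clojure already rejects cyclic requires at load time; this check is stricter because it flags any dependency that points upward in the stack.)

```clojure
;; Hypothetical layer ranking: a lower number means lower in the stack.
(def layer-rank
  {'r.engine 0, 'r.clj 1, 'r.core 2})

;; Made-up dependency data, for illustration only.
(def declared-deps
  {'r.engine #{}
   'r.clj    #{'r.engine}
   'r.core   #{'r.clj 'r.engine}})

(defn upward-deps
  "Returns [from to] pairs where `from` depends on a namespace at the
  same or a higher layer."
  [ranks deps]
  (for [[from tos] deps
        to tos
        :when (>= (ranks to) (ranks from))]
    [from to]))
```

With the data above, `(upward-deps layer-rank declared-deps)` is empty; adding `'r.core` to the dependencies of `'r.engine` would make it report that pair.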

awb99, Jun 06 '20 20:06

@awb99 these are nice thoughts. Clear separation and nice namespaces are important.

Some concerns:

  • we need to think about whether this is the separation we want (things might change a bit during the current refactoring)
  • it would break existing user code
  • it overlaps with the behaviour of require-r, which creates r.* namespaces too

daslu, Jun 08 '20 10:06

I believe that we can expose only one namespace as the entry point for users, currently clojisr.v1.r, and move the require-r function there. The remaining namespaces are implementation layers and shouldn't be used directly by a user.

Regarding changing the namespace - I have no problem with longer namespaces.

We also have a number of primitive functions exposed in clojisr.v1.r, like r+, r** etc. Maybe we can expose them as just +, ** without the r prefix?

genmeblog, Jun 08 '20 11:06
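
A rough sketch of what dropping the prefix could look like, assuming a namespace that excludes the clashing clojure.core name. The namespace `sketch.ops` and its `+` are hypothetical stand-ins (the function only builds an R expression string), not clojisr code:

```clojure
;; Hypothetical namespace: excluding clojure.core/+ lets us define our own +
;; without shadowing the core function by accident.
(ns sketch.ops
  (:refer-clojure :exclude [+]))

(defn +
  "Stand-in for an R-backed addition: returns the R expression as a string."
  [a b]
  (str a " + " b))
```

A user would then require the namespace with an alias, for example `(require '[sketch.ops :as r])`, and write `(r/+ 1 2)`, so the alias takes over the role of the `r` prefix.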

I would remove the version number from the namespace. I have not seen a practice like that anywhere. If there is a breaking version change, then it must be documented somewhere, but version-dependent namespaces: no!

Long namespaces: I would go as short as possible, "r" or "clojisr". Even "clojisr.r" is already long. For normal Clojure libraries I agree 100% that namespaces can be long. That is code that you write and unit test well, and it is not worth the risk of namespace conflicts. But in the notebook or REPL use case, I don't want to type so much every time I write a notebook. And more importantly, I don't want to see it all the time while I am working in a notebook.

One thing that I will use in my notebooks is (require-r pinkgorilla.r/default-config). I am not sure whether this is currently supported. What I noticed is that every notebook with clojisr code has half a page of overhead in terms of require-r (and other) calls. Having one or two default require lists would be sensible.

awb99, Jun 09 '20 19:06
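
One way such a default list could work, sketched under assumptions: `default-config` and `require-defaults` are hypothetical names, and the requiring function is passed in as an argument so the snippet runs without R. With clojisr it would presumably be `require-r`, assuming it accepts several libspecs as arguments.

```clojure
;; Hypothetical default list of R packages, written as require-r style libspecs.
(def default-config
  '[[base :as base]
    [stats :as stats]
    [graphics :as g]])

(defn require-defaults
  "Applies `require-fn` to every libspec in `libspecs` in a single call.
  `require-fn` is an assumption: any function that accepts libspecs as
  separate arguments."
  [require-fn libspecs]
  (apply require-fn libspecs))
```

A notebook would then start with a single line such as `(require-defaults require-r default-config)` instead of repeating the individual require-r calls.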