Auto generate document describing workflow from remake script.

Open dfalster opened this issue 10 years ago • 7 comments

Here's an idea: let's say you have a maker script and you want to write a document describing your analysis. Wouldn't it be nice if we could somehow generate that directly from the maker script?

It might include

  • names and type of objects created (vector, data.frame, list, file)
  • names of functions called
  • descriptions of either objects or functions, taken from the roxygen description of the function, or by parsing any comments preceding the function
  • characteristics of standard object types, e.g. for data.frame report dimensions, names of columns, for lists report length.
  • dependency diagrams

The benefit of this approach would be that your documentation goes right alongside your code, in the form of comments in the maker file or sourced R files. Being in the same file as the code, it's more likely to stay current.
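
As a rough illustration of the list above, here is a minimal sketch in Python rather than R, not tied to any actual remake/maker API. The `targets` dictionary, the `description` field, and the `describe` helper are all hypothetical; a real implementation would read the script itself and pull descriptions from roxygen blocks or comments.

```python
import re

# Hypothetical targets, loosely shaped like the targets section of a remake
# file. The "description" field is an assumption for illustration only.
targets = {
    "data.csv": {"command": "download_data(target_name)",
                 "description": "Raw survey data."},
    "processed": {"command": "process_data('data.csv')",
                  "description": "Cleaned data."},
    "plot.pdf": {"command": "myplot(processed)"},
}

def describe(targets):
    """Return one summary line per target: name, kind, function, description."""
    lines = []
    for name, spec in targets.items():
        # Pull the function name off the front of the command string.
        match = re.match(r"\s*([\w.]+)\s*\(", spec.get("command", ""))
        func = match.group(1) + "()" if match else "(no command)"
        # Crude guess: a name with an extension is a file, otherwise an object.
        kind = "file" if re.search(r"\.\w+$", name) else "object"
        text = spec.get("description", "(no description)")
        lines.append(f"- {name} ({kind}), built by {func}: {text}")
    return lines

print("\n".join(describe(targets)))
```

Runtime-only details such as data.frame dimensions or list lengths are deliberately left out, since they cannot be read from the script alone.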

Happy to brainstorm some more about this if you think it's worth pursuing.

dfalster, Feb 05 '15 23:02

This is a great idea. I actually thought we had something like this in the issue list already.

I'd be most interested in helping people capture the why (why is a particular target interesting). That could be output as tooltips on a graph, comments in the exported script, etc.

It's probably worth thinking about what is knowable by remake (no types for example) and what is knowable only at runtime (dimensions, etc).

richfitz, Feb 08 '15 10:02

This would be really cool. Especially a dependency diagram. A practical advantage would be some kind of function that allowed you to specify what you were thinking of changing, and it would then emphasize the paths/ targets that would have to be updated if the change were made.
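
To make the "what would have to be updated" idea concrete, here is a minimal sketch in Python, independent of remake itself. It assumes the workflow is available as a mapping from each target to the targets it depends on; the `depends_on` data and the `affected_by` function are hypothetical names for illustration.

```python
from collections import deque

# Hypothetical workflow: each target maps to the targets it depends on.
depends_on = {
    "raw.csv": [],
    "clean": ["raw.csv"],
    "model": ["clean"],
    "figure.pdf": ["model"],
    "summary": ["clean"],
}

def affected_by(changed, depends_on):
    """Return every target that would need rebuilding if `changed` changed."""
    # Invert the graph so each node lists the targets that depend on it.
    dependents = {target: [] for target in depends_on}
    for target, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(target)

    # Breadth-first walk over everything downstream of the changed node.
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)

print(affected_by("clean", depends_on))  # ['figure.pdf', 'model', 'summary']
```

The returned set is what a diagram could emphasize; the changed target itself is excluded from the result.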

rBatt, Nov 08 '15 21:11

Dependency diagram is available with remake::diagram at the moment, but there's plenty of scope for making that better.

richfitz, Nov 08 '15 22:11

Sorry, just started using the package tonight. And I just saw diagram() ... I really like where it's going.

I know you probably have a lot of ideas of how it could be improved, but it would help me to see a dependency diagram that included the rules. I have lots of functions that depend on functions ... etc.

Great stuff though!

rBatt, Nov 09 '15 00:11

At the moment I think a lot of the backend stuff is working well (or at least it works for my own projects) but the UI needs heaps of work. So ideas from actual users are more than welcome.

I'll definitely try and get to this next time I do a batch of work on remake (hopefully before the end of the year). For now, if you hover over the arrow tip you'll see the function that is being depended on. But more information could be squeezed in there especially if someone is more familiar with DiagrammeR.

I like the "what would change" idea.

richfitz, Nov 09 '15 12:11

I am brand new to DiagrammeR, but maybe I'll play and learn a bit more.

I'm building (have built) a data package for my lab group --- a compilation of a massive amount of biological survey data. Previously everyone in the group had their own scripts for reading/ cleaning the data, and as a result everyone had slightly different data sets (and not everyone used the whole thing).

My goal with the package was to create a cohesive, central, and well-documented workflow. In my mind, remake is the best way to be sure that I consistently keep all of the pieces up to date. From their end of things, being able to see "what depends on what" could be really important for them to trust/ learn to rely on the data package as a tool for their research.

So in that sense, remake is more than just a convenience of making sure things stay together and up-to-date; the act of having to describe dependencies in the .yaml, combined with the ability to diagram(), means that I can easily communicate the workflow, and users can see what code is resulting in the piece of product they're interested in.

So like I said, super cool work. Hopefully that dual use (practically it keeps everything up-to-date; but those diagrams are so great for communicating to a team) helps add perspective as to how I'm finding this package to be so promising and useful!

rBatt, Nov 09 '15 13:11

Also, I just looked and I see what you mean by hovering over the arrows from diagram(); however, that only shows the Rule script. It'd be amazing to see the functions called by that Rule function, and on down, so that anyone could trace what code was required to recreate a given node on that graph.

rBatt, Nov 09 '15 14:11