funflow
tackle the complicated workflows
Are you interested in developing a DSL on top of funflow to help users create complicated workflows?
Here is a real-world workflow that is hard to specify by hand because it involves many complex patterns. I've written a package that uses Template Haskell to automate this process. See here for a minimal code example: https://github.com/kaizhang/SciFlow/blob/master/examples/wordcount.hs. I think it might be a good idea to use funflow as the engine to reimplement these features. I would like to hear your opinions.
We are definitely interested in this. At the moment we have an intern looking at whether Common Workflow Language might be a good language for a front-end DSL, mostly due to existing workflows and UI systems to build them. This would mostly cover the "external" case, but adds some useful features such as the ability to distribute workflows easily.
We also have a document at https://github.com/tweag/funflow/blob/master/funflow/doc/external-dsl.org which goes through some of our plans in this area.
From looking at your DSL, it seems to be approaching DOT syntax? E.g. you're defining the graph explicitly, without giving bindings to the outputs between steps (approximating point-free style) and not requiring any ordering between the edges. I think it would certainly be plausible to translate this to funflow (the point-free style is basically composition with >>>, so would just require an ordering). I'd be interested in the translation of complex flows in that format, however. I think one of the complex aspects is often when we have flows where the outputs aren't quite matched to the inputs of subsequent steps - this is much easier to negotiate if we can explicitly bind the outputs to manipulate.
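As a rough illustration of the two situations described above, here is a minimal sketch that uses plain Haskell functions and `Control.Arrow` from base as stand-ins for workflow steps. It does not use funflow's own API, and the step names (`countWords`, `countLines`, `report`) are hypothetical.

```haskell
import Control.Arrow ((&&&), (>>>))

-- Hypothetical steps; plain functions stand in for workflow steps here.
countWords :: String -> Int
countWords = length . words

countLines :: String -> Int
countLines = length . lines

-- This step expects a (label, count) pair, which no earlier step produces directly.
report :: (String, Int) -> String
report (label, n) = label ++ ": " ++ show n

-- Point-free chain: each output feeds the next input, so an ordering is all we need.
-- The small lambda in the middle adapts an Int output to the pair that report expects.
wordReport :: String -> String
wordReport = countWords >>> (\n -> ("words", n)) >>> report

-- Outputs that do not line up with the next input: fan out to two steps,
-- then bind both results explicitly and reshape them.
ratio :: String -> Double
ratio = (countWords &&& countLines) >>> \(w, l) -> fromIntegral w / fromIntegral (max 1 l)

main :: IO ()
main = do
  putStrLn (wordReport "a b c")
  print (ratio "a b\nc d\n")
```

The adapter lambdas are where explicit bindings help: a purely graph-based description would need some other way to express that reshaping.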
It's good to have a DSL for the "external" case, but I would like to have a DSL for the internal case as well.
I think both SciFlow and funflow define the graph explicitly. But the approaches are different.
In both funflow and SciFlow, we define "connections", i.e., a computation step and the steps (or inputs) it depends on.
In funflow, this is done by:
node1 = step fun1
node2 = step fun2
node1 >>> node2
In SciFlow, the syntax is similar:
node "Node1_Id" 'fun1
node "Node2_Id" 'fun2
["Node1_Id"] ~> "Node2_Id"
But things become different when we have a large number of such "connections". In funflow, programmers have to assemble these "connections" by hand to build the computational graph (or workflow). In contrast, in SciFlow you don't assemble connections. You only need to define them, and the assembly is done by the program. This has two advantages:
- Defining a complex workflow becomes very easy. One just needs to think about individual steps.
- The assembly is delayed in SciFlow, which means you can export the "builder" (graph specification), and other libraries can reuse it and add connections from it to their own graphs. I don't know how to do this in funflow, because once the individual steps are assembled into a single SimpleFlow a b, the internal steps are no longer visible and cannot be reused.
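To make the idea of delayed assembly concrete, here is a minimal sketch in plain Haskell. It is not SciFlow's or funflow's API: the `Spec` type, `merge`, and `run` are hypothetical names, every node is simplified to work on `Int` values, and it assumes the `containers` package that ships with GHC.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical types for illustration only; not SciFlow's or funflow's API.
-- A node combines the outputs of its parents (in the listed order) into one Int.
data Spec = Spec
  { nodes :: Map String ([Int] -> Int)  -- node id -> computation
  , edges :: Map String [String]        -- node id -> ordered parent ids
  }

-- Specs are plain data, so they can be merged before anything is assembled.
merge :: Spec -> Spec -> Spec
merge (Spec n1 e1) (Spec n2 e2) = Spec (Map.union n1 n2) (Map.union e1 e2)

-- Assembly happens only here. The lazy result map lets each node look up its
-- parents' results, so nodes never have to be ordered by hand.
-- Assumes the graph has no cycles and every parent id is defined.
run :: Spec -> Map String Int
run spec = results
  where
    results = Map.mapWithKey eval (nodes spec)
    eval name f = f [results Map.! p | p <- Map.findWithDefault [] name (edges spec)]

-- A "library" spec, declared in no particular order.
base :: Spec
base = Spec
  (Map.fromList [("sum", sum), ("a", const 2), ("b", const 3)])
  (Map.fromList [("sum", ["a", "b"])])

-- A downstream extension that attaches a new node to an internal node of base.
extension :: Spec
extension = Spec
  (Map.fromList [("double", \xs -> 2 * sum xs)])
  (Map.fromList [("double", ["sum"])])

main :: IO ()
main = print (Map.toList (run (merge base extension)))
```

Because `base` stays a plain value until `run` is called, `extension` can hook into its internal `"sum"` node, which is the kind of reuse that is lost once steps have been composed into a single opaque flow.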
I tried to rewrite my pipelines using funflow, but the two points I mentioned above prevent the migration. If you think this is something that funflow should have, I may be able to help.
PS: the edges have ordering in SciFlow. For example:
fun1 :: () -> Int
fun2 :: () -> String
fun3 :: (Int, String) -> Int
-- This will compile
builder :: Builder ()
node "F1" 'fun1
node "F2" 'fun2
node "F3" 'fun3
["F1", "F2"] ~> "F3"
-- This will NOT compile
builder :: Builder ()
node "F1" 'fun1
node "F2" 'fun2
node "F3" 'fun3
["F2", "F1"] ~> "F3"
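The same ordering constraint can be sketched with plain functions and `Control.Arrow` from base. This is only an analogue of the point above, not SciFlow's Template Haskell mechanism, and the function bodies are made up for illustration since the thread gives only the type signatures.

```haskell
import Control.Arrow ((&&&), (>>>))

-- Bodies are hypothetical; only the types come from the example above.
fun1 :: () -> Int
fun1 () = 42

fun2 :: () -> String
fun2 () = "answer"

fun3 :: (Int, String) -> Int
fun3 (n, s) = n + length s

-- Parents listed as F1, F2: the pair has type (Int, String), matching fun3.
ok :: () -> Int
ok = (fun1 &&& fun2) >>> fun3

-- Parents listed as F2, F1 would produce (String, Int) instead.
-- Uncommenting the definition below is rejected by the type checker:
-- bad :: () -> Int
-- bad = (fun2 &&& fun1) >>> fun3

main :: IO ()
main = print (ok ())
```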