SPEC: join function should be simpler
From ifql created by jsternberg : influxdata/ifql#353
The join function should be substantially simpler than it currently is.
Based on my use of the join function in the transpiler work, I find that join contains too much functionality and is very verbose for the most common use case, because it seems to believe it is a map function too. As an example, this is something I find strange:
val1 = from(...) |> ...
val2 = from(...) |> ...
join(tables: {val1: val1, val2: val2}, on: ..., fn: {val1: tables.val1, val2: tables.val2})
As can be seen, the function is just a copy of what I've already done when joining the tables.
Now, I'm not sure of the best way to do this, because the function is sometimes useful. If you have an already joined stream, the function is needed to join the tables correctly with whatever names you want. But the most common use case is probably the one above, and it's a lot of typing for what should be obviously inferred. It seems to me that join and map should be separate functions so that you could instead do something like this:
val1 = from(...) |> ...
val2 = from(...) |> ...
val3 = join(tables: {val1: val1, val2: val2}, on: ...)
val4 = from(...) |> ...
join(tables: {val3: val3, val4: val4}) |> map(fn: (r) => {val1: r.val3.val1, val2: r.val3.val2, val4: r.val4})
This way, you still keep the existing mechanics and the optimizer can always combine them for efficiency, but the join command itself is optimized for joining tables and not responsible for also mapping them.
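To make the proposed split concrete, here is a minimal sketch in Python rather than Flux, since the simplified join does not exist yet. It models tables as lists of dicts and assumes an inner join that keeps the key columns at the top level and nests each source row under its table name. The names `join`, `map_rows`, and the `_time`/`_value` columns are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def join(tables, on):
    """Inner-join named tables on the `on` columns (hypothetical model).

    Each output row holds the key columns plus one nested record per
    input table, stored under that table's name. No mapping is done here.
    """
    names = list(tables)
    # Index every table by its join key.
    indexes = {}
    for name, rows in tables.items():
        index = defaultdict(list)
        for row in rows:
            index[tuple(row[col] for col in on)].append(row)
        indexes[name] = index

    first, rest = names[0], names[1:]
    out = []
    for key, first_rows in indexes[first].items():
        # Inner join: skip keys that are missing from any other table.
        if not all(key in indexes[name] for name in rest):
            continue
        combos = [{first: row} for row in first_rows]
        for name in rest:
            combos = [dict(c, **{name: row}) for c in combos for row in indexes[name][key]]
        for combo in combos:
            out.append({**dict(zip(on, key)), **combo})
    return out


def map_rows(rows, fn):
    """Reshape rows; kept separate so join has a single job."""
    return [fn(r) for r in rows]


val1 = [{"_time": 1, "_value": 10}, {"_time": 2, "_value": 20}]
val2 = [{"_time": 1, "_value": 0.5}, {"_time": 2, "_value": 0.7}]
val4 = [{"_time": 1, "_value": "ok"}]

# Common case: no mapping function needed, names come from `tables`.
val3 = join({"val1": val1, "val2": val2}, on=["_time"])

# Joining an already joined stream: flatten the names with a separate map.
flat = map_rows(
    join({"val3": val3, "val4": val4}, on=["_time"]),
    lambda r: {"val1": r["val3"]["val1"], "val2": r["val3"]["val2"], "val4": r["val4"]},
)
```

In this model the first join needs no function at all, and only the nested case pays for an explicit map, which is the trade-off the proposal is after.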