[feature request] n-way join

Open spmundi opened this issue 4 years ago • 3 comments

The join verb, with multiple files on the RHS of the join, does not work as I hoped. E.g. I have three files, a.csv, b.csv, and c.csv:

a.csv
f,fa
a,1

b.csv
f,fb
a,2

c.csv
f,fc
a,3

I want to do "mlr --csv join -j f -f a.csv b.csv c.csv" and get:

f,fa,fb,fc
a,1,2,3

But instead I get

f,fa,fb
a,1,2
f,fa,fc
a,1,3

(Obviously, this is just as useful as what I desire -- no questions. Just not what I want)

Is there some way for me to get my desired output without having to have multiple mlr invocations?

Thanks

spmundi avatar Oct 25 '21 18:10 spmundi

Hi @spmundi, running

mlr --csv join -f b.csv -j f then join -f a.csv -j f c.csv

you will have

f,fa,fb,fc
a,1,2,3

aborruso avatar Oct 25 '21 21:10 aborruso

@spmundi @aborruso -- indeed, b.csv and c.csv (or more files, or just standard input, whatever it may be) form a single concatenated right-hand stream, which is joined with the left-hand file given by mlr join -f. So @aborruso's tip does multiple pairwise joins to synthesize an n-way join.

There isn't, however, currently a single n-way join among n input files.
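
For illustration only, here is a minimal Python sketch (not Miller code) of what such an n-way join would compute, built as a left-to-right chain of pairwise inner joins on the shared field. The helper names n_way_join and read_csv are hypothetical, and the sample data is the a.csv, b.csv, and c.csv content from the original report, inlined as strings.

```python
import csv
import io


def n_way_join(tables, key):
    """Inner-join a list of tables (lists of dicts) on `key`, left to right."""
    result = tables[0]
    for right in tables[1:]:
        # Index the right-hand table by join key for the pairwise lookup.
        index = {}
        for row in right:
            index.setdefault(row[key], []).append(row)
        # Keep only left rows that have a match; merge fields from both sides.
        result = [
            {**left, **match}
            for left in result
            for match in index.get(left[key], [])
        ]
    return result


def read_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Sample data from the report, inlined instead of read from files.
a = read_csv("f,fa\na,1\n")
b = read_csv("f,fb\na,2\n")
c = read_csv("f,fc\na,3\n")

joined = n_way_join([a, b, c], "f")
print(joined)  # [{'f': 'a', 'fa': '1', 'fb': '2', 'fc': '3'}]
```

This mirrors the chained "then join" approach: each step joins the accumulated result with one more input, so n inputs need n-1 pairwise joins.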

johnkerl avatar Oct 26 '21 04:10 johnkerl

Ty for responses

spmundi avatar Oct 26 '21 11:10 spmundi