miller
miller copied to clipboard
[feature request] a "transpose" verb
Hello @johnkerl
One feature I would like to see in miller in the future is a way to easily transpose rows and columns, much like this https://bioinf.shenwei.me/csvtk/usage/#transpose
Does it make sense to you?
@trantor yes it makes sense ... mostly.
One key thing about Miller is that it indexes by unique names so
a,b,c
1,2,3
4,5,6
4,8,9
would transpose to
a 1 4 4
b 2 5 8
c 3 6 9
which has duplicate column 4 which would become
a,1,4
b,2,8
c,3,9
So there are a couple options I can see:
- Do just this, from CSV to CSV, with the caveat that any duplicate values in the first column of the input will result in dropped columns in the output;
- Read CSV and emit NIDX format -- this wouldn't work quite right unless
mlr --icsv --onidx --ofs commawas used
@trantor here is a DSL proof of concept:
$ cat example.csv
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
red,square,false,4,48,77.5542,7.4670
purple,triangle,false,5,51,81.2290,8.5910
red,square,false,6,64,77.1991,9.5310
purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
$ cat xpose.mlr
@inrecs[NR] = $*;
@inr = NR;
@inf = NF;
end {
onr = @inf;
onf = @inr;
for (oir = 1; oir <= onr; oir += 1) {
outrec = {};
for (oif = 1; oif <= onf; oif += 1) {
outrec[oif] = @inrecs[oif][[[oir]]];
}
emit outrec;
}
}
# a,b,c
# 1,2,3
# 1,5,6
# 1,1
# 2,5
# 3,6
$ mlr --icsv --onidx --ofs , put -q -f xpose.mlr example.csv
yellow,red,red,red,purple,red,purple,yellow,yellow,purple
triangle,square,circle,square,triangle,square,triangle,circle,circle,square
true,true,true,false,false,false,false,true,true,false
1,2,3,4,5,6,7,8,9,10
11,15,16,48,51,64,65,73,87,91
43.6498,79.2778,13.8103,77.5542,81.2290,77.1991,80.1405,63.9785,63.5058,72.3735
9.8870,0.0130,2.9010,7.4670,8.5910,9.5310,5.8240,4.2370,8.3350,8.2430
I can easily enough code this up in Go and make it into a
mlr --icsv --onidx --ofs , transpose example.csv
with the same output.
See also https://github.com/johnkerl/miller/issues/321