
Suggestion for new groupBy output format

Open: hvbakel opened this issue 14 years ago • 1 comment

Currently, groupBy offers 'freqdesc' and 'freqasc' as output formats, which report the frequency of each value in the opCol as comma-delimited value:frequency pairs. It would be nice to have a third 'freqtab' format, which would produce a table-like layout with one column per value and one row of counts per group. For example, the following 'freqdesc' output:

    groupA  male:10,female:20
    groupB  female:10

would instead be output as:

            female  male
    groupA  20      10
    groupB  10      0

hvbakel, Sep 15 '11 20:09

Interesting idea. The main challenge here is that in order to make each line have the same number of columns, one must do a full scan of the input to collect all possible values. Let me give this some thought. In the interim, you could use awk to split the grouped column by "," and ":" to emulate this output.

arq5x, Sep 20 '11 12:09
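
As a rough illustration of the workaround suggested above (splitting the grouped column on "," and ":"), here is a minimal Python sketch rather than awk. It is not part of groupBy; the function name `freq_to_table` is hypothetical, and it assumes each input line holds a group name, whitespace, and then the value:frequency pairs with no spaces inside them. It also shows why a full pass over the input is needed: the set of columns is only known once every line has been read.

```python
def freq_to_table(lines):
    """Pivot 'group value:count,value:count' lines into a table (list of rows).

    Hypothetical helper: assumes the group name and the frequency column
    are separated by whitespace, as in the example from this thread.
    """
    rows = []
    values = set()
    # First pass: parse every line and collect all distinct values,
    # since the column set is unknown until the whole input is seen.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group, pairs = line.split(None, 1)
        counts = {}
        for pair in pairs.split(","):
            value, count = pair.rsplit(":", 1)
            counts[value] = int(count)
        rows.append((group, counts))
        values.update(counts)

    # Second pass: emit a header plus one row per group, filling gaps with 0.
    header = sorted(values)
    table = [[""] + header]
    for group, counts in rows:
        table.append([group] + [str(counts.get(v, 0)) for v in header])
    return table


if __name__ == "__main__":
    example = ["groupA\tmale:10,female:20", "groupB\tfemale:10"]
    for row in freq_to_table(example):
        print("\t".join(row))
```

Columns are sorted alphabetically here only to make the output deterministic; any fixed ordering would do.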