csvcut/csvgrep: Allow for numeric column names and column names containing hyphen (`--column-names`, `--column-indexes`)
I have data with column names like
pre-k,k,01,02,03,04,05
where the numbers refer to grades. I can't select the 3rd grade column using csvcut:
csvcut -c "k,03" returns the kindergarten column and the third column (01) instead of the 03 column, because 03 is read as a column index rather than a name.
@onyxfish would you accept a PR that changed the behavior so that if the number were quoted, it would be interpreted as a string?
Can you give an example of what the command would look like?
I'd like to have a solution for this. (I've seen plenty of spreadsheets with year columns.) However, I don't want to rely on quote characters on the command line, as they have a particular meaning that may vary from platform to platform.
echo "pre-k,k,01,02,03,04,05" | csvcut -c "k,03"
Expected:
k,03
Actual:
k,01
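The mismatch can be reproduced with a small sketch. This is a simplified illustration, not csvkit's actual code: the function name `resolve_ambiguous` is hypothetical, and it assumes each token is tried as a 1-based integer index before it is tried as a column name.

```python
def resolve_ambiguous(spec, header):
    """Resolve a comma-separated column spec (hypothetical helper).

    Assumption: each token is tried as a 1-based index first and is
    treated as a column name only if it is not an integer.
    """
    selected = []
    for token in spec.split(","):
        try:
            # "03" parses as the integer 3, so it selects the third column.
            selected.append(header[int(token) - 1])
        except ValueError:
            # Non-numeric tokens such as "k" fall back to a name lookup.
            selected.append(header[header.index(token)])
    return selected


header = ["pre-k", "k", "01", "02", "03", "04", "05"]
print(",".join(resolve_ambiguous("k,03", header)))  # prints "k,01"
```

Under that assumption there is no way to reach the column named 03 by name, since the token never gets as far as the name lookup.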
What if we added explicit --column-names and --column-indices options to all tools (cut, grep, join, sort, stat) that accept --columns (-c) to give users an opportunity to be unambiguous?
@jpmckinney, what is the status of the proposed --column-names/--column-indices options? This would solve another ambiguity: csvcut cannot handle column names that contain dashes, as in "age-mean", because it thinks the argument is a range (even for quoted column names). The error is the following:
Invalid range %s. Ranges must be two integers separated by a - or : character.
With --column-names there would be no ambiguity because csvcut would not even try to parse ranges.
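A minimal sketch of how the two proposed options might be kept separate, assuming they are mutually exclusive. Nothing here exists in csvkit: `build_parser` and `select_columns` are hypothetical names, the option spellings are taken from the proposal above, and splitting on commas is a simplification that ignores column names containing commas.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing the two proposed, unambiguous options."""
    parser = argparse.ArgumentParser(prog="cut-sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--column-names", help="comma-separated column names")
    group.add_argument("--column-indices", help="comma-separated 1-based indices")
    return parser


def select_columns(header, names=None, indices=None):
    """Return 0-based positions for the requested columns (hypothetical helper)."""
    if names is not None:
        # Names are matched literally: no integer or range parsing is attempted,
        # so "03" and "age-mean" are both treated as plain column names.
        wanted = names.split(",")
        missing = [n for n in wanted if n not in header]
        if missing:
            raise ValueError("Unknown column name(s): %s" % ", ".join(missing))
        return [header.index(n) for n in wanted]
    positions = []
    for token in indices.split(","):
        index = int(token)  # raises ValueError for non-integer tokens
        if not 1 <= index <= len(header):
            raise ValueError("Column index %d is out of range" % index)
        positions.append(index - 1)
    return positions


header = ["pre-k", "k", "01", "02", "03", "04", "05"]
args = build_parser().parse_args(["--column-names", "k,03"])
print(select_columns(header, names=args.column_names, indices=args.column_indices))
# prints [1, 4]
```

Range syntax is left out of the sketch; it could still be supported under the indices option without affecting names.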
I've increased the issue's priority, so that I notice it next time I do a round of maintenance – but I can't make any promises about how soon I'll implement it. I accept pull requests, though!