csvtk
filter2 command is too slow
Compared with the filter command or awk, the filter2 command is much slower, especially for rules with multiple conditions.
It might be related to this function in the for-loop, which parses the expression again on every iteration. https://github.com/shenwei356/csvtk/blob/9407f73e2d72dddf5042c7dbb6299a180ea9cf4a/csvtk/cmd/filter2.go#L370-L376
Yes, I noticed that. It is slow :(
Can we move the expression parsing outside the for-loop and run it only once?
It is slow, but it has to be done like that, because filterStr1 is different in each iteration.
Why is filterStr1 different? Can we cache the parsed result?
It's the expression, like '$age > 18'. The $age needs to be replaced with the value from each row.
Yes. I mean, can we parse the expression once into something like '$1>18' and reuse the code of the filter command to do the computation afterward?
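To make the suggestion concrete, here is a rough sketch of the "parse once, evaluate per row" idea. It is not csvtk's code: the names condition, compile and match are hypothetical, and it assumes only a single comparison of the form '$column op number', while filter2 accepts much richer expressions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// condition is a pre-parsed comparison such as "$age > 18".
// Hypothetical type for illustration; not part of csvtk.
type condition struct {
	col   int     // 0-based index of the referenced column
	op    string  // one of ">", "<", ">=", "<=", "==", "!="
	value float64 // numeric right-hand side
}

// compile parses "$column op number" once and resolves the column
// name to an index using the header row.
func compile(expr string, header []string) (condition, error) {
	fields := strings.Fields(expr)
	if len(fields) != 3 || !strings.HasPrefix(fields[0], "$") {
		return condition{}, fmt.Errorf("unsupported expression: %q", expr)
	}
	name := strings.TrimPrefix(fields[0], "$")
	col := -1
	for i, h := range header {
		if h == name {
			col = i
			break
		}
	}
	if col < 0 {
		return condition{}, fmt.Errorf("unknown column: %q", name)
	}
	switch fields[1] {
	case ">", "<", ">=", "<=", "==", "!=":
	default:
		return condition{}, fmt.Errorf("unsupported operator: %q", fields[1])
	}
	v, err := strconv.ParseFloat(fields[2], 64)
	if err != nil {
		return condition{}, fmt.Errorf("invalid number %q: %w", fields[2], err)
	}
	return condition{col: col, op: fields[1], value: v}, nil
}

// match evaluates the pre-parsed condition against one row.
// No expression parsing happens here, only one number conversion.
func (c condition) match(row []string) bool {
	if c.col >= len(row) {
		return false
	}
	x, err := strconv.ParseFloat(row[c.col], 64)
	if err != nil {
		return false // non-numeric cells never match in this sketch
	}
	switch c.op {
	case ">":
		return x > c.value
	case "<":
		return x < c.value
	case ">=":
		return x >= c.value
	case "<=":
		return x <= c.value
	case "==":
		return x == c.value
	case "!=":
		return x != c.value
	}
	return false
}

func main() {
	header := []string{"name", "age"}
	rows := [][]string{{"alice", "17"}, {"bob", "20"}, {"carol", "n/a"}}

	cond, err := compile("$age > 18", header) // parsed once, before the loop
	if err != nil {
		panic(err)
	}
	for _, row := range rows {
		if cond.match(row) {
			fmt.Println(strings.Join(row, ","))
		}
	}
}
```

The point is only that the column reference is resolved to an index before the loop, so the per-row work is a lookup and a comparison. Whether this can be generalized to everything filter2 supports is a separate question.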
"parsed the expression as something like '$1>18' and reuse the code of the filter command": I don't think so.
God, it's really slow~ I've used it a lot recently. I'll have to improve it when I have time ~