clustershell
clustershell copied to clipboard
RangeSet: nD folding optimization (#485)
Optimize multi-dimensional range set folding algorithm:
- get rid of heuristic when pre-expanding vectors for folding to improve reproducibility
- when merging vectors, perform O(n) "folding" passes as long as changes are detected to reduce the problem size, before attempting a final O(n^2) pass
- avoid casting string into integer when sorting (as RangeSet is now using strings natively)
Closes #485
$ time (cluset -e x[2000-2073]c[0-7]s[0-7]b[0-1] | python3 -m cProfile -s cumulative lib/ClusterShell/CLI/Nodeset.py -f)
x[2000-2073]c[0-7]s[0-7]b[0-1]
4773441 function calls (4276373 primitive calls) in 5.067 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
107/1 0.009 0.000 5.169 5.169 {built-in method exec}
1 0.000 0.000 5.169 5.169 Nodeset.py:27(<module>)
1 0.001 0.001 5.046 5.046 Nodeset.py:334(main)
1 0.000 0.000 5.045 5.045 Nodeset.py:155(nodeset)
1 0.285 0.285 5.042 5.042 Nodeset.py:44(process_stdin)
18950 0.031 0.000 3.148 0.000 RangeSet.py:884(inner)
590373/94356 0.058 0.000 2.538 0.000 {built-in method len}
9473 0.008 0.000 2.537 0.000 RangeSet.py:1147(_fold)
3 0.000 0.000 2.527 0.842 NodeSet.py:238(__len__)
3 0.000 0.000 2.527 0.842 RangeSet.py:926(__len__)
1 0.000 0.000 2.459 2.459 RangeSet.py:1180(_fold_multivariate)
18947 0.023 0.000 2.104 0.000 NodeSet.py:1508(update)
9474 0.041 0.000 1.873 0.000 NodeSet.py:1202(__init__)
1 0.199 0.199 1.827 1.827 RangeSet.py:1194(_fold_multivariate_merge)
18947 0.031 0.000 1.418 0.000 NodeSet.py:789(parse)
9472 0.064 0.000 1.375 0.000 NodeSet.py:810(parse_string)
6 0.000 0.000 1.074 0.179 RangeSet.py:1126(_sort)
6 0.083 0.014 1.074 0.179 {method 'sort' of 'list' objects}
28419 0.045 0.000 1.067 0.000 NodeSet.py:539(update)
37889 0.085 0.000 1.043 0.000 NodeSet.py:490(_add)
14876 0.057 0.000 0.991 0.000 RangeSet.py:1128(rgveckeyfunc)
74380 0.128 0.000 0.899 0.000 RangeSet.py:1136(<genexpr>)
148807 0.231 0.000 0.826 0.000 RangeSet.py:533(copy)
18944 0.047 0.000 0.770 0.000 NodeSet.py:996(_scan_string)
119008 0.154 0.000 0.765 0.000 RangeSet.py:478(__getitem__)
224583 0.507 0.000 0.737 0.000 RangeSet.py:106(__init__)
156896 0.243 0.000 0.705 0.000 RangeSet.py:262(_sorted)
1 0.015 0.015 0.632 0.632 RangeSet.py:1188(_fold_multivariate_expand)
9472 0.195 0.000 0.608 0.000 NodeSet.py:962(_scan_string_single)
75776 0.164 0.000 0.603 0.000 RangeSet.py:194(fromone)
18946 0.042 0.000 0.581 0.000 RangeSet.py:898(copy)
18946 0.020 0.000 0.500 0.000 RangeSet.py:904(<listcomp>)```
We've been running with this patch for more than 2 months. Are there any blockers to it being merged?