containers Improve powerset

Jan 22 '23 12:01 jwaldmann

049633748d8dc90e16321ea3a605a4fd08b7f699


  powerSet (19):             OK (0.91s)
    125  ms ±  10 ms
  powerSet (20):             OK (0.77s)
    255  ms ±  11 ms
  powerSet (21):             OK (8.33s)
    547  ms ±  51 ms
  member.powerSet (16):      OK (1.53s)
    96.7 ms ± 994 μs
  member.powerSet (17):      OK (3.33s)
    210  ms ± 3.8 ms
  member.powerSet (18):      OK (3.24s)
    443  ms ±  16 ms

daf6133f73c5d21ea01342405fa10ef33f056b36 (with vector instead of array, and the following is using -O2)

cabal bench set-benchmarks -O2

  powerSet (19):             OK (2.99s)
    92.5 ms ± 2.8 ms
  powerSet (20):             OK (3.18s)
    207  ms ±  10 ms
  powerSet (21):             OK (6.83s)
    437  ms ±  20 ms
  member.powerSet (16):      OK (0.67s)
    91.9 ms ± 2.9 ms
  member.powerSet (17):      OK (1.50s)
    205  ms ±  10 ms
  member.powerSet (18):      OK (3.21s)
    444  ms ± 8.8 ms

[EDIT] but I guess that containers does not want to use vector (just for this powerSet function) so this is more proof-of-concept than a real suggestion.

Jan 23 '23 10:01 jwaldmann

No, we don't want to use vector. We could pull in some bits of primitive if we need to, but I'd rather get those bits shifted into base.

Jan 30 '23 21:01 treeowl

Do you want to beat this into mergeable shape by tomorrow so it can go in the next release? I would love to reduce the memory requirements ASAP.

Jan 31 '23 03:01 treeowl

no, not until tomorrow. I would not want to introduce (and then maintain) this (my) mess of code. The original implementation is an obviously correct one-liner, and uses just twice the space/time (for realistic argument sizes)? I would not care much about that overhead until a real user of powerset appears.

Perhaps add a haddock instead: "if you want to use this function for NFA to DFA conversion: don't. Instead, use Dragon Book (2nd ed.) Fig. 3.32 (subset construction), and represent set of states as IntSet."

But really, no, just leave it as is.

Perhaps the value of this MR is that it provides a test case where Vector seems better than Array, and inlining (to make non-allocating loops) is hard to predict. But that's not actually news?

Jan 31 '23 15:01 jwaldmann