bel icon indicating copy to clipboard operation
bel copied to clipboard

Create a fast operative for 'accum'

Open masak opened this issue 3 years ago • 1 comments

I see #412 skipped accum.

(mac accum (var . body)
  (letu v
    `(withs (,v   nil
             ,var [push _ ,v])
       ,@body
       (rev ,v))))

The main challenge here is of course the closure [push _ ,v]. But did past-me consider creating a fastfunc dynamically, with make_fastfunc? Seems possible!


In fact, looking at this list of excuses...

The fast operatives are pretty nice for implementing sequential and branching control flow, delayed/suspended evaluation, lexical binding, and accumulating into a list. They are not good for setting non-lexical places (like set and its kin), dealing with continuations (like eif or catch), creating actual unique variables (like letu), or dynamically binding variables (like bind or atomic).

...it seems to me fast operatives could do a lot more. set (and its kin) would be a matter of interacting sanely with the interpreter's where handling; the remaining three (continuations, unique variables, and dynamic binding) also sound well within the realm of the possible.

But enough complaining about past-me's evident cowardice. "Show me the code." (And for this issue, implementing accum is enough.)

masak avatar Mar 24 '22 12:03 masak

The reason I suddenly felt like having a fast operative for accum:

> (accum add (for n 2 99 (add n)))
(2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99)

...was that I tried this code, and it took over four minutes to run:

real    4m19.596s
user    4m19.343s
sys     0m0.124s

Part of that cost is down to for, surely. I also tried to just do a for loop where I print the numbers instead of accumulating them:

real    1m44.137s
user    1m43.983s
sys     0m0.140s

So, 2m35s are due to accum, and the rest to other factors, including for.

To investigate the cost of the for loop itself, I also tried to just print a bunch of numbers after adding 1 to all of them (except the first):

real    1m5.441s
user    1m5.280s
sys     0m0.125s

Ah, so out of the 1m44s, the for loop only contributes 39s.

I also tried just printing the numbers 2..99, as an unrolled loop.

real    0m0.452s
user    0m0.311s
sys     0m0.078s

Ah, so most of that 1m5s goes to adding 1 to numbers. That reminds me why #140 still remains open, because numbers are not fast yet (even though they are certainly better than they were). It also shows up as a cause of slowness in #200.

Anyway, speeding up accum certainly won't hurt, as it's the biggest contributor of slowness.

masak avatar Mar 24 '22 13:03 masak