More optimal BLC sorting (Merge sort)
Hi and thanks for BLC, I'm having a lot of fun with it! See this RosettaCode entry for one of the experiments in optimizing sort.lam. Some highlights:
- Mergesort used instead of Insertion sort, so the algorithm is much simpler, more idiomatic, and takes much less bits
-
fixpoint-combinatoruses internalinnerfunction, saving some 15 bits -
consis locally bound, instead of as global, saving some brujin indices -
nilandt, as compiled from Lisp, are mere000010and0000110respectively, and binding them made no sense, so they are inlined by the compiler
The code in lists/sort.lam could be tested with
$ echo -n abracadabra | ../blc run8 sort.lam aaaaabbcdrr or with only the blc code: $ (./uni deflate < bin/sort.blc; echo -n abracadabra) | ./uni Opening file /Users/tromp/AIT/bin/deflate.blc8 steps 25425 time 0ms steps/s 666M #GC 0 HP 48204 aaaaabbcdrrsteps 8107 time 0ms steps/s 666M #GC 0 HP 16780
Your code is not working in the same way, as $ (./uni deflate < bin/sort0.blc; echo -n abracadabra) | ./uni ends up in some infinite loop.
How can I run your code on the same input?
I'm using my own toolkit, cl-blc, compilable with make blc in the project dir only having sbcl Lisp installed (see full instructions in the README.) With it, things can be ran as
./blc run example/sort.blc string -- --
abracadabra
^D
aaaaabbcdrr
Note though that you'd need a modified version of the program that includes a comparison predicate:
01000100010001000100010000010110011111000000000010101111111100101110111101011111101110000010111000000001000101111100111111000000000010101011111111011100110000011001011111101110010111011110100101111110011000001100101111110111011001011011000001000000001011011101100001000110100001110011010000001000101100000001000001100101111011010000001011011101100000000101011110000001100111011110001100010
;; Compiled from
(let ((pred (lambda (n f x)
(n
(lambda (g h)
(h (g f)))
(lambda (u) x)
(lambda (u) u))))
(sub (lambda (m n)
(n pred m)))
(leq (lambda (m n)
((lambda (n)
(n (lambda (x) nil) t))
(sub m n))))
(fixpoint-combinator
(lambda (g)
(let ((inner (lambda (x) (g (x x)))))
(inner inner)))
#+nil
(lambda (f)
(let ((inner (lambda (x)
(f (lambda (y)
(x x y))))))
(inner inner))))
(sortmerge (lambda (list pred x)
(let ((cons (lambda (car cdr f)
(f car cdr))))
(list
(fixpoint-combinator
(lambda (recur head tail x)
((pred head (x t))
(cons head (tail recur x))
(cons (x t) (cons head tail)))))
(cons x nil))))))
((lambda (pred list)
(list
(fixpoint-combinator
(lambda (recur head tail z)
(sortmerge (tail recur z) pred head)))
nil))
leq))
It's 389 bits now—or 397 if we take an applicative version—which is still less than 436, but not by a large margin. This one should also work with your interpreter (that I'm unable to compile yet.)
So yes, my claimed 269 bits are those of an implementation expecting both predicate and data, while you implementation seems to have predicate hard-coded in it. I didn't know that coming here, so thanks for making me dig deeper in it!
I still cannot get that to work:
$ ls -l bin/sort* -rw-r--r-- 1 tromp staff 436 Feb 25 2024 bin/sort.blc -rw-r--r-- 1 tromp staff 55 Feb 25 2024 bin/sort.blc8 -rw-r--r-- 1 tromp staff 389 Nov 21 10:31 bin/sort1.blc -rw-r--r-- 1 tromp staff 49 Nov 21 10:32 bin/sort1.blc8
$ (cat bin/sort.blc8; echo -n abracadabra) | ./uni aaaaabbcdrrsteps 8107 time 0ms steps/s 666M #GC 0 HP 16780
$ (cat bin/sort1.blc8; echo -n abracadabra) | ./uni steps 1533 time 0ms steps/s 666M #GC 0 HP 4372
While it finishes, there is no output.
I still cannot get that to work:
$ ls -l bin/sort1* -rw-r--r-- 1 tromp staff 389 Nov 21 10:31 bin/sort1.blc -rw-r--r-- 1 tromp staff 49 Nov 21 10:32 bin/sort1.blc8
$ (cat bin/sort1.blc8; echo -n abracadabra) | ./uni steps 1533 time 0ms steps/s 666M #GC 0 HP 4372
While it finishes, there is no output.
What does uni imply about the program input? My implementation implies that it gets input as a cons list of integers. Is that so for yours?
Down to 341 bits thanks to cons inlining:
01000100010001000100000101100111110000000000101011111111001011101111010111111011100000101110000000010111100111110000000000101010111111101110011000001100001011011110010111101111101100001011001110000011000010110111110111100001011011000001000010001101000011100110100000010101011011101100000001000001100000000101011110000001100111011110001100010
Compiled from
(let ((pred (lambda (n f x)
(n
(lambda (g h)
(h (g f)))
(lambda (u) x)
(lambda (u) u))))
(leq (lambda (m n)
((n pred m) (lambda (x) nil) t)))
(fixpoint-combinator
(lambda (g)
(let ((inner (lambda (x) (g (x x)))))
(inner inner)))
#+nil
(lambda (f)
(let ((inner (lambda (x)
(f (lambda (y)
(x x y))))))
(inner inner))))
(sortmerge (lambda (list pred x)
(list
(fixpoint-combinator
(lambda (recur head tail x)
((pred head (x t))
;; these (lambda (p) ...) are inlined cons calls
(lambda (p) (p head (tail recur x)))
(lambda (p) (p (x t) (lambda (p) (p head tail)))))))
(lambda (p) (p x nil))))))
((lambda (pred list)
(list
(fixpoint-combinator
(lambda (recur head tail z)
(sortmerge (tail recur z) pred head)))
nil))
leq))
Still doesn't work:
$ (cat bin/sort0.blc8; echo -n abracadabra) | ./uni steps 1306 time 0ms steps/s 666M #GC 0 HP 3728
What does uni imply about the program input?
I run uni in the default byte mode [1] [2], where input is a list of (length 8) lists of bits (bools).
My implementation implies that it gets input as a cons list of integers. Is that so for yours?
No. What do you mean by integers? Church numerals?
[1] https://www.ioccc.org/2012/tromp/ [2] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d861
What does uni imply about the program input?
I run uni in the default byte mode [1] [2], where input is a list of (length 8) lists of bits (bools).
My implementation implies that it gets input as a cons list of integers. Is that so for yours?
No. What do you mean by integers? Church numerals?
Oh, now I see! My implementation runs on input as list of Church numerals/ASCII bytes, while yours runs on input as list of lists of bools/bits. That’s the problem. Is there maybe a full character/byte mode we can test it on?
The generic implementation I listed initially can be adapted to handle lists of bools, but that’s quite an unfamiliar structure to me, so I can’t guarantee code golfing.
Is there maybe a full character/byte mode we can test it on?
All my tools use the BLC IO model of list of bools (for bit mode) or list of lists of bools (for byte mode), so they cannot directly test Church numerals for bytes. You'd have to write a converter from a BLC input stream to a stream of Church numerals to make it work with my tools.
Hello again John,
I went ahead and parsed the byte-based format (i.e. list of 8-lists of bools) and injected it into my algo. I got this 418 bit monstrocity:
000100010001000100010111111001110000000000101011111100101110111101011111110111000001000000001011110011111000000000010101011111110111001100000110000101101111001011110111110110000101100111000001100001011011111011110000101101100000100001000110100001110011010011000000001011100000000100010101011111010010110000010000011001011111111011100111111000001010011111000001100000110000100011010000111000000101011110111011010
Compiled from
(lambda (list) ;; Applicative Z combinator for 2 args. ;; Can be easily replaced with Y2 for normal/lazy systems, but I’m not using one (let ((z2-combinator (lambda (f) (let ((inner (lambda (x) (f (lambda (y z) (x x y z)))))) (inner inner)))) (tromp< (z2-combinator (lambda (recur a b) (a (lambda (a-head a-tail i) (let ((b-head (b t))) ;; xnor a-head b-head ((a-head b-head (b-head nil t)) (recur a-tail (b nil)) b-head))) t)))) ;; Uncommented: Y-combinator for normal/lazy system ;; Commented out (#+nil) below: Z combinator (fixpoint-combinator (lambda (g) (let ((inner (lambda (x) (g (x x))))) (inner inner))) #+nil (lambda (f) (let ((inner (lambda (x) (f (lambda (y) (x x y)))))) (inner inner)))) (sortmerge (lambda (list pred x) (list (fixpoint-combinator (lambda (recur head tail x) ((pred head (x t)) ;; these (lambda (p) ...) are inlined cons calls (lambda (p) (p head (tail recur x))) (lambda (p) (p (x t) (lambda (p) (p head tail))))))) (lambda (p) (p x nil)))))) (list (fixpoint-combinator (lambda (recur head tail z) (sortmerge (tail recur z) tromp< head))) nil)))
Bit count can be brought down to 400 by using different recursion combinators (Y and Y2 instead of Z and Z2 respectively) and some inlining/reordering, but that’ll do for now
Now I’m encountering a different problem:
"tromp" is an alias for ./uni
(cat example/sort.blc ; echo -n abracadabra) | tromp error: program not a closed term
Any idea for why this perfectly closed term might error out?
Finally, that one works:
$ (cat bin/sort1.blc8; echo -n abracadabra) | ./uni rrdcbbaaaaa steps 14289 time 0ms steps/s 666M #GC 0 HP 24128
Although it sorts in descending order. Compare with: $ (cat bin/sort.blc8; echo -n abracadabra) | ./uni aaaaabbcdrr steps 8107 time 0ms steps/s 666M #GC 0 HP 16780
And I count 411 bits, not 418:-)
Any idea for why this perfectly closed term might error out?
You're running ./uni in its default byte mode, where it expects the program to be in blc8 rather than blc format. I prepared sort1.blc8 by running
$ ./uni deflate < bin/sort1.blc > bin/sort1.blc8 $ ls -al bin/sort1* -rw-r--r-- 1 tromp staff 411 Nov 24 12:30 bin/sort1.blc -rw-r--r-- 1 tromp staff 52 Nov 24 12:31 bin/sort1.blc8
That assumes you have -rw-r--r-- 1 tromp staff 34 Dec 1 2024 bin/deflate.blc8 from this repo.
Although it sorts in descending order. Compare with:
This is peculiar—my implementation sorts it all right. I’ve altered the program to invert the bit check and shortened it to 367 bits:
0001000100010001011111001111000000000010101111110010111011110101111110111000001000000001011110011111100000000001010101111111011100110000011000010110111100101111011111011000010110011100000110000101101111101111000010110110000010011000000001011100000000101010101111100000110111001011110000010000011001011111110110011111000001011100000110000100011010000111000010111011010
Compiled from
(lambda (list) (let ((fixpoint-combinator (lambda (f) (let ((inner (lambda (x) (f (x x))))) (inner inner)))) (tromp< (fixpoint-combinator (lambda (recur a b) (a (lambda (a-head a-tail i) ;; xnor b-head a-head (((b t) a-head (a-head nil t)) (recur a-tail (b nil)) a-head)) t)))) (sortmerge (lambda (list pred x) (list (fixpoint-combinator (lambda (recur head tail x) ((pred head (x t)) ;; these (lambda (p) ...) are inlined cons calls (lambda (p) (p head (tail recur x))) (lambda (p) (p (x t) (lambda (p) (p head tail))))))) (lambda (p) (p x nil)))))) (list (fixpoint-combinator (lambda (recur head tail z) (sortmerge (tail recur z) tromp< head))) nil)))
Does it work for you?
Yes, that works:
$ ls -al bin/sort0* -rw-r--r-- 1 tromp staff 367 Nov 24 20:50 bin/sort0.blc -rw-r--r-- 1 tromp staff 46 Nov 24 20:50 bin/sort0.blc8 $ (cat bin/sort0.blc8; echo -n abracadabra) | ./uni aaaaabbcdrr steps 12672 time 0ms steps/s 666M #GC 0 HP 20862
Perhaps you were confused by bit 0 being true = \x0\x1. x0 and bit 1 being false = \x0\x1. x1, which is opposite of the boolean convention of most programming languages?
Perhaps you were confused by bit 0 being true = \x0\x1. x0 and bit 1 being false = \x0\x1. x1, which is opposite of the boolean convention of most programming languages?
No, that’s not it, I’m in lambda lands for long enough to remember that. Must be something about evaluation order and lazy vs. applicative systems…
Anyway, how do you like this algorithm? Worth including/salvaging for your repository?
Anyway, how do you like this algorithm? Worth including/salvaging for your repository?
I like that it's 69 bits (nearly 16%) shorter than the existing sort.lam
But I don't want to include lisp source in my repo, so it would need to be translated to lambda calculus source like sort.lam. Do you think you could translate it to that form? Then I'd be more than happy to include it.
Here’s a shot at that. I still can’t get GHC &c to work, so it’s untested. But should be easier to iterate on than Lisp code, I assume?
-- Sort a list of binary lists lexicographically (mergesort version). -- [] < 0:* < 1:* let fixpoint = \f.let inner = \x. f (x x) in inner inner;
true = \x\y.x; false = \x\y.y; nil = false;
bytelt = fixpoint (\recur\a\b.a (\ahead\atail\z. ((b true) ahead (ahead false true)) (recur atail (b false)) ahead) true);
cons = \a\b\p.p a b;
sortmerge = \list\pred\x. list (fixpoint \recur\head\tail\x. (pred head (x true)) (cons head (tail recur x)) (cons (x true) (cons head tail))) (cons x nil); sort = \pred\list. list (fixpoint (\recur\head\tail\z. sortmerge (tail recur z) pred head)) nil; main = \io.sort bytelt io in main -- echo -n abracadabra | ./blc run8 mergesort.lam -- aaaaabbcdrr
Thanks, Artyom! Added in latest commit.
Here’s a shot at that. I still can’t get GHC &c to work, so it’s untested. But should be easier to iterate on than Lisp code, I assume?
works as advertised!
-- Sort a list of binary lists lexicographically (mergesort version). -- [] < 0:* < 1:* let fixpoint = \f.let inner = \x. f (x x) in inner inner;
this is not needed, since blc recognizes recursion in the let construct and will add the necessary fixpoint operator.
Cool, thank you! How many bits does this translation take? I think it gained some due to my inexperience with Haskell…
It was 317 bits. I went over it this morning, and managed to optimize it further to 300 bits. I also realized it's an insertion sort. See latest commit.
It was 317 bits. I went over it this morning, and managed to optimize it further to 300 bits.Oh, that’s extremely good, I didn’t expect it to be that optimizable! Is it using Omega combinator for recursion? I also realized it's an insertion sort. See latest commit.
Oh, right, mergesort and insertion sort look similar from the distance, so I confused them 😅
Is it using Omega combinator for recursion?
No, it uses the standard fixpoint operator, as you can see at https://github.com/tromp/AIT/blob/master/Lambda.lhs#L59-L64