dhash-vips
Addressing compiling problems
Hello, I saw the bug https://bugs.ruby-lang.org/issues/17174 that you posted a while ago and tried to find out what is happening. It looks like the cause is that you used the internal BDIGIT API, which was removed after Bignum and Fixnum were merged into Integer in Ruby 2.4 (https://www.ruby-lang.org/en/news/2016/12/25/ruby-2-4-0-released/), after which the Bignum implementation was hidden (https://github.com/ruby/ruby/commit/841bf2b2081a394cc3eb846734157f851966476e).
But ruby-slim was packed with additional headers for some time:
root@haupc:/usr/local/include# grep -rn 'BDIGIT' .
./ruby-2.7.0/x86_64-linux/rb_mjit_min_header-2.7.2.h:19130:#define BDIGIT unsigned int
/usr/local/include/ruby-2.7.0# grep -rn 'rb_int_pow' .
./x86_64-linux/rb_mjit_min_header-2.7.2.h:5566:VALUE rb_int_powm(int const argc, VALUE * const argv, VALUE const num);
./x86_64-linux/rb_mjit_min_header-2.7.2.h:5760:VALUE rb_int_pow(VALUE x, VALUE y);
Though that is no longer the case (e.g. in Ruby 3.3.3):
/usr/local/include# grep -rn 'BDIGIT' .
./ruby-3.3.0/ruby/internal/intern/bignum.h:49: * `BDIGIT` but its definition is hidden.
/usr/local/include# grep -rn 'rb_int_pow' .
So there are three ways to fix the compilation:
- Use the public Ruby C API (the headers inside the include folder)
- Somehow compile together with Ruby to get access to all internals
- Copy part of the definitions into your own repository (like https://github.com/alexdowad/bit-twiddle/blob/master/ext/bit_twiddle/ruby31/bt_bignum.h)
The third option sounds safer than the second, but there is still a maintenance cost to track changes in the Ruby files (and Ruby may still change the internal Bignum implementation).
There is also a feature request in Ruby to add a popcount method to Integer, which would speed up the pure Ruby solution if accepted.
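For context, here is a minimal sketch of what a pure Ruby popcount currently looks like without a built-in method. The helper name `hamming_distance` is hypothetical, and this is a plain Hamming distance for illustration, not necessarily the exact distance function the gem uses.

```ruby
# Hypothetical helper: number of differing bits between two
# non-negative Integer fingerprints.
def hamming_distance(a, b)
  # XOR leaves a 1 bit wherever the inputs differ; count those bits
  # by rendering the result in binary and counting "1" characters.
  (a ^ b).to_s(2).count("1")
end

a = 0b1011_0110
b = 0b0011_1100
puts hamming_distance(a, b) # prints 3
```

The string conversion is the part a native popcount on Integer would replace.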
I made an example version of the public Ruby API approach in a PR. It is slower than the BDIGIT approach, but still much faster than pure Ruby:
                       user     system      total        real
distance3_bdigit   0.198673   0.000000   0.198673 (  0.198672)
distance3_public   0.373779   0.000000   0.373779 (  0.373777)
distance3_ruby     1.824285   0.000000   1.824285 (  1.824315)
I've tried several other approaches, e.g. doing only the popcount in C, and this one is the fastest that does not cause much trouble with compilation. You can see the benchmark here https://github.com/haukot/dhash-vips/blob/compare_methods/idhash.c#L11
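For anyone who wants to compare variants locally, a rough sketch of a harness using the standard Benchmark module might look like this. Both popcount variants, their names, the 256-bit input size and the iteration count are assumptions made for illustration; they are not the implementations measured above.

```ruby
require "benchmark"

# Hypothetical variant 1: count "1" characters in the binary string.
def popcount_string(n)
  n.to_s(2).count("1")
end

MASK64 = (1 << 64) - 1

# Hypothetical variant 2: walk a non-negative integer in 64-bit chunks
# and clear the lowest set bit of each chunk until it reaches zero.
def popcount_chunks(n)
  count = 0
  while n > 0
    chunk = n & MASK64
    while chunk > 0
      chunk &= chunk - 1 # drops the lowest set bit
      count += 1
    end
    n >>= 64
  end
  count
end

# Fixed seed so repeated runs use the same random inputs.
srand(42)
pairs = Array.new(1_000) { [rand(2**256), rand(2**256)] }

Benchmark.bm(18) do |bm|
  bm.report("popcount_string") { pairs.each { |a, b| popcount_string(a ^ b) } }
  bm.report("popcount_chunks") { pairs.each { |a, b| popcount_chunks(a ^ b) } }
end
```

Timings depend on the machine and Ruby version, so only the relative order within a single run is meaningful.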