Add primitive for bit_length()
All CPUs we want to support seem to have a fast way of counting the number of leading zero bits or the index of the highest set bit, which we could make available through int.bit_length(). This would nicely complement the set of supported bitwise operations.
This is a very specialized operation, but we can potentially make it a lot faster (more than 10x), and it's useful for certain algorithms and data structures, such as bitsets.
The relevant x86-64 operations are bsr and lzcnt, but the latter doesn't seem to be supported on some fairly recent Intel CPUs. We can use library functions in the generated C that map to these instructions.
Relevant wikipedia article: https://en.wikipedia.org/wiki/Find_first_set (also contains information about how to use these from C)
(This is low priority unless this can speed up the dataflow analysis in mypyc, in which case we might want to implement it since it could be an easy win.)
Hi, I would like to work on this issue
@Prototype4988 Sounds good!
Hello, since it's not solved yet, can I try to take on this task? My question are:
- It's seems like that the ISA only provide bit_length related instruction for unsigned int/long/long long while CPyTagged is size_t. So can we make assumption that
sizeof(size_t) == 4 && sizeof(int) == 4for 32 bit platform andsizeof(size_t) == 8 && sizeof(long long) == 8for 64 bit platform? - Can you give some advice how to overwrite default function(I mean, when you meet "int.bit_length()", generate a code that call a special function)? Thank you!
@DunderBird
- Yes. We make these assumptions elsewhere in any case.
- Add a specializer to
mypyc.irbuild.specialize. Use the two-argument version of@specialize_function.