
Free GIL when running check_bytes_arrays_within_dist

Open RandomNameUser opened this issue 1 year ago • 3 comments

The check_bytes_arrays_within_dist function can take a long time when searching a large array. It would be helpful if the lib released the GIL so that other threads can run while the search is going on.

I have never written a C extension for Python, but if I read the docs (https://docs.python.org/3/c-api/init.html#c.Py_BEGIN_ALLOW_THREADS) correctly, all it would take is to wrap the loop in the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros and replace the in-loop return with a break, i.e. change the loop to:

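    int return_value = -1;  /* -1 means no element was within max_dist */

    /* No Python API calls may be made until Py_END_ALLOW_THREADS. */
    Py_BEGIN_ALLOW_THREADS

    int res;
    uint64_t number_of_elements = big_array_size / small_array_size;
    uint8_t* pBig = big_array;
    for (uint64_t i = 0; i < number_of_elements; i++, pBig += small_array_size) {
        res = (int)ptr__hamming_distance_bytes(pBig, small_array, small_array_size, max_dist);
        if (res == 1) {
            /* Record the match and leave the loop; returning here would skip Py_END_ALLOW_THREADS. */
            return_value = (int)i;
            break;
        }
    }

    Py_END_ALLOW_THREADS

    return Py_BuildValue("i", return_value);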
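
An alternative that might be easier to test: since the search itself touches no Python objects, it could be factored into a plain C helper that the wrapper calls between the two macros. The sketch below is only an illustration under assumptions: find_within_dist and within_dist are hypothetical names, and within_dist is a simple bit-counting stand-in for the library's ptr__hamming_distance_bytes, not its actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ptr__hamming_distance_bytes:
   returns 1 if a and b differ in at most max_dist bits, else 0. */
static int within_dist(const uint8_t *a, const uint8_t *b,
                       size_t len, uint64_t max_dist)
{
    uint64_t dist = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t x = (uint8_t)(a[i] ^ b[i]);
        while (x) {                 /* clear the lowest set bit, count it */
            x &= (uint8_t)(x - 1);
            dist++;
        }
        if (dist > max_dist)
            return 0;               /* already too far apart, stop early */
    }
    return 1;
}

/* Hypothetical helper: index of the first needle_size-byte element of big
   that is within max_dist bits of needle, or -1 if there is none.
   It makes no Python API calls, so it can run without the GIL. */
static int64_t find_within_dist(const uint8_t *big, size_t big_size,
                                const uint8_t *needle, size_t needle_size,
                                uint64_t max_dist)
{
    assert(needle_size > 0);        /* avoid dividing by zero below */
    size_t number_of_elements = big_size / needle_size;
    for (size_t i = 0; i < number_of_elements; i++, big += needle_size) {
        if (within_dist(big, needle, needle_size, max_dist))
            return (int64_t)i;      /* early return is fine inside the helper */
    }
    return -1;
}
```

With a helper like this, the extension function would only need a single call between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS that stores the result in a variable declared before the macros, and the helper can be unit-tested without a Python interpreter.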

I don't have a good way to test this right now, but it seems trivial enough. I would appreciate it if you could give this a try.

Thanks for a greatly useful little library. Stuff like this is why I love Python! ;)

RandomNameUser, Jan 29 '23 12:01