
Suggestion for "optimised Cython"

Open honnibal opened this issue 8 years ago • 0 comments

There are lots of ways to write Cython. I normally suggest writing the optimised function with a pure C interface, declared "nogil". The nogil declaration tells the Cython compiler that the function can run without holding the GIL, so nothing in the function body may touch a Python object. This gives you much better compiler errors, because the compiler no longer has to allow for the possibility that something is Python trickery: anything that needs a Python object is simply an error. Writing this way gives quite nice code:

# cython: infer_types=True

cdef double std_dev(const double* arr, size_t size) nogil:
    cdef double mean = 0.0
    for pval in arr[:size]:
        mean += pval
    mean /= size
    cdef double sum_sq = 0.
    for pval in arr[:size]:
        sum_sq += (pval-mean)**2
    return (sum_sq / size)**0.5

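The only weird syntax is the for pval in arr[:size] loop. A bare pointer carries no length, so the slice is what tells Cython how many elements to walk over. You could just as easily do: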

for i in range(size):
    pval = arr[i]

But looping over the value is pretty convenient.

Incidentally, working with Cython nogil functions can give nicer code than the equivalent Python. The reason is that in Python, we become so scared of the function call overhead that we're reluctant to break things up. In Cython and C a call between two C-level functions is cheap, so that worry goes away and we might prefer the following:

# cython: infer_types=True

cdef double std_dev(const double* arr, size_t size) nogil:
    mean = get_mean(arr, size)
    cdef double sum_sq = 0.
    for pval in arr[:size]:
        sum_sq += (pval-mean)**2
    return (sum_sq / size)**0.5

cdef double get_mean(const double* arr, size_t size) nogil:
    cdef double mean = 0.0
    for pval in arr[:size]:
        mean += pval
    return mean / size
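
A function with a pure C interface can't be called from Python code directly, so in practice it usually sits behind a thin def wrapper. The sketch below is one way that might look, not something from the code above: py_std_dev is a hypothetical name, the typed memoryview argument is an assumption about how callers hold their data, and the std_dev here is a compact stand-in (explicit index loop, sqrt from libc.math) so the block compiles on its own.

```cython
# cython: infer_types=True
from libc.math cimport sqrt


# Compact stand-in for the std_dev above, repeated so this block is
# self-contained. It computes the population standard deviation.
cdef double std_dev(const double* arr, size_t size) nogil:
    cdef double mean = 0.0
    cdef double sum_sq = 0.0
    cdef size_t i
    for i in range(size):
        mean += arr[i]
    mean /= size
    for i in range(size):
        sum_sq += (arr[i] - mean) * (arr[i] - mean)
    return sqrt(sum_sq / size)


# Hypothetical Python-facing wrapper: validates the input while the GIL
# is held, then releases the GIL around the C-level call.
def py_std_dev(const double[::1] values):
    cdef size_t size = values.shape[0]
    cdef const double* data
    cdef double result
    if size == 0:
        raise ValueError("values must not be empty")
    data = &values[0]  # pointer to the first element of the contiguous buffer
    with nogil:
        result = std_dev(data, size)
    return result
```

Anything that exposes a contiguous buffer of doubles should work as the argument, for example array.array('d', [1.0, 2.0, 3.0]) or a float64 NumPy array. The Python-level checks stay in the wrapper, which keeps the nogil function free of them.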

honnibal, Jul 07 '17 10:07