deep_learning_and_the_game_of_go
implement: vectorized sigmoid, sigmoid_prime
Hi guys, loving the new book. Great job.
The following change increases the performance of typical calls to sigmoid and sigmoid_prime by roughly 50x.
Total performance impact on dlgo/nn/run_network.py is around a 100% improvement (roughly 2x) overall.
The reason is roughly that np.vectorize just coerces types so that it can call a non-NumPy (scalar) function on each element; it does not compile the function into a ufunc or anything like that.
Docs:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html
"The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."
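To make the per-element overhead concrete, here is a small illustrative sketch (counting_sigmoid is a made-up helper, not from the book or repo) that counts how many times the wrapped scalar function actually gets invoked:

import numpy as np

calls = 0

def counting_sigmoid(x):
    # Plain scalar sigmoid that also records each Python-level invocation.
    global calls
    calls += 1
    return 1.0 / (1.0 + np.exp(-x))

z = np.random.random(1000)
np.vectorize(counting_sigmoid)(z)

# One Python call per element (np.vectorize may make one extra call
# to infer the output dtype), rather than a single ufunc call.
print(calls)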
Below are a few benchmarks of variations on the sigmoid functions to illustrate the phenomenon, measured with Jupyter's %timeit magic on a 2015 MacBook Pro.
import numpy as np

def sigmoid_double(x):
    # Plain scalar sigmoid.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid(z):
    # np.vectorize wrapper: one Python-level call per element.
    return np.vectorize(sigmoid_double)(z)

def sigmoid2(z):
    # Same result, computed entirely with NumPy ufuncs.
    return np.reciprocal(np.add(1.0, np.exp(-z)))

def sigmoid_prime_double(x):
    return sigmoid_double(x) * (1 - sigmoid_double(x))

def sigmoid_prime(z):
    return np.vectorize(sigmoid_prime_double)(z)

def sigmoid_prime2(z):
    return sigmoid(z) * (1 - sigmoid(z))

def sigmoid_prime3(z):
    # Derivative computed entirely with ufuncs.
    return np.multiply(sigmoid2(z), np.subtract(1, sigmoid2(z)))
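As a quick sanity check (a small sketch layered on the definitions above, not part of the original benchmark), the ufunc-based variants agree numerically with the np.vectorize versions:

z_check = np.random.random(1000)
assert np.allclose(sigmoid(z_check), sigmoid2(z_check))
assert np.allclose(sigmoid_prime(z_check), sigmoid_prime2(z_check))
assert np.allclose(sigmoid_prime(z_check), sigmoid_prime3(z_check))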
testMini = np.random.random(1000)
test = np.random.random(10000)
%timeit sigmoid(testMini)
1.5 ms ± 86.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit sigmoid(test)
14.4 ms ± 759 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit sigmoid2(test)
67.8 µs ± 6.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sigmoid_prime(test)
40.9 ms ± 8.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit sigmoid_prime2(test)
30.8 ms ± 5.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit sigmoid_prime3(test)
146 µs ± 5.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
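For reference, a possible middle ground (just a sketch, not the book's actual listing) keeps the original function names and readability but lets NumPy broadcasting do the work, and evaluates the sigmoid only once inside the derivative:

import numpy as np

def sigmoid(z):
    # np.exp is a ufunc, so this broadcasts over scalars and arrays alike.
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Reuse a single forward evaluation instead of computing the sigmoid twice.
    s = sigmoid(z)
    return s * (1 - s)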
@jeffhgs thanks for your addition. Actually, we're completely aware of this. Two things:
a) Chapter 5 is "optimized" for readability and aimed at beginners. I hope you can see how your performance bump is a little less readable for someone just getting started.
b) The master branch should stay in sync with the book; the only exceptions are clarifications and bug fixes / errata.
Having said that, I don't want your work to go to waste. I'm thinking about having an improvements branch people can open PRs against. Would that work for you? I'll put this on the readme as well. :+1:
@jeffhgs the new improvements branch is now live.
The readme now reads:
Note for contributors: To ensure the book stays in sync, consider requesting changes and submitting pull requests against the improvements branch, instead of master (which we keep reserved for bug fixes etc.).
Sure, a branching policy to keep the book in sync with the code seems totally appropriate.