fast-python
fast-python copied to clipboard
Source code for Fast Python (2020) by Chris Conlan
Fast Python
Source code for Fast Python (2020) by Chris Conlan
Paperback available for purchase on Amazon.
Code profiles
The following code profiles can be run as stand-alone scripts. They may or may not depend on explanation provided in the accompanying book.
- Binary search: binary_search.py
- Dictionary construction: build_dict.py
- Concatenating strings, string construction: concatenate_strings.py
- Counting the frequency of a value: count_occurrences.py
- Computing a cumulative sum: cumulative_sum.py
- The
in
operator and early stopping: early_stopping.py - Time series filters/convolutions: filters.py
- Find largest
k
values in a list: find_top_k.py - List construction/declaration/flattening: flatten_lists.py
- Counting lines in a file: line_count.py
- Set intersection, finding matches in a list: match_within.py
- Matrix multiplication: matrix_multiplication.py
- Computing moving averages: moving_averages.py
- Counting frequency of a word in text: occurrences_of.py
- Looping through
pd.DataFrame
objects: pandas_loops.py - Sorting algorithms: sorting.py
- Low-level sorting algorithms: sorting_v2.py
- Adding a list of numbers: sum.py
Running them is simple ...
cd fast-python/src
python cumulative_sum.py
Profiling
All the profiles use a simple profiling module in src/utils/profiler.py. It produces tables and charts like the following.
np_fast_cusum
n = 56234132 values
t = 201.806 ms
n/t = 278653.8114 values per ms
np_fast_cusum
n = 100000000 values
t = 350.611 ms
n/t = 285216.7553 values per ms
...
function n_values t_milliseconds values_per_ms
0 slow_cusum 1 0.012 85.0196
1 slow_cusum 3 0.005 640.7530
...
14 slow_cusum 5623 1298.218 4.3313
15 slow_cusum 10000 4140.327 2.4153
...
30 slow_cusum_expanded 5623 1878.419 2.9935
31 slow_cusum_expanded 10000 5767.316 1.7339
...
62 python_fast_cusum 56234132 5727.162 9818.8478
63 python_fast_cusum 100000000 10939.993 9140.7733
...
94 pandas_fast_cusum 56234132 442.652 127039.2437
95 pandas_fast_cusum 100000000 780.461 128129.3962
...
126 numba_fast_cusum 56234132 139.602 402816.3295
127 numba_fast_cusum 100000000 236.445 422930.9936
...
158 np_fast_cusum 56234132 201.806 278653.8114
159 np_fast_cusum 100000000 350.611 285216.7553
I use the profiler frequently in my own work. It allows me to analyze the relationship between computational complexity and raw execution time pretty easily.
Dependencies
I have included a dependencies.txt
, but you should be fine with a blank Python 3 environment followed by ...
pip install numpy pandas numba joblib matplotlib pillow