scikit-learn-extra
Fastfood is slow
Fastfood is somewhat slower than RBFSampler (while in theory it should be faster).
Google Colab timings (sklearn 0.24.2, sklearn_extra 0.2.0, numpy 1.19.5)
Laptop timings (Ubuntu 20.04, Intel 8300H, 32GB RAM) (sklearn 0.23.2, sklearn_extra 0.2.0, numpy 1.19.2)
Sample code
!pip install scikit-learn-extra
import time
import numpy as np
from matplotlib import pyplot as plt
from sklearn.kernel_approximation import RBFSampler
from sklearn_extra.kernel_approximation import Fastfood
X = np.random.normal(size=(100_000, 100))
n_components = [64, 128, 256, 512, 1024, 2048]
timings_ff = []
timings_rbfs = []
# time Fastfood fit_transform for each output dimensionality
for n in n_components:
    ff = Fastfood(n_components=n, random_state=1)
    tic = time.perf_counter()
    X_t = ff.fit_transform(X)
    tac = time.perf_counter()
    timings_ff.append(tac - tic)

# time RBFSampler on the same data for comparison
for n in n_components:
    rbfs = RBFSampler(n_components=n, random_state=1)
    tic = time.perf_counter()
    X_t = rbfs.fit_transform(X)
    tac = time.perf_counter()
    timings_rbfs.append(tac - tic)
plt.plot(n_components, timings_ff);
plt.plot(n_components, timings_rbfs);
Changing tradeoff_mem_accuracy to 'mem' did not affect speed.
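For reference, the 'mem' timing only differed in the constructor argument; a minimal sketch of that variant, using the same data and loop as above:

# Same timing loop as above, but with the memory-oriented tradeoff;
# in my runs this made no measurable difference.
for n in n_components:
    ff_mem = Fastfood(n_components=n, tradeoff_mem_accuracy="mem", random_state=1)
    tic = time.perf_counter()
    ff_mem.fit_transform(X)
    tac = time.perf_counter()
    print(n, tac - tic)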
The bottleneck seems to be the _phi function:
ncalls tottime percall cumtime percall filename:lineno(function)
2 0.289 0.144 0.289 0.144 _fastfood.py:104(_approx_fourier_transformation_multi_dim)
1 0.000 0.000 0.000 0.000 _fastfood.py:108(_l2norm_along_axis1)
1 0.000 0.000 0.000 0.000 _fastfood.py:112(_uniform_vector)
1 0.045 0.045 0.389 0.389 _fastfood.py:118(_apply_approximate_gaussian_matrix)
1 0.036 0.036 0.036 0.036 _fastfood.py:136(_scale_transformed_data)
1 0.441 0.441 0.441 0.441 _fastfood.py:144(_phi)
1 0.000 0.000 0.007 0.007 _fastfood.py:153(fit)
1 0.000 0.000 0.000 0.000 _fastfood.py:190(<listcomp>)
1 0.000 0.000 0.897 0.897 _fastfood.py:207(transform)
1 0.000 0.000 0.000 0.000 _fastfood.py:72(_is_number_power_of_two)
1 0.000 0.000 0.000 0.000 _fastfood.py:76(_enforce_dimensionality_constraints)
1 0.000 0.000 0.024 0.024 _fastfood.py:89(_pad_with_zeros)
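A table like the one above can be reproduced with cProfile; here is a sketch (the n_components value is an assumption, not necessarily the one used for the numbers above):

import cProfile

# Profile a single fit_transform call and sort by total time spent per function
ff = Fastfood(n_components=2048, random_state=1)
cProfile.run("ff.fit_transform(X)", sort="tottime")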
numpy trigonometric functions (cos, sin) are slow (individually they take about as long as _approx_fourier_transformation_multi_dim), and numba does not help. Maybe a pure Cython implementation of the _phi function would be faster.
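A rough way to see the trigonometric cost in isolation (a hedged sketch; the matrix shape is an assumption meant to roughly match what _phi processes for n_components=2048, not the exact internal shape):

import time
import numpy as np

# cos/sin alone on a matrix of roughly the size _phi handles
A = np.random.normal(size=(100_000, 2048))
tic = time.perf_counter()
np.cos(A)
np.sin(A)
print("cos + sin:", time.perf_counter() - tic, "s")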
Thanks for the investigation @GLevV !
There are two issues there:
- First, _phi (and therefore transform) returns (n_samples, 2*n_components) instead of (n_samples, n_components) for tradeoff_mem_accuracy == "accuracy". That is a bug. Also, because it manipulates a matrix 2x larger than it should, it is therefore 2x slower. We should really add a unit test to detect this.
- Generally, the function _phi should be modified to use more in-place operations (https://stackoverflow.com/a/22952411/1791279), though computing cos and sin would likely be expensive in any case.

A PR to fix/improve that situation would be very welcome.
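For the second point, a minimal sketch of what a more in-place _phi could look like for the "accuracy" case (this is not the current _fastfood.py code; the function name, signature, and shape handling are assumptions):

import numpy as np

def _phi_accuracy_inplace(X_transformed, n_components):
    # Write cos and sin directly into one preallocated output buffer
    # instead of allocating separate arrays and concatenating them.
    n_samples, n_cols = X_transformed.shape
    out = np.empty((n_samples, 2 * n_cols), dtype=X_transformed.dtype)
    np.cos(X_transformed, out=out[:, :n_cols])
    np.sin(X_transformed, out=out[:, n_cols:])
    out *= 1.0 / np.sqrt(n_components)  # in-place scaling, no extra copy
    return out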
Judging by this implementation, it should return an N x N matrix and be used with SVM(kernel='precomputed').
On the other hand, in the same repo they use the [cos(X), sin(X)] matrix as a feature matrix for the next model (which will obviously have 2*n_components columns). A straightforward solution would be to divide n_components by 2 (or decrease the power of 2 by 1) for the "accuracy" case.
The [cos, sin] matrix is the matrix of real and imaginary components of a Fourier transform. In such cases, sometimes only the real part of the transformation is kept and the imaginary part is dropped, but then there would be no difference between the 'accuracy' and 'mem' cases.
We can either warn users that in the accuracy case the feature space will be twice as large, or we can drop the 'accuracy' case for now and just keep the random kitchen sinks method.
I don't think "[cos, sin] matrix is a matrix of real and imaginary components of Fourier transform" is true. If that were the case, we could have used an FFT and obtained a faster implementation, but in fact we don't use a Fourier transform here, or am I wrong?
Well, I don't know the particular implementation, but in the paper "Fastfood: Approximate Kernel Expansions in Loglinear Time" [1] they state the feature map phi_j(x) = exp(i [Vx]_j) / sqrt(n), which is the same as [cos(X), sin(X)] / sqrt(n), so I assumed that cos(X) is the real part and sin(X) is the imaginary part, but I could be wrong.
Yes, this is not a Fourier transform. A Fourier transform involves a sum (or an integral) of exponentials, and here we just have an exponential without a sum. But you are right that cos(X) and sin(X) are the real and imaginary parts of the feature map phi_j(x). And I think the 'mem' case is exactly as you described: we drop the imaginary part.
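To make the variants being discussed concrete, a small sketch of the corresponding feature maps (V is just a toy random projection here; whether the last form matches the library's 'mem' option exactly is an assumption):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)           # one input sample
V = rng.normal(size=(8, 10))      # stand-in for the Fastfood projection
n = V.shape[0]

# Complex feature map: phi_j(x) = exp(i [Vx]_j) / sqrt(n)
phi_complex = np.exp(1j * (V @ x)) / np.sqrt(n)

# Real and imaginary parts stacked -> 2n real features ('accuracy'-style)
phi_both = np.concatenate([np.cos(V @ x), np.sin(V @ x)]) / np.sqrt(n)

# Real part only, with a random phase offset b (random kitchen sinks style)
# -> n real features
b = rng.uniform(0, 2 * np.pi, size=n)
phi_real = np.sqrt(2.0 / n) * np.cos(V @ x + b)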
Well, I guess the only thing left is to replace numpy math operations with Python ones where needed, since Python operations are faster on single numbers. Like these:
https://github.com/scikit-learn-contrib/scikit-learn-extra/blob/c1dfd25aa76e1d11655329ad04d3a263be8ca7eb/sklearn_extra/kernel_approximation/_fastfood.py#L80
https://github.com/scikit-learn-contrib/scikit-learn-extra/blob/c1dfd25aa76e1d11655329ad04d3a263be8ca7eb/sklearn_extra/kernel_approximation/_fastfood.py#L141
https://github.com/scikit-learn-contrib/scikit-learn-extra/blob/c1dfd25aa76e1d11655329ad04d3a263be8ca7eb/sklearn_extra/kernel_approximation/_fastfood.py#L146
https://github.com/scikit-learn-contrib/scikit-learn-extra/blob/c1dfd25aa76e1d11655329ad04d3a263be8ca7eb/sklearn_extra/kernel_approximation/_fastfood.py#L151
And maybe change the default tradeoff to 'mem', since it would be more consistent and faster.
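A quick illustration of the scalar overhead (timings are machine-dependent; this is only meant to show the direction of the difference):

import math
import time
import numpy as np

x = 1234.5678

# Python's math module avoids numpy's per-call ufunc overhead on scalars
tic = time.perf_counter()
for _ in range(1_000_000):
    math.sqrt(x)
t_math = time.perf_counter() - tic

tic = time.perf_counter()
for _ in range(1_000_000):
    np.sqrt(x)
t_np = time.perf_counter() - tic

print(f"math.sqrt: {t_math:.3f} s, np.sqrt: {t_np:.3f} s")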