Apriori is not Apriori. It is missing all the nice optimizations
The current apriori implementation is extremely slow.
It does not correctly implement the apriori-gen and pruning steps of Apriori, so it is not Apriori as published, but a naive approximation to it. The current code generates candidate tuples as (previous itemsets) x (items), which produces many false positives that the original Apriori would never generate in the first place (and hence would not need to count later). For good performance, it is crucial to generate only very few candidates.
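For illustration, here is a minimal sketch of the apriori-gen join + prune procedure from the original paper, assuming frequent k-itemsets are represented as sorted tuples (the function name and the representation are only for this example):

```python
from itertools import combinations


def apriori_gen(frequent_k):
    """Candidate generation as in Agrawal & Srikant (1994):
    join two frequent k-itemsets that share their first k-1 items,
    then prune candidates that have an infrequent k-subset."""
    frequent = set(frequent_k)  # frequent k-itemsets as sorted tuples
    candidates = []
    # Join step (a real implementation would group by prefix
    # instead of scanning all pairs):
    for a, b in combinations(sorted(frequent), 2):
        if a[:-1] == b[:-1]:  # same (k-1)-prefix
            c = a + (b[-1],)
            # Prune step: every k-subset of c must itself be frequent
            if all(c[:i] + c[i + 1:] in frequent for i in range(len(c))):
                candidates.append(c)
    return candidates
```

With frequent 2-itemsets `[(1, 2), (1, 3), (1, 4), (2, 3)]`, the join produces `(1, 2, 3)`, `(1, 2, 4)`, and `(1, 3, 4)`, and the prune step discards the last two because `(2, 4)` and `(3, 4)` are not frequent, so only `(1, 2, 3)` has to be counted against the data.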
As a result, this implementation benchmarks particularly poorly on a data set with 33,841 transactions and 1,587 items at minsupp=0.005 (note that apyori also does not correctly implement Apriori). I originally wanted to use minsupp=0.001, but eventually interrupted mlxtend because it was too slow, so these results are with a 5x higher minsupp. Experiments were run on Google Colab, so there are no guarantees on the reliability of these measurements, but they repeatedly gave similar results on a similar data set before.
| Implementation | Run time |
|---|---|
| mlxtend | 1 min 55 s |
| apyori | 4 min 2 s |
| efficient-apriori | 2 min 56 s |
| ELKI (Java) | 7 s |
| pyFIM Apriori | 149 ms |
This is not just because pyFIM is a C library, but also because it actually implements the Apriori algorithm with some of its clever optimizations (cf. apriori-gen in the original paper, i.e., join + prune, as well as optimizations to the counting in each iteration). The ELKI implementation does not implement the hash tree of the original Apriori, so the 7 seconds could likely be improved further... And yes, I know that FP-growth is much better; pyFIM has a very nice implementation of it, too, and it took just 90 ms for this data.
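For comparison, a rough sketch of how the two libraries are invoked, assuming pyFIM's `fim.apriori` interface (where `supp` is given in percent) and mlxtend's one-hot DataFrame input; the toy transactions are only for illustration:

```python
import pandas as pd
from fim import apriori as fim_apriori  # pyFIM
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori as mlx_apriori

transactions = [["a", "b", "c"], ["a", "c"], ["b", "d"]]  # toy data

# pyFIM works directly on the raw transactions; supp is a percentage
fim_itemsets = fim_apriori(transactions, supp=50)

# mlxtend needs a one-hot encoded DataFrame first
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions),
                  columns=te.columns_)
mlx_itemsets = mlx_apriori(df, min_support=0.5, use_colnames=True)
```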
For example, at length 3 there are 2,802 frequent itemsets containing 719 distinct items. The mlxtend code will generate 2802 * 719 = 2,014,638 candidates from them. The real Apriori algorithm joins 40,095 candidates, which are pruned to just 5,737 candidates before going back to the data. So in this step, the mlxtend code has to check about 350x as many candidates as the original Apriori algorithm would. This, of course, has a major impact on runtime.
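Spelled out with the counts above:

```python
# Candidate counts for the length-3 -> length-4 step quoted above
naive = 2802 * 719   # mlxtend: (frequent itemsets) x (items)
joined = 40095       # apriori-gen join step
pruned = 5737        # after subset pruning

print(naive)           # 2014638
print(naive / joined)  # ~50x more than the join alone produces
print(naive / pruned)  # ~351x more than the pruned candidate set
```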
Yes, it is currently a naive implementation and could be optimized.
@kno10 Are your benchmark scripts and data sets available somewhere?
A collection of standard benchmark data sets for FIM is available here: http://fimi.uantwerpen.be/data/
Any effort to benchmark these implementations should probably start with this widely used collection.
That's a nice collection, thanks for sharing! I am not sure what the exact sharing policies are, but it would be interesting to incorporate one or two of the smaller data sets into the unit tests as well. To aid potential improvements of the existing implementations, a developer script could be added to mlxtend.frequent_patterns for running a set of benchmarks based on these datasets. This could even be a function in mlxtend.frequent_patterns that users call with different datasets, fetched from the website above, to assess the speed of these implementations on their local machine.
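A rough sketch of what such a helper could look like; the function name and the example dataset URL are illustrative, not an agreed-upon API, and it assumes the FIMI file format of one space-separated transaction per line:

```python
import time
import urllib.request

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori


def benchmark_fimi(url, min_support=0.005):
    """Fetch a FIMI data set (one space-separated transaction
    per line) and time mlxtend's apriori on it."""
    raw = urllib.request.urlopen(url).read().decode("utf-8")
    transactions = [line.split() for line in raw.splitlines() if line.strip()]

    # One-hot encode the transactions, as mlxtend's apriori expects
    te = TransactionEncoder()
    df = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

    start = time.perf_counter()
    itemsets = apriori(df, min_support=min_support, use_colnames=True)
    runtime = time.perf_counter() - start
    return runtime, len(itemsets)


# e.g. benchmark_fimi("http://fimi.uantwerpen.be/data/retail.dat")
```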