machinelearninginaction
Fix aprioriGen in the Apriori algorithm
When a set is converted to a list, the order of the elements is not guaranteed.
For example, suppose the input to aprioriGen is ([{0,1}, {1,2}, {2,0}], 3). If each list keeps its elements in the order written here, no two lists have the same leading element, so the candidate {0,1,2} is never generated.
I think the optimisation only makes sense if you sort the lists first.
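To illustrate the point, here is a minimal sketch of the join step with the sort applied before the slice. It assumes aprioriGen takes a list of frozensets and the target size k, as in the example above. The name `apriori_gen_sorted` and its variable names are hypothetical and are not taken from the repository.

```python
def apriori_gen_sorted(Lk, k):
    """Join (k-1)-itemsets whose first k-2 items match in sorted order.

    Hypothetical sketch: Lk is assumed to be a list of frozensets.
    """
    candidates = []
    for i in range(len(Lk)):
        for j in range(i + 1, len(Lk)):
            # Sort first, then slice, so the prefix does not depend on
            # the order in which the set happens to be iterated.
            prefix_i = sorted(Lk[i])[:k - 2]
            prefix_j = sorted(Lk[j])[:k - 2]
            if prefix_i == prefix_j:
                candidates.append(Lk[i] | Lk[j])  # set union
    return candidates


L2 = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 0})]
C3 = apriori_gen_sorted(L2, 3)
print(C3)
```

With the sort in place, {0,1} and {2,0} both have the prefix [0], so their union {0,1,2} is produced exactly once, whatever order the sets are iterated in.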
I tried this input, [{0,1},{1,2},{2,0}], with Python 3.6 in PyCharm. The sets came out in sorted order automatically, as [{0,1},{1,2},{0,2}], and the set {0,1,2} was appended to retList. Even so, sorting the lists before comparing lists[:k-2] is better than the original.