fp-growth
Missing patterns
When using find_frequent_patterns(), certain itemsets are missing from the returned patterns even though their frequency meets the support threshold provided, which in turn causes certain rules to be missing. For example, consider the following dataset:
Transaction ID | Items_bought |
---|---|
T100 | {M, O, N, K, E, Y} |
T200 | {D, O, N, K, E, Y} |
T300 | {M, A, K, E} |
T400 | {M, U, C, K, Y} |
T500 | {C, O, O, K, I, E} |
I used the following code to find frequent patterns:
import pyfpgrowth as fp
items_bought = [['M','O','N','K','E','Y'], ['D','O','N','K','E','Y'], ['M','A','K','E'],
['M','U','C','K','Y'], ['C','O','O','K','I','E']]
min_support_count = 3
min_conf = 0.8
freq_patterns = fp.find_frequent_patterns(items_bought, min_support_count)
print(freq_patterns)
I get the following output:
{('M',): 3, ('K', 'M'): 3, ('Y',): 3, ('K', 'Y'): 3, ('O',): 4, ('K', 'O'): 4, ('E', 'O'): 4, ('E', 'K'): 4, ('E', 'K', 'O'): 4, ('K',): 5}
This output is missing the itemset ('E',), which has a frequency of 4 (above the threshold of 3). As a result, two rules are also missing when generating association rules. Please look into this issue!
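As a cross-check, here is a minimal brute-force sketch that counts the support of every itemset directly from the transactions, without using pyfpgrowth. The function name brute_force_frequent_itemsets is hypothetical and not part of the library. It assumes each item counts at most once per transaction, so the duplicate 'O' in T500 is counted once.

```python
from collections import Counter
from itertools import combinations

def brute_force_frequent_itemsets(transactions, min_support_count):
    """Count every itemset by enumeration (hypothetical helper, not pyfpgrowth)."""
    counts = Counter()
    for transaction in transactions:
        # Assumption: an item counts once per transaction, so duplicates are dropped.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Keep only itemsets whose count meets the minimum support count.
    return {itemset: n for itemset, n in counts.items() if n >= min_support_count}

items_bought = [['M', 'O', 'N', 'K', 'E', 'Y'], ['D', 'O', 'N', 'K', 'E', 'Y'],
                ['M', 'A', 'K', 'E'], ['M', 'U', 'C', 'K', 'Y'],
                ['C', 'O', 'O', 'K', 'I', 'E']]
expected = brute_force_frequent_itemsets(items_bought, 3)
print(expected)
```

Under that assumption, this check includes ('E',) with a count of 4. It also gives a count of 3 for ('O',), ('K', 'O'), ('E', 'O') and ('E', 'K', 'O'), where the output above reports 4. That difference may come from the repeated 'O' in T500 being counted twice, but I have not confirmed this against the library source.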
Did you find a solution? I have the same problem.