fp-growth

Missing patterns

Open. Aditya-1500 opened this issue 4 years ago • 1 comment

When using find_frequent_patterns(), certain itemsets are missing from the returned patterns even though their frequency is above the support threshold provided, which in turn causes certain rules to be missing. For example, consider the following dataset:

Transaction ID Items_bought
T100 {M, O, N, K, E, Y}
T200 {D, O, N, K, E, Y}
T300 {M, A, K, E}
T400 {M, U, C, K, Y}
T500 {C, O, O, K, I, E}

I used the following code to find frequent patterns:

import pyfpgrowth as fp
items_bought = [['M','O','N','K','E','Y'], ['D','O','N','K','E','Y'], ['M','A','K','E'], 
                ['M','U','C','K','Y'], ['C','O','O','K','I','E']]
min_support_count = 3
min_conf = 0.8
freq_patterns = fp.find_frequent_patterns(items_bought, min_support_count)
print(freq_patterns)

I get the following output:

{('M',): 3, ('K', 'M'): 3, ('Y',): 3, ('K', 'Y'): 3, ('O',): 4, ('K', 'O'): 4, ('E', 'O'): 4, ('E', 'K'): 4, ('E', 'K', 'O'): 4, ('K',): 5}

This output is missing the itemset ('E',), which has a frequency of 4 (above the threshold of 3). As a result, two rules are also missing when generating association rules. Please look into the issue!!
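One way to cross-check the report independently of pyfpgrowth is to count itemsets by plain enumeration. The sketch below is only an illustration: `brute_force_frequent_itemsets` is a hypothetical helper, not part of pyfpgrowth, and it assumes each transaction should be treated as a set (so the repeated 'O' in T500 is counted once). Because of that assumption, counts for itemsets containing 'O' may differ from the output shown above.

```python
from collections import Counter
from itertools import combinations


def brute_force_frequent_itemsets(transactions, min_support_count):
    """Count every itemset by direct enumeration.

    Hypothetical helper for cross-checking; not part of pyfpgrowth.
    """
    counts = Counter()
    for transaction in transactions:
        # Assumption: duplicates within a transaction count once.
        # Sorting gives each itemset a single canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Keep only itemsets that meet the support threshold.
    return {s: n for s, n in counts.items() if n >= min_support_count}


items_bought = [
    ['M', 'O', 'N', 'K', 'E', 'Y'],
    ['D', 'O', 'N', 'K', 'E', 'Y'],
    ['M', 'A', 'K', 'E'],
    ['M', 'U', 'C', 'K', 'Y'],
    ['C', 'O', 'O', 'K', 'I', 'E'],
]

expected = brute_force_frequent_itemsets(items_bought, 3)
for itemset in sorted(expected, key=lambda s: (len(s), s)):
    print(itemset, expected[itemset])
```

Enumeration is exponential in transaction length, so this is only practical for small datasets like this one. Here it lists ('E',) with a count of 4, which matches the count stated in the report.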

Aditya-1500, Feb 15 '21 16:02

Did you find a solution? I have the same problem.

npk7264, Apr 04 '23 13:04