fp-growth
the whole algorithm is totally a mistake!
totally!
I don't agree. I have used this module with the modifications I mentioned to avoid rule overwrite, and it is very fast and accurate, allowing a high degree of granularity in generating association rules even in very large databases. @evandempsey, I propose to fork this module to include the modifications that I suggested regarding rule overwrite. Do you have any comments on this?
I have tried several versions of the FP-growth algorithm but failed to find a working module. Could you share your version with me? Thank you very much!
OK. My code has changed a lot from the original algorithm, so it took me a while to work out exactly what I am currently using. I did start out using Evan's algorithm, but I am now in fact using the frequent itemset module found at the link below, combined with Evan's association rules module, modified as shown below to avoid rule overwrite.
https://github.com/vukk/amdm-fpgrowth-python/blob/master/fpgrowth.py
Modified association rule module below
import itertools


def generate_association_rules(patterns, confidence_threshold):
    """
    Given a dict of frequent itemsets mapped to their support counts,
    return a dict of association rules in the form
    {(left): [[(right), confidence], ...]}
    """
    rules = {}
    for itemset in patterns.keys():
        # Support of the full itemset (numerator of the confidence).
        upper_support = patterns[itemset]

        for i in range(1, len(itemset)):
            for antecedent in itertools.combinations(itemset, i):
                antecedent = tuple(sorted(antecedent))
                consequent = tuple(sorted(set(itemset) - set(antecedent)))

                if antecedent in patterns:
                    lower_support = patterns[antecedent]
                    confidence = float(upper_support) / lower_support

                    if confidence >= confidence_threshold:
                        rule1 = [consequent, confidence]
                        # Keep a list of rules per antecedent so that a later
                        # rule does not overwrite an earlier one.
                        if antecedent in rules:
                            rules[antecedent].append(rule1)
                        else:
                            rules[antecedent] = [rule1]
    return rules
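
To illustrate what "rule overwrite" means here, the following is a minimal sketch rather than the module's actual code. It assumes the overwrite comes from storing one rule per antecedent with a plain dict assignment. The helper `candidate_rules` and the `patterns` data are hypothetical, made up for this example; the confidence calculation follows the one in the code above.

```python
import itertools


def candidate_rules(patterns, confidence_threshold):
    """Yield (antecedent, consequent, confidence) for every rule that passes.

    Hypothetical helper for illustration; `patterns` maps sorted item tuples
    to support counts.
    """
    for itemset, upper_support in patterns.items():
        for i in range(1, len(itemset)):
            for antecedent in itertools.combinations(itemset, i):
                antecedent = tuple(sorted(antecedent))
                consequent = tuple(sorted(set(itemset) - set(antecedent)))
                if antecedent in patterns:
                    confidence = float(upper_support) / patterns[antecedent]
                    if confidence >= confidence_threshold:
                        yield antecedent, consequent, confidence


# Made-up frequent itemsets with support counts (not from a real dataset).
patterns = {
    ("a",): 4,
    ("b",): 3,
    ("c",): 3,
    ("a", "b"): 3,
    ("a", "c"): 3,
}

# One slot per antecedent: a later rule replaces an earlier one.
overwriting = {}
for antecedent, consequent, confidence in candidate_rules(patterns, 0.7):
    overwriting[antecedent] = (consequent, confidence)

# A list per antecedent: every rule that passes the threshold is kept.
appending = {}
for antecedent, consequent, confidence in candidate_rules(patterns, 0.7):
    appending.setdefault(antecedent, []).append((consequent, confidence))

print(overwriting[("a",)])
print(appending[("a",)])
```

With this made-up data, the antecedent `("a",)` has two rules that pass the 0.7 threshold (a → b and a → c, each with confidence 0.75). The overwriting dict ends up holding only the last one it saw, while the appending dict keeps both.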