
the whole algorithm is totally a mistake!

Open TtCWH opened this issue 6 years ago • 3 comments

totally!

Dec 09 '18 04:12 TtCWH

I don't agree. I have used this module with the modifications that I mentioned to avoid rule overwrite, and it is very fast and accurate, allowing a high degree of granularity in generating association rules even in very large databases. @evandempsey I propose to fork this module to include the modifications that I suggested re rule overwrite. Have you any comment on this?

Dec 10 '18 12:12 Rbain2

> I don't agree. I have used this module with the modifications that I mention to avoid rule overwrite and it is very fast and accurate allowing a high degree of granularity in generating association rules even in very large databases. @evandempsey I propose to fork this module to include the modifications that I suggested re rule overwrite. Have you any comment on this?

I have tried several versions of the FP-growth algorithm but failed to find a valid module. Could you share your version with me? Thank you very much!

Dec 11 '18 06:12 TtCWH

OK. My code has changed a lot from the original algorithm, so it took me a while to work out exactly what I am currently using. I did start out using Evan's algorithm, but I am now in fact using the frequent item module found at the link below, combined with Evan's association rules module, modified as per below to avoid rule overwrite.

https://github.com/vukk/amdm-fpgrowth-python/blob/master/fpgrowth.py

Modified association rule module below

    import itertools


    def generate_association_rules(patterns, confidence_threshold):
        """
        Given a set of frequent itemsets, return a dict of association rules
        in the form {(left): [[(right), confidence], ...]}
        """
        rules = {}
        for itemset in patterns.keys():
            upper_support = patterns[itemset]

            for i in range(1, len(itemset)):
                for antecedent in itertools.combinations(itemset, i):
                    antecedent = tuple(sorted(antecedent))
                    consequent = tuple(sorted(set(itemset) - set(antecedent)))

                    if antecedent in patterns:
                        lower_support = patterns[antecedent]
                        confidence = float(upper_support) / lower_support

                        if confidence >= confidence_threshold:
                            # Keep every rule for this antecedent instead of
                            # overwriting the one stored earlier.
                            rule1 = [consequent, confidence]
                            if antecedent in rules:
                                rules[antecedent].append(rule1)
                            else:
                                # Start a list of rules so later appends
                                # keep the same [consequent, confidence] shape.
                                rules[antecedent] = [rule1]

        return rules
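For anyone wondering what "rule overwrite" means here, below is a minimal self-contained sketch of the idea. It is not the module's actual code. The function name `rules_without_overwrite` and the support counts are hypothetical, made up for illustration. It assumes `patterns` maps sorted item tuples to support counts, and it collects all rules for an antecedent in a list via `collections.defaultdict`.

```python
import itertools
from collections import defaultdict


def rules_without_overwrite(patterns, confidence_threshold):
    """Return {antecedent: [(consequent, confidence), ...]}."""
    rules = defaultdict(list)
    for itemset, upper_support in patterns.items():
        for i in range(1, len(itemset)):
            # Combinations of a sorted itemset are themselves sorted tuples.
            for antecedent in itertools.combinations(sorted(itemset), i):
                if antecedent not in patterns:
                    continue
                consequent = tuple(sorted(set(itemset) - set(antecedent)))
                confidence = upper_support / patterns[antecedent]
                if confidence >= confidence_threshold:
                    # Append instead of assigning, so an antecedent that
                    # appears in several itemsets keeps all of its rules.
                    rules[antecedent].append((consequent, confidence))
    return dict(rules)


# Hypothetical support counts, keyed by sorted item tuples.
patterns = {
    ("a",): 4,
    ("b",): 3,
    ("c",): 3,
    ("a", "b"): 3,
    ("a", "c"): 3,
}
rules = rules_without_overwrite(patterns, 0.7)
print(rules[("a",)])
```

With a plain `rules[antecedent] = (consequent, confidence)` assignment, the antecedent `("a",)` would end up with only the rule written last. With the list, both `a -> b` and `a -> c` are kept.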

Dec 15 '18 14:12 Rbain2