Fpgrowth fails with only one transaction

Open yhdelgado opened this issue 1 year ago • 0 comments

I have a large dataset of real data. After several attempts, I traced the failure to a single transaction. I isolated that transaction and re-ran the algorithm on it alone, and it fails every time. I can't understand why it fails at this point, even with the transaction isolated.

Example:

from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns.fpgrowth import fpgrowth
import pandas as pd

transactions = [[
    114367, 116953, 123213, 125589, 128047, 128579, 130407, 132025, 132082,
    134190, 136097, 136098, 136181, 136357, 136656, 136658, 136659, 136992,
    137180, 137181, 137395, 138215, 139339, 139520, 139551, 140008, 140012,
    140021
]]

def get_fpgrowth_associated_products(product_name):
    # Filter out transactions that don't include the target product
    filtered_transactions = [t for t in transactions if product_name in t]
    te = TransactionEncoder()
    te_ary = te.fit(filtered_transactions).transform(filtered_transactions)

    # Convert the one-hot encoded array into a pandas DataFrame
    df = pd.DataFrame(te_ary, columns=te.columns_)

    # Compute frequent itemsets using the FP-growth algorithm (min_support = 0.5)
    freq_itemsets = fpgrowth(df, min_support=0.5, use_colnames=True)

    itemsets = set(freq_itemsets.itemsets)

    # Find the sets that include the target product
    target_sets = [s for s in itemsets if product_name in s]

    # Combine the other items from those sets into a single set
    associated_items = set()
    for s in target_sets:
        associated_items |= s - {product_name}

    return list(associated_items)

get_fpgrowth_associated_products(136181)
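
A possible explanation, offered as a sketch and not a confirmed diagnosis: when the input is a single transaction, every item, and therefore every combination of items, has support 1.0. With min_support=0.5, all non-empty subsets of the 28 items would then qualify as frequent itemsets. The helper below only counts those subsets using the standard library; it does not call mlxtend, and the name `count_candidate_itemsets` is hypothetical.

```python
from math import comb


def count_candidate_itemsets(n_items, max_len=None):
    """Count the non-empty itemsets that can be drawn from n_items items.

    If max_len is given, only itemsets with at most max_len items are counted.
    """
    upper = n_items if max_len is None else min(max_len, n_items)
    return sum(comb(n_items, k) for k in range(1, upper + 1))


n_items = 28  # number of items in the isolated transaction above

# Every non-empty subset: 2**28 - 1
print(count_candidate_itemsets(n_items))             # 268435455

# Only single items and pairs: 28 + 378
print(count_candidate_itemsets(n_items, max_len=2))  # 406
```

If the number of itemsets is indeed the cause, capping the itemset size with the `max_len` argument of `fpgrowth`, for example `fpgrowth(df, min_support=0.5, use_colnames=True, max_len=2)`, should keep the result small. Whether a cap of 2 suits this use case is an assumption.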

Versions

MLxtend 0.22.0
Linux-5.19.0-43-generic-x86_64-with-glibc2.35
Python 3.8.16
Scikit-learn 1.2.2
NumPy 1.24.3
SciPy 1.9.3

yhdelgado, Jun 06 '23 22:06