[BUG-REPORT] cannot call `unique` with multiple selections

Open Ben-Epstein opened this issue 1 year ago • 0 comments

Thank you for reaching out and helping us improve Vaex!

Before you submit a new Issue, please read through the documentation. Also, make sure you search through the Open and Closed Issues - your problem may already be discussed or addressed.

Description

Please provide a clear and concise description of the problem. This should contain all the steps needed to reproduce the problem. A minimal code example that exposes the problem is very appreciated.

Software information

  • Vaex version (import vaex; vaex.__version__): 4.12.0
  • Vaex was installed via: pip / conda-forge / from source
  • OS: macOS

Additional information

import vaex

df = vaex.from_arrays(
    cbo_cluster_0=[True, False, True, True, True], 
    cbo_cluster_1=[False, False, True, False, True], 
    cbo_cluster_2=[False, False, False, False, False], 
    cbo_cluster_3=[True, True, True, True, True], 
    gold=["a","a","b","c","b"], 
    id=list(range(5))
)

df_copy = df.copy()
selection = []
# Assumption: `VaexColumn.cbo_cluster` (not defined in this snippet) is the string "cbo_cluster".
avl_clusters = df_copy.get_column_names(regex="cbo_cluster_")
for cluster in avl_clusters:
    df_copy.select(f"{cluster}==True", name=f"select_{cluster}")
    selection.append(f"select_{cluster}")

df_copy["id"].mean(selection=selection)  # fine
df_copy.count(selection=selection)  # fine
df_copy["gold"].unique(selection=selection[0])  # fine, returns a list
df_copy["gold"].unique(selection=selection)  # fails

  File ".venv/lib/python3.9/site-packages/vaex/execution.py", line 346, in execute_generator
    run = Run(tasks)
  File ".venv/lib/python3.9/site-packages/vaex/execution.py", line 87, in __init__
    selections = list(set(selection for task in tasks_df for selection in task.selections))
TypeError: unhashable type: 'list'
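
The `TypeError` can be reproduced without Vaex. The sketch below only mimics the set comprehension shown in the traceback. The shape of `task.selections` for the single and multiple selection cases is an assumption inferred from the error message, not read from the Vaex source, and `collect` is a hypothetical stand-in name.

```python
# Stand-ins for `task.selections` as seen by the failing line in execution.py.
# ASSUMPTION: a string selection is stored as-is, while a list of selections
# is stored as one nested list. Inferred from the traceback only.
single = ["select_cbo_cluster_0"]
multiple = [["select_cbo_cluster_0", "select_cbo_cluster_1"]]


def collect(tasks_selections):
    # Same shape as the failing line: build a set from every task's selections.
    return list(set(sel for selections in tasks_selections for sel in selections))


ok = collect([single])  # strings are hashable, so this works

try:
    collect([multiple])  # the inner list is not hashable
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)     # ['select_cbo_cluster_0']
print(error)  # unhashable type: 'list'
```

If that assumption holds, any element of `task.selections` that is itself a list will trigger the error as soon as the set is built.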

I'm assuming the failure comes from the fact that it's returning a list of lists, so the inner lists can't be added to the `set` built in `execution.py`.
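
Until this is fixed, one possible workaround is to call `unique` once per named selection, since the single-selection call above works. This is only a sketch: `unique_per_selection` is a hypothetical helper, not part of Vaex, and it makes one pass per selection instead of a single combined pass.

```python
def unique_per_selection(df, column, selections):
    """Hypothetical helper: call `unique` once per named selection.

    Relies only on `df[column].unique(selection=name)`, which is the
    single-selection call reported as working in this issue.
    """
    return {name: df[column].unique(selection=name) for name in selections}
```

With the DataFrame from the example, `unique_per_selection(df_copy, "gold", selection)` would return a dict mapping each selection name to its list of unique values.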

Ben-Epstein, Sep 01 '22 18:09