ibis icon indicating copy to clipboard operation
ibis copied to clipboard

feat: Table.pack()

Open NickCrews opened this issue 1 year ago • 0 comments

Is your feature request related to a problem?

In several places in my app, I have a 1:many relationship, and I want the manys to be rolled up into a single row:

from collections.abc import Iterable

import ibis
from ibis.expr.types import StructColumn, Table

from ibis import _

ibis.options.interactive = True
ibis.options.repr.interactive.max_depth = 5


def pack(t: Table, columns: str | Iterable[str] | None = None) -> StructColumn:
    """Pack several columns together into a single column of type struct.

    Inverse of Table.unpack(struct_col_name).
    """
    if columns is None:
        columns = t.columns
    columns = ibis.util.promote_list(columns)
    return ibis.struct({c: t[c] for c in columns})


orders = ibis.memtable(
    {
        "customer_id": [1, 2, 3, 1, 2, 3],
        "price": [100, 200, 300, 400, 500, 600],
        "SKU": ["A", "B", "A", "B", "A", "B"],
    }
)
customers = ibis.memtable(
    {
        "customer_id": [1, 2, 3],
        "name": ["Alice", "Bob", "Charlie"],
    }
)

orders_by_customer = (
    orders.select("customer_id", order=pack(_, ["price", "SKU"]))
    .group_by("customer_id")
    .agg(orders=_.order.collect())
)
customers.left_join(orders_by_customer, "customer_id").drop("customer_id_right")
# ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
# ┃ customer_id ┃ name    ┃ orders                                                   ┃
# ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
# │ int64       │ string  │ array<struct<price: int64, SKU: string>>                 │
# ├─────────────┼─────────┼──────────────────────────────────────────────────────────┤
# │           1 │ Alice   │ [{'price': 100, 'SKU': 'A'}, {'price': 400, 'SKU': 'B'}] │
# │           2 │ Bob     │ [{'price': 200, 'SKU': 'B'}, {'price': 500, 'SKU': 'A'}] │
# │           3 │ Charlie │ [{'price': 300, 'SKU': 'A'}, {'price': 600, 'SKU': 'B'}] │
# └─────────────┴─────────┴──────────────────────────────────────────────────────────┘

Describe the solution you'd like

It would be cool if Table.pack() was built in to ibis.

Possibly could improve this API to accept selectors, or Expressions instead of column names.

What version of ibis are you running?

main

What backend(s) are you using, if any?

duckdb

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

NickCrews avatar Jan 31 '24 18:01 NickCrews