feat: add support for ibis.date(y, m, d) using deferred

Open p-a-a-a-trick opened this issue 3 years ago • 4 comments

ibis.date errors out if you refer to parent expressions using _.

Expected: you can pass _['year'], _['month'], _['day'] to ibis.date to get a date expression, just as you would with t['year'], t['month'], t['day']

Actual: a TypeError is raised

MWE:

import ibis
from ibis import _
import pandas as pd

cols = ['date_id', 'date_year', 'date_month', 'date_day']
vals = [[1, 2021, 8, 4], [2, 2021, 8, 26], [3, 2022, 8, 3], [4, 2022, 8, 25]]

df = pd.DataFrame(vals, columns=cols)

conn = ibis.pandas.connect({'dates': df})
dates_base = conn.table("dates")

# Works
dates_works = (
    dates_base
    .mutate(date_value=ibis.date(dates_base['date_year'], dates_base['date_month'], dates_base['date_day']))
)

# Does not work
dates_notworks = (
    dates_base
    .mutate(date_value=ibis.date(_['date_year'], _['date_month'], _['date_day']))
)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [1], line 22
     14 dates_works = (
     15     dates_base
     16     .mutate(date_value=ibis.date(dates_base['date_year'], dates_base['date_month'], dates_base['date_day']))
     17 )
     19 # Does not works
     20 dates_notworks = (
     21     dates_base
---> 22     .mutate(date_value=ibis.date(_['date_year'], _['date_month'], _['date_day']))
     23 )

File /usr/lib/python3.10/functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
    885 if not args:
    886     raise TypeError(f'{funcname} requires at least '
    887                     '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)

TypeError: _date_from_deferred() takes 1 positional argument but 3 were given

p-a-a-a-trick, Sep 19 '22 12:09

Ah, so this is only implemented for the use case of ibis.date(thing) -> thing.date(), extracting the date from a non-date column. We haven't implemented the variadic case of date. This would be a new feature.
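To illustrate where the error in the traceback comes from, here is a minimal sketch that does not use ibis at all. It assumes the dispatch is set up roughly like this: a `functools.singledispatch` function picks an implementation from the type of the first argument only, then forwards every argument to it. The `Deferred` class and the `date` function below are hypothetical stand-ins, not the real ibis code; only the name `_date_from_deferred` is taken from the traceback.

```python
from functools import singledispatch


class Deferred:
    """Hypothetical stand-in for the deferred `_` object."""


@singledispatch
def date(value, *args):
    # Generic implementation: accepts one argument or (year, month, day).
    return ("date", value, *args)


@date.register
def _date_from_deferred(value: Deferred):
    # Only the single-argument form is registered for Deferred inputs.
    return ("deferred_date", value)


d = Deferred()

# Non-deferred inputs fall through to the generic implementation.
eager = date(2021, 8, 4)

# Dispatch looks only at type(d), then passes all three arguments along,
# so the one-argument Deferred implementation rejects the call.
try:
    date(d, d, d)
except TypeError as exc:
    message = str(exc)
```

Under these assumptions, `message` has the same shape as the error reported above: the one-argument implementation was chosen, but three arguments were passed.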

cpcloud, Sep 19 '22 12:09

This is a tricky, but possibly fun issue to address. Deferred instances are not supported as inputs to ops.Nodes because ops.Nodes eagerly check their inputs' type and Deferred instances' types are not known until their .resolve method is called.

It's not clear to me exactly how to make these two things--Deferred and ops.Node behavior--work well together.

cpcloud, Sep 19 '22 16:09

@cpcloud Postgres works in some cases. Using the same data uploaded to Postgres:

dates_base = conn.table("dates")

dates = (
    dates_base
    .mutate(date_value=ibis.date(dates_base['date_year'], dates_base['date_month'], dates_base['date_day']))
)

This yields date_value with the correct date value.

p-a-a-a-trick, Sep 19 '22 16:09

xref #4382

cpcloud, Sep 20 '22 20:09

Closing in favor of #4382.

cpcloud, Mar 16 '23 13:03