siuba
An object should be able to raise an error, if it doesn't have a verb implementation
Note: if implemented, this change seems impactful enough that an ADR should be written for it
For instance, the example below creates a LazyTbl for the mtcars data. Because semi_join is not currently implemented for SQL, calling semi_join(tbl_mtcars, tbl_mtcars, {"cyl": "cyl"}) returns a Pipeable instead of raising an error.
Ideally, an object should be able to declare that, by default, it wants an error raised when it has no method to dispatch for a verb.
Options
1. Register globally
2. Register on each verb
3. Search for custom method (preferred choice)
For option (3), siu could search for a (class?) method named _siu_dispatch_default and, if it exists, call it.
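As a rough illustration of option (3), here is a minimal sketch built on the standard library's functools.singledispatch rather than siuba's own singledispatch2. The verb decorator, the FakeLazyTbl class, and the hook signature (verb function first, then the original arguments) are all assumptions made up for this sketch; only the method name _siu_dispatch_default comes from the proposal above.

```python
from functools import singledispatch, wraps


def verb(default):
    """Hypothetical stand-in for siuba's verb decorator (not the real singledispatch2)."""
    dispatcher = singledispatch(default)

    @wraps(default)
    def wrapper(obj, *args, **kwargs):
        impl = dispatcher.dispatch(type(obj))
        # Only consult the hook when dispatch fell through to the generic default.
        if impl is default:
            hook = getattr(type(obj), "_siu_dispatch_default", None)
            if hook is not None:
                return hook(wrapper, obj, *args, **kwargs)
        return impl(obj, *args, **kwargs)

    # Expose register so specific implementations can still be added.
    wrapper.register = dispatcher.register
    return wrapper


class FakeLazyTbl:
    """Hypothetical table class that opts in to strict dispatch."""

    @classmethod
    def _siu_dispatch_default(cls, verb_func, obj, *args, **kwargs):
        # Assumed hook signature: the verb that failed to dispatch, then its arguments.
        raise NotImplementedError(
            f"{verb_func.__name__} is not implemented for {cls.__name__}"
        )


@verb
def semi_join(left, right, on=None):
    # Generic fallback; in siuba this is where a Pipeable would be produced.
    return "fallback"
```

With this in place, semi_join(FakeLazyTbl(), FakeLazyTbl(), {"cyl": "cyl"}) raises NotImplementedError, while objects without the hook still reach the generic fallback. Because the hook is only consulted after dispatch fails, registering a real implementation for the class takes precedence over it.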
Impact
Overall, I think this will be a very important behavior. It brings stability to pipes, which provide very clean syntax but can fail in confusing or unintuitive ways. The approach is similar to how Jupyter notebooks look for special methods such as _repr_html_ on an object, so hopefully it will not be too surprising.
Example
from sqlalchemy import create_engine
from siuba.data import mtcars
import pandas as pd
engine = create_engine('sqlite:///:memory:', echo=False)
# note that mtcars is a pandas DataFrame
mtcars.to_sql('mtcars', engine)
from siuba import semi_join
from siuba.sql import LazyTbl, show_query, collect
tbl_mtcars = LazyTbl(engine, 'mtcars')
# semi_join has no SQL implementation, so this returns a Pipeable instead of raising
semi_join(tbl_mtcars, tbl_mtcars, {"cyl": "cyl"})
Another option for siuba classes, like LazyTbl, is to have them subclass something that raises an error when dispatched on.
A nice way to do this would be to define an abstract base class that uses ABCMeta as its metaclass, and register the table classes with it:
from abc import ABCMeta
class SiuTable(metaclass=ABCMeta):
    pass
# in siuba.dply.verbs
import pandas as pd
SiuTable.register(pd.DataFrame)
# in siuba.sql.verbs
SiuTable.register(LazyTbl)
Then, the default for singledispatch2 could be to dispatch on SiuTable and raise an error...
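To make that last step concrete, here is a minimal sketch using the standard library's functools.singledispatch in place of singledispatch2 (whose internals are not shown here). FakeLazyTbl and FakeFrame are hypothetical stand-ins for LazyTbl and pd.DataFrame so that the snippet runs without siuba, SQLAlchemy, or pandas installed.

```python
from abc import ABCMeta
from functools import singledispatch


class SiuTable(metaclass=ABCMeta):
    pass


class FakeLazyTbl:
    """Hypothetical stand-in for siuba.sql.LazyTbl."""


class FakeFrame:
    """Hypothetical stand-in for pd.DataFrame."""


# Virtual subclassing: neither class needs to inherit from SiuTable.
SiuTable.register(FakeLazyTbl)
SiuTable.register(FakeFrame)


@singledispatch
def semi_join(left, right, on=None):
    # Generic fallback for arbitrary objects (where siuba would build a Pipeable).
    return "pipeable"


@semi_join.register(SiuTable)
def _semi_join_unimplemented(left, right, on=None):
    # Any registered table type without its own implementation lands here.
    raise NotImplementedError(
        f"semi_join is not implemented for {type(left).__name__}"
    )


@semi_join.register(FakeFrame)
def _semi_join_frame(left, right, on=None):
    # A concrete implementation wins over the SiuTable default.
    return "joined"
```

Here semi_join(FakeFrame(), FakeFrame()) uses the concrete implementation, semi_join(FakeLazyTbl(), FakeLazyTbl()) raises NotImplementedError, and non-table objects still reach the generic fallback. This works because singledispatch takes virtual subclasses registered on an ABC into account when resolving a type.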
See
- #257
- #350