Cleanup how args and kwargs are passed in `_fast_slow_function_call`
Description
I think kwargs are being subsumed by args in _fast_slow_function_call when they shouldn't.
Checklist
- [ ] I am familiar with the Contributing Guidelines.
- [ ] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.
@Matt711 Could you explain what the issue is your are seeing?
In _fast_slow_function_call the kwargs are being included in args. I think this makes it so that kwargs (an therefore slow_kwargs and fast_kwargs) are always empty.
It's still a little unclear how this bug would manifest. For example, this still raises
In [1]: %load_ext cudf.pandas
In [2]: import pandas as pd
In [3]: pd.Timestamp(2020, 1, 1, year=2021)
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:896, in _fast_slow_function_call(func, *args, **kwargs)
891 with nvtx.annotate(
892 "EXECUTE_FAST",
893 color=_CUDF_PANDAS_NVTX_COLORS["EXECUTE_FAST"],
894 domain="cudf_pandas",
895 ):
--> 896 fast_args, fast_kwargs = _fast_arg(args), _fast_arg(kwargs)
897 result = func(*fast_args, **fast_kwargs)
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:1046, in _fast_arg(arg)
1045 seen: set[int] = set()
-> 1046 return _transform_arg(arg, "_fsproxy_fast", seen)
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:973, in _transform_arg(arg, attribute_name, seen)
971 if type(arg) is tuple:
972 # Must come first to avoid infinite recursion
--> 973 return tuple(_transform_arg(a, attribute_name, seen) for a in arg)
974 elif hasattr(arg, "__getnewargs_ex__"):
975 # Partial implementation of to reconstruct with
976 # transformed pieces
977 # This handles scipy._lib._bunch._make_tuple_bunch
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:973, in <genexpr>(.0)
971 if type(arg) is tuple:
972 # Must come first to avoid infinite recursion
--> 973 return tuple(_transform_arg(a, attribute_name, seen) for a in arg)
974 elif hasattr(arg, "__getnewargs_ex__"):
975 # Partial implementation of to reconstruct with
976 # transformed pieces
977 # This handles scipy._lib._bunch._make_tuple_bunch
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:958, in _transform_arg(arg, attribute_name, seen)
957 if typ is _Unusable:
--> 958 raise Exception("Cannot transform _Unusable")
959 return typ
Exception: Cannot transform _Unusable
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 pd.Timestamp(2020, 1, 1, year=2021)
File ~/mroeschke-cudf/python/cudf/cudf/pandas/_wrappers/pandas.py:131, in Timestamp_Timedelta__new__(cls, *args, **kwargs)
124 def Timestamp_Timedelta__new__(cls, *args, **kwargs):
125 # Call fast/slow constructor
126 # This takes care of running __init__ as well, but must be paired
(...)
129 # Timestamp & Timedelta don't always return same types as self,
130 # hence this method is needed.
--> 131 self, _ = _fast_slow_function_call(
132 lambda cls, args, kwargs: cls(*args, **kwargs),
133 cls,
134 args,
135 kwargs,
136 )
137 return self
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:941, in _fast_slow_function_call(func, *args, **kwargs)
939 slow_args, slow_kwargs = _slow_arg(args), _slow_arg(kwargs)
940 with disable_module_accelerator():
--> 941 result = func(*slow_args, **slow_kwargs)
942 return _maybe_wrap_result(result, func, *args, **kwargs), fast
File ~/mroeschke-cudf/python/cudf/cudf/pandas/_wrappers/pandas.py:132, in Timestamp_Timedelta__new__.<locals>.<lambda>(cls, args, kwargs)
124 def Timestamp_Timedelta__new__(cls, *args, **kwargs):
125 # Call fast/slow constructor
126 # This takes care of running __init__ as well, but must be paired
(...)
129 # Timestamp & Timedelta don't always return same types as self,
130 # hence this method is needed.
131 self, _ = _fast_slow_function_call(
--> 132 lambda cls, args, kwargs: cls(*args, **kwargs),
133 cls,
134 args,
135 kwargs,
136 )
137 return self
File timestamps.pyx:1725, in pandas._libs.tslibs.timestamps.Timestamp.__new__()
TypeError: __new__() got multiple values for keyword argument 'year'
pd.Timestamp(2020, 1, 1, year=2021)
Yeah I would expect that example to still raise. This PR just addresses the fact that kwargs are always empty.
pd.Timestamp(2020, 1, 1, year=2021)Yeah I would expect that example to still raise. This PR just addresses the fact that
kwargsare always empty.
@Matt711 Could you please add a couple of tests that are currently failing and pass with this fix?
@Matt711 Could you please add a couple of tests that are currently failing and pass with this fix?
I don't think there's a good way to test this PR because all it does is change how we call _fast_slow_function_call
- _fast_slow_function_call(func, args, kwargs)
- _fast_slow_function_call(func, *args, **kwargs)
With the second option, we unpack the args and kwargs so that kwargs are not always empty (so this is more of a general python thing).
I labeled this a bug but there's nothing really incorrect here in the sense of an incorrect result from _fast_slow_function_call(I can edit the description). I think it makes more sense to unpack the args and kwargs, so that kwargs aren't included in args. If we don't, kwargs and therefore fast_kwargs and slow_kwargs will always be empty. So _slow_arg(kwargs) is never doing anything meaningful. What do you think?
My apologies for any confusion. cc. @mroeschke
If there's not strict bug here, IMO I would just treat this as a "cleanup" and just remove **kwargs from _fast_slow_function_call if we never pass kwargs to it internally. Also minimizes a chance of another bug cropping up if splatting the kwargs has some unintended consequence we don't know about.
I agree with @mroeschke. No need to label this a bug, but let's simplify the API to reduce the possibility of error for devs in the future.
Looks like this needs some conflict resolution, and there are some discussions to be addressed @Matt711.
TODO: Check if this improvement PR is still relevant.