[FEA] Enable tracing of proxied objects in cudf.pandas
Description
Closes #14502. This is an exploratory draft of what the tracing output could look like.
Checklist
- [ ] I am familiar with the Contributing Guidelines.
- [ ] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.
import pandas as pd

data = {
    'player_id': [1, 2, 3, 4, 5, 6],
    'player_name': ['Tiger Woods', 'Rory McIlroy', 'Phil Mickelson', 'Jordan Spieth', 'Justin Thomas', 'Dustin Johnson'],
    'country': ['USA', 'UK', 'USA', 'USA', 'USA', 'USA'],
    'score': [70, 68, 72, 74, 69, 71]
}
df = pd.DataFrame(data)

result = df.groupby('country').agg(
    total_score=('score', 'sum'),
    average_score=('score', 'mean'),
    count_players=('player_id', 'size')
).reset_index()
+ Call to <lambda> with args=<class 'cudf.core.dataframe.DataFrame'> kwargs=dict_keys([]) (FAST PATH)
+ Call to call_operator with args=<function DataFrame.groupby at 0x7f93a7f88940> kwargs=dict_keys([]) (FAST PATH)
- Call to call_operator with args=<function DataFrameGroupBy.aggregate at 0x7f93bec3f5b0> kwargs=dict_keys([]) (SLOW PATH)
+ Call to call_operator with args=<function DataFrame.reset_index at 0x7f93a7fcf6d0> kwargs=dict_keys([]) (FAST PATH)
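For discussion, here is a minimal sketch of how trace lines like the ones above could be produced by a wrapper that tries a fast implementation first and falls back to a slow one. This is not how cudf.pandas is implemented; `traced_call`, `trace_log`, `fast_sum`, and `slow_sum` are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger("proxy_trace")  # hypothetical logger name

# Hypothetical in-memory record of (function name, path taken) pairs.
trace_log = []


def traced_call(fast_func, slow_func, *args, **kwargs):
    """Try fast_func; on any exception fall back to slow_func.

    The path that was taken is logged and appended to trace_log.
    """
    try:
        result = fast_func(*args, **kwargs)
        path = "FAST PATH"
    except Exception:
        result = slow_func(*args, **kwargs)
        path = "SLOW PATH"
    logger.info(
        "Call to %s with args=%s kwargs=%s (%s)",
        slow_func.__name__,
        [type(a).__name__ for a in args],
        list(kwargs),
        path,
    )
    trace_log.append((slow_func.__name__, path))
    return result


# Illustrative stand-ins for a GPU and a CPU implementation.
def fast_sum(values):
    raise NotImplementedError("pretend the fast path does not support this")


def slow_sum(values):
    return sum(values)
```

Calling `traced_call(fast_sum, slow_sum, [1, 2, 3])` returns the slow result and records that the slow path was taken. Which fields belong in each record (argument types, timings, the reason for the fallback) is exactly the open design question here.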
I've never played around with these APIs, but if I understand the issue well enough, you may want to check whether sys.settrace and/or sys.call_tracing could be used here.
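To make the sys.settrace suggestion concrete, here is a rough sketch of recording which Python-level functions run during a call. `trace_calls`, `helper`, and `work` are hypothetical names for illustration, and this has not been tried against cudf.pandas itself.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions called)."""
    called = []

    def tracer(frame, event, arg):
        # The global trace function receives a "call" event for each new frame.
        if event == "call":
            called.append(frame.f_code.co_name)
        return None  # returning None skips per-line tracing of that frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, called


# Small stand-in functions to demonstrate the recorded call order.
def helper(x):
    return x + 1


def work(x):
    return helper(x) * 2


result, called = trace_calls(work, 3)
```

One caveat: sys.settrace only reports frames of functions written in Python, so calls that go straight into C extensions would not appear. sys.setprofile does report C calls, at the cost of noisier output.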
Let's consider exactly what information we want to include before going too deep into the implementation here. @wence- I believe you originally made this request, WDYT?
I'm closing this PR until we have more offline discussion about the approach we want to take.