Add an environment variable for handling fallback in cudf.pandas

Open Matt711 opened this issue 1 year ago • 12 comments

Description

This PR wraps up #14975 and extends PR #15837. It adds a fallback debugging mode to _fast_slow_function_call that emits a warning for each type of fallback that occurs in cudf.pandas. The types of fallback covered are:

  • Out-of-memory errors, to help plan No-OOM-related work
  • AttributeErrors for missing functionality
  • TypeErrors for differing function signatures
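As a rough illustration of the idea described above, the sketch below shows how a fast/slow dispatcher might map the exception raised on the fast path to a category-specific warning before falling back. It is not the code in this PR: `fast_slow_call`, the warning class names, the use of the built-in `MemoryError` as a stand-in for a GPU out-of-memory error, and the rule for parsing the environment variable are all assumptions made for the example. Only the variable name `CUDF_PANDAS_FALLBACK_DEBUGGING` comes from this thread.

```python
import os
import warnings


# Hypothetical warning hierarchy; these class names are illustrative only.
class FallbackWarning(UserWarning):
    """Base class for warnings emitted when the fast path fails."""


class OOMFallbackWarning(FallbackWarning):
    """The fast path ran out of memory."""


class AttributeFallbackWarning(FallbackWarning):
    """The fast path is missing the requested functionality."""


class TypeFallbackWarning(FallbackWarning):
    """The fast and slow function signatures differ."""


# Assumption: the built-in MemoryError stands in for a GPU OOM error.
_WARNING_FOR = (
    (MemoryError, OOMFallbackWarning),
    (AttributeError, AttributeFallbackWarning),
    (TypeError, TypeFallbackWarning),
)


def debugging_enabled():
    # Assumption: "1" or "true" (case-insensitive) switches the mode on.
    value = os.environ.get("CUDF_PANDAS_FALLBACK_DEBUGGING", "")
    return value.lower() in {"1", "true"}


def fast_slow_call(fast, slow, *args, **kwargs):
    """Try `fast`; on any exception, optionally warn, then call `slow`."""
    try:
        return fast(*args, **kwargs)
    except Exception as err:
        if debugging_enabled():
            # Pick the first matching category, else the generic one.
            category = next(
                (warn for exc, warn in _WARNING_FOR if isinstance(err, exc)),
                FallbackWarning,
            )
            warnings.warn(
                f"Falling back to the slow path: {err!r}",
                category,
                stacklevel=2,
            )
        return slow(*args, **kwargs)
```

Using distinct warning classes would let a test run filter or count each fallback type separately, which seems to be the motivation for distinguishing them here.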

Checklist

  • [x] I am familiar with the Contributing Guidelines.
  • [x] New or existing tests cover these changes.
  • [ ] The documentation is up to date with these changes.

Matt711 avatar Jun 03 '24 22:06 Matt711

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar Jun 03 '24 22:06 copy-pr-bot[bot]

/ok to test

Matt711 avatar Jun 10 '24 13:06 Matt711

/ok to test

Matt711 avatar Jun 10 '24 18:06 Matt711

/ok to test

Matt711 avatar Jun 10 '24 18:06 Matt711

/ok to test

Matt711 avatar Jun 10 '24 20:06 Matt711

/ok to test

Matt711 avatar Jun 11 '24 19:06 Matt711

Looks pretty good. Could you show a small example of how this would look when using the run-pandas-tests.sh (maybe just run it on one test file with a few tests)?

mroeschke avatar Jun 14 '24 01:06 mroeschke

/ok to test

Matt711 avatar Jun 14 '24 16:06 Matt711

/ok to test

Matt711 avatar Jun 14 '24 17:06 Matt711

> Looks pretty good. Could you show a small example of how this would look when using the run-pandas-tests.sh (maybe just run it on one test file with a few tests)?

I'm not seeing warnings where I expect them, which makes me think the environment variable is not being set when the pandas tests are run. This is the command I'm using:

export CUDF_PANDAS_FALLBACK_DEBUGGING=True && python/cudf/cudf/pandas/scripts/run-pandas-tests.sh -n auto -v -p cudf.pandas tests/groupby/ | grep Warning
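One way to rule the environment variable in or out is to check what a child process actually sees. The sketch below is a generic Python check, not part of the PR or of run-pandas-tests.sh; the helper name `child_sees` is made up for the example, and it only shows that a variable present in the parent's environment is inherited by a spawned interpreter.

```python
import os
import subprocess
import sys

VAR = "CUDF_PANDAS_FALLBACK_DEBUGGING"

# The child prints the value it sees for the variable (or None).
CHILD = f"import os; print(os.environ.get({VAR!r}))"


def child_sees(env):
    """Run a fresh interpreter with `env` and return what it printed."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


# Environment with the variable set, as `export` would do in a shell.
env_with = dict(os.environ, **{VAR: "True"})
# Environment with the variable removed.
env_without = {k: v for k, v in os.environ.items() if k != VAR}

print(child_sees(env_with))
print(child_sees(env_without))
```

If the first call reports the value and the second reports `None`, inheritance is working, and the missing warnings probably have another cause.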

Matt711 avatar Jun 18 '24 12:06 Matt711

@Matt711 Are the warnings going to stderr? The pipe to grep will only capture stdout. You might need something like 2>&1 in there.
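To make the stdout/stderr point concrete, here is a small self-contained sketch. It assumes the warnings are ordinary Python warnings issued through `warnings.warn`, which are written to stderr by default; the child script and its message are invented for the example. Capturing only stdout misses the warning, while redirecting stderr into stdout (the equivalent of `2>&1`) includes it.

```python
import subprocess
import sys

# A child script that writes one line to stdout and issues one warning.
CHILD = (
    "import warnings; "
    "print('regular output'); "
    "warnings.warn('fallback occurred')"
)
# -W always makes sure the warning is not filtered out by the environment.
CMD = [sys.executable, "-W", "always", "-c", CHILD]

# Like `cmd | grep ...`: only stdout is captured, stderr is discarded.
stdout_only = subprocess.run(
    CMD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
).stdout

# Like `cmd 2>&1 | grep ...`: stderr is merged into stdout first.
merged = subprocess.run(
    CMD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
).stdout

print("fallback occurred" in stdout_only)
print("fallback occurred" in merged)
```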

bdice avatar Jun 18 '24 14:06 bdice

> @Matt711 Are the warnings going to stderr? The pipe to grep will only capture stdout. You might need something like 2>&1 in there.

test_groupby_agg_no_extra_calls should definitely emit NotImplemented warnings, but I don't see them in stdout. I don't think the tests are being run with cudf.pandas despite -p cudf.pandas being passed.

(rapids) coder ➜ ~/cudf $ pytest -v -p cudf.pandas ./pandas-testing/pandas-tests/tests/groupby/aggregate/test_aggregate.py::test_groupby_agg_no_extra_calls 2>&1
============================================================================================================================================ test session starts ============================================================================================================================================
platform linux -- Python 3.10.14, pytest-7.4.4, pluggy-1.5.0 -- /home/coder/.conda/envs/rapids/bin/python3.10
cachedir: .pytest_cache
hypothesis profile 'ci' -> deadline=None, suppress_health_check=[HealthCheck.too_slow, HealthCheck.differing_executors], database=DirectoryBasedExampleDatabase(PosixPath('/home/coder/cudf/.hypothesis/examples'))
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/coder/cudf/pandas-testing/pandas-tests
configfile: pyproject.toml
plugins: anyio-4.4.0, hypothesis-6.103.2, benchmark-4.0.0, cases-3.8.5, cov-5.0.0, xdist-3.6.1
collected 1 item                                                                                                                                                                                                                                                                                            

pandas-testing/pandas-tests/tests/groupby/aggregate/test_aggregate.py::test_groupby_agg_no_extra_calls PASSED                                                                                                                                                                                         [100%]

============================================================================================================================================= 1 passed in 0.11s =============================================================================================================================================

Matt711 avatar Jun 20 '24 13:06 Matt711

Closing this PR in favor of #16161

Matt711 avatar Jul 12 '24 02:07 Matt711