cudf
Implemented UDF Filters
Description
Checklist
- [ ] I am familiar with the Contributing Guidelines.
- [ ] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.
/ok to test 7f97737
I'll follow up with benchmarks once this is merged.
std::vector<std::unique_ptr<column>> filter(
  std::vector<column_view> const& columns,
  std::string const& predicate_udf,
  bool is_ptx,
  std::optional<void*> user_data            = std::nullopt,
  std::optional<std::vector<bool>> copy_mask = std::nullopt,
  rmm::cuda_stream_view stream              = cudf::get_default_stream(),
  rmm::device_async_resource_ref mr         = cudf::get_current_device_resource_ref());
@davidwendt do you think this should be consistent with other APIs and use a table_view instead?
i.e.:
std::unique_ptr<table> filter(
  table_view const& table,
  std::string const& predicate_udf,
  bool is_ptx,
  std::optional<void*> user_data            = std::nullopt,
  std::optional<std::vector<bool>> copy_mask = std::nullopt,
  rmm::cuda_stream_view stream              = cudf::get_default_stream(),
  rmm::device_async_resource_ref mr         = cudf::get_current_device_resource_ref());
Yes, a table_view makes sense, especially since all the columns have to be the same size.
Just remembered that a table_view won't work, since the inputs can include scalars as well. I think we can proceed with the current signature and make any necessary changes later.
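For anyone reading this later, here is a rough sketch of the shape constraint being discussed. It does not use cudf at all: `mock_column`, `fits_table_shape`, and `fits_column_list_shape` are hypothetical names made up for illustration, and treating a scalar input as a single-row column is an assumption on my part, not something stated in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column view; only the row count matters here.
struct mock_column {
  std::size_t size;
};

// Table-style invariant: every column must have the same number of rows.
bool fits_table_shape(std::vector<mock_column> const& cols)
{
  for (auto const& c : cols) {
    if (c.size != cols.front().size) { return false; }
  }
  return true;
}

// Looser rule for a plain column list (assumption): each input either has
// `num_rows` rows or is a single-row "scalar" input.
bool fits_column_list_shape(std::vector<mock_column> const& cols, std::size_t num_rows)
{
  for (auto const& c : cols) {
    if (c.size != num_rows && c.size != 1) { return false; }
  }
  return true;
}
```

Under that assumption, two 100-row columns plus one single-row input fail the table-style check but pass the column-list check, which is why the `std::vector<column_view>` signature is the more permissive of the two.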
CI is taking forever
/merge