David Wendt
## Description Replaces the `cudf::strings::detail::make_strings_children` with the new `cudf::strings::detail::experimental::make_strings_children`. No code logic has changed -- just code moved around. All current code was already using the experimental function. ## Checklist...
## Description Fixes logic for `nvtext::character_tokenize` to handle large strings input. For an input strings column larger than 2GB, the output turns characters into rows and so will likely overflow the...
## Description Improves performance for `nvtext::replace_tokens` for long strings. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [x] New or existing tests cover these changes. -...
## Description Fixes calls to `thrust::count_if` in strings split APIs to better handle large strings. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [x] New or...
## Description Replaces `thrust::count_if` with a raw kernel counter to handle large strings (int64 offsets) and > 2GB strings columns. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md)....
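As a rough host-side illustration of the idea (not the kernel from the PR), the sketch below tallies matching bytes with a 64-bit counter, so the count stays valid even when the input holds more elements than a 32-bit integer can represent. The name `count_matches` and its signature are hypothetical and do not come from libcudf.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper (not a libcudf function): counts the bytes in `chars`
// equal to `target`, accumulating into a 64-bit counter so the result cannot
// wrap around the way a 32-bit count could on inputs larger than 2GB.
std::int64_t count_matches(std::string_view chars, char target)
{
  std::int64_t count = 0;
  for (char c : chars) {
    if (c == target) { ++count; }
  }
  return count;
}
```

A device version would typically have each thread test its own byte and combine the per-thread results into a single 64-bit total; that part is an assumption about the general technique, not a description of the PR's code.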
I've been working on improvements to the csv-writer. The changes may require multiple PRs and are as follows: 1. The current implementation formats the CSV (in row chunks) into CPU...
**Describe the issue** Nightly builds are failing due to memcheck errors in specific gtests. The error appears to be a `compute-sanitizer` tool issue, which has been opened as nvbug 4553815. This...
## Description Adds `CUDF_LARGE_STRINGS_ENABLED` compile-time option to enable large strings support. This changes the default behavior when the `LIBCUDF_LARGE_STRINGS_ENABLED` environment variable is not set. If `CUDF_LARGE_STRINGS_ENABLED`...
## Description The `cudf::test::fixed_width_column_wrapper` supports all fixed-width types, including fixed-point types. However, there is no mechanism to specify the fixed-point scale value which is common for the entire column and...
## Description Adds stream support to the `cudf::io::text::multibyte_split` API. Also adds a stream test and deprecates an overloaded API. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). -...