David Wendt

Results 57 issues of David Wendt

## Description Replaces `cudf::strings::detail::make_strings_children` with the new `cudf::strings::detail::experimental::make_strings_children`. No code logic has changed; code has only been moved around. All current code was already using the experimental function. ## Checklist...

3 - Ready for Review
libcudf
improvement
non-breaking

## Description Fixes logic for `nvtext::character_tokenize` to handle large strings input. For an input strings column larger than 2GB, the output turns each character into a row and so will likely overflow the...

bug
2 - In Progress
libcudf
strings
non-breaking

## Description Improves performance for `nvtext::replace_tokens` for long strings. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [x] New or existing tests cover these changes. -...

3 - Ready for Review
libcudf
strings
improvement
non-breaking

## Description Fixes calls to `thrust::count_if` in strings split APIs to better handle large strings. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [x] New or...

2 - In Progress
libcudf
CMake
strings
improvement
non-breaking

## Description Replaces `thrust::count_if` with a raw kernel counter to handle large strings (int64 offsets) and > 2GB strings columns. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md)....

bug
2 - In Progress
libcudf
strings
non-breaking

I've been working on improvements to the csv-writer. The changes may require multiple PRs and are as follows: 1. The current implementation formats the CSV (in row chunks) into CPU...

feature request
libcudf
cuIO
Performance

**Describe the issue** Nightly builds are failing due to memcheck errors in specific gtests. The error appears to be a `compute-sanitizer` tool issue, which has been opened as nvbug 4553815. This...

bug

## Description Adds `CUDF_LARGE_STRINGS_ENABLED` compile-time option to enable large strings support. This changes the default behavior when the `LIBCUDF_LARGE_STRINGS_ENABLED` environment variable is not set. If `CUDF_LARGE_STRINGS_ENABLED`...

3 - Ready for Review
libcudf
Python
CMake
improvement
non-breaking

## Description The `cudf::test::fixed_width_column_wrapper` supports all fixed-width types, including fixed-point types. However, there is no mechanism to specify the fixed-point scale value which is common for the entire column and...

bug
3 - Ready for Review
libcudf
non-breaking

## Description Adds stream support to the `cudf::io::text::multibyte_split` API. Also adds a stream test and deprecates an overloaded API. ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). -...

3 - Ready for Review
libcudf
CMake
improvement
non-breaking