Shruti Shivakumar

Results 14 issues of Shruti Shivakumar

## Description This work is a follow-up to PR #14931, which provided a proof of concept for using an FST to normalize unquoted whitespace. This PR implements the pre-processing FST in...

feature request
2 - In Progress
libcudf
CMake
cuDF (Java)
non-breaking
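
To illustrate the idea behind the whitespace-normalization item above, here is a minimal sketch of a three-state machine that drops spaces and tabs outside quoted strings. It is not the libcudf FST: the function name `strip_unquoted_whitespace`, the state names, and the assumption that "normalize" means "remove" are all hypothetical choices made for this example.

```python
def strip_unquoted_whitespace(text):
    """Drop spaces and tabs that occur outside double-quoted strings.

    Hypothetical illustration only; the state names and the choice to
    remove (rather than collapse) whitespace are assumptions.
    """
    OUTSIDE, IN_STRING, ESCAPED = 0, 1, 2
    state = OUTSIDE
    out = []
    for ch in text:
        if state == OUTSIDE:
            if ch in " \t":
                continue  # unquoted whitespace is not emitted
            if ch == '"':
                state = IN_STRING
        elif state == IN_STRING:
            if ch == "\\":
                state = ESCAPED  # next character is taken literally
            elif ch == '"':
                state = OUTSIDE
        else:  # ESCAPED: consume one character, then return to the string
            state = IN_STRING
        out.append(ch)
    return "".join(out)
```

Whitespace inside quoted strings, including after an escaped quote, passes through unchanged because only the `OUTSIDE` state filters characters.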

## Description This piece of work seeks to achieve two goals: (i) reducing repeated reading of byte range chunks in the JSON reader, and (ii) enabling multi-source byte range...

libcudf
Performance
improvement
non-breaking

## Description Addresses #15277 Given a JSON lines buffer with records separated by a delimiter passed at runtime, the idea is to modify the JSON tokenization FST to consider the...

feature request
libcudf
cuIO
non-breaking

## Description This PR fixes the number of bytes read and corrects the offsets for the delimiters added to the buffer when reading across multiple sources. ## Checklist - [X]...

bug
libcudf
cuIO
non-breaking

## Description This PR cleans up the JSON reader options benchmark by reducing the number of runtime configurations from 162 to 20. Reasoning behind the splitting of the benchmark -...

libcudf
Performance
improvement
non-breaking

## Description Part of #15903. 1. Introduces the Compressed Sparse Row (CSR) format to store the adjacency information of the column tree. 2. Analogous to `reduce_to_column_tree`, `reduce_to_column_tree_csr` reduces node tree...

libcudf
CMake
cuIO
improvement
non-breaking
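
For readers unfamiliar with the Compressed Sparse Row layout mentioned in the column-tree item above, the following sketch builds CSR adjacency arrays for a small tree given as a parent array. It is a generic illustration, not the `reduce_to_column_tree_csr` implementation: the function name `tree_to_csr`, the parent-array input, and the choice to store only child edges are assumptions.

```python
def tree_to_csr(parent_ids):
    """Build CSR adjacency (row_offsets, col_indices) from a parent array.

    parent_ids[i] is the parent of node i, or -1 for the root.
    Hypothetical helper: only child edges are stored per node.
    """
    n = len(parent_ids)

    # Count the children of every node.
    degree = [0] * n
    for parent in parent_ids:
        if parent >= 0:
            degree[parent] += 1

    # Exclusive prefix sum of the degrees gives the row offsets.
    row_offsets = [0] * (n + 1)
    for i in range(n):
        row_offsets[i + 1] = row_offsets[i] + degree[i]

    # Scatter each child into its parent's slot range.
    col_indices = [0] * row_offsets[n]
    cursor = row_offsets[:n]  # next free slot per node (a copy)
    for child, parent in enumerate(parent_ids):
        if parent >= 0:
            col_indices[cursor[parent]] = child
            cursor[parent] += 1
    return row_offsets, col_indices


# Node 0 is the root; nodes 1 and 2 are its children, and so on.
offsets, columns = tree_to_csr([-1, 0, 0, 1, 1, 2])
```

The children of node `i` are `columns[offsets[i]:offsets[i + 1]]`, so the whole adjacency structure lives in two flat arrays, which is the property that makes CSR attractive for GPU processing.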

## Description Coming soon. ## Checklist - [ ] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [ ] New or existing tests cover these changes. - [ ] The...

libcudf
CMake
cuIO
improvement
non-breaking

## Description Addresses #16999 ## Checklist - [X] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md). - [X] New or existing tests cover these changes. - [ ] The documentation is...

bug
libcudf
cuIO
non-breaking

## Description The full pushdown automaton that tokenizes the input JSON string, as well as the bracket-brace FST, overestimates the total buffer size required for the translated output and indices....

libcudf
cuIO
Performance
improvement
non-breaking

**Describe the bug** With the implementation of the reallocate-and-retry logic when the initial buffer size estimate fails for byte range reading (PR #16687), the total buffer size read per batch...

bug