cudf
JSON parser integration
Description
Integrates the experimental full-GPU JSON parser, replacing the existing CPU JSON parser (+ GPU tokenizer).
All tests pass, with no CUDA memory errors.
Benchmark Results
nested_json_gpu_parser
[0] Quadro GV100
| string_size | Samples | CPU Time | Noise | GPU Time | Noise | Elem/s |
|---|---|---|---|---|---|---|
| 2^20 = 1048576 | 1976x | 7.572 ms | 7.25% | 7.565 ms | 7.25% | 138.608M |
| 2^21 = 2097152 | 256x | 9.088 ms | 15.72% | 9.082 ms | 15.72% | 230.924M |
| 2^22 = 4194304 | 880x | 11.419 ms | 14.07% | 11.413 ms | 14.07% | 367.518M |
| 2^23 = 8388608 | 80x | 17.619 ms | 9.65% | 17.612 ms | 9.65% | 476.293M |
| 2^24 = 16777216 | 500x | 30.022 ms | 5.33% | 30.015 ms | 5.33% | 558.957M |
| 2^25 = 33554432 | 286x | 52.556 ms | 4.52% | 52.549 ms | 4.52% | 638.533M |
| 2^26 = 67108864 | 11x | 100.289 ms | 0.40% | 100.282 ms | 0.40% | 669.200M |
| 2^27 = 134217728 | 77x | 195.659 ms | 1.45% | 195.654 ms | 1.45% | 685.994M |
| 2^28 = 268435456 | 39x | 387.014 ms | 0.58% | 387.013 ms | 0.58% | 693.609M |
| 2^29 = 536870912 | 11x | 773.444 ms | 0.38% | 773.448 ms | 0.38% | 694.126M |
| 2^30 = 1073741824 | 10x | 1.554 s | 0.29% | 1.554 s | 0.29% | 691.001M |
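The Elem/s column appears to be the input size divided by the GPU time. A minimal sketch that checks this reading against three rows of the table above; the helper name `throughput` is made up for illustration, and small deviations are expected because the table prints rounded times.

```python
def throughput(string_size_bytes: int, gpu_time_s: float) -> float:
    """Input bytes processed per second of GPU time."""
    return string_size_bytes / gpu_time_s

# (string_size, GPU time in seconds, reported Elem/s), copied from the table above.
rows = [
    (2**20, 7.565e-3, 138.608e6),
    (2**25, 52.549e-3, 638.533e6),
    (2**30, 1.554, 691.001e6),
]

rel_errors = []
for size, gpu_time, reported in rows:
    computed = throughput(size, gpu_time)
    # Relative difference between the recomputed and the reported value.
    rel_errors.append(abs(computed - reported) / reported)
    print(f"{size:>10} bytes: computed {computed / 1e6:.1f}M/s, "
          f"reported {reported / 1e6:.1f}M/s")
```

Under this reading, throughput levels off at roughly 0.69 G/s from 2^27 bytes upward, which suggests the smaller inputs are dominated by fixed overhead.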
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [ ] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.
Codecov Report
:exclamation: No coverage uploaded for pull request base (`branch-22.10@466a90d`). Patch has no changes to coverable lines.
:exclamation: Current head 87f5575 differs from pull request most recent head 9486c20. Consider uploading reports for the commit 9486c20 to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.10 #11717 +/- ##
===============================================
Coverage ? 87.51%
===============================================
Files ? 133
Lines ? 21803
Branches ? 0
===============================================
Hits ? 19080
Misses ? 2723
Partials ? 0
Benchmark Results
Dated 19-Sept-2022
nested_json_gpu_parser, [0] Quadro GV100: results identical to the table in the PR description above.
@karthikeyann NC_VAL is now converted to NC_STR in the token-to-tree conversion itself (c89a982). This effectively removes NC_VAL from the tree nodes completely.
Context: duration types may mix literal and string values. The JSON_TEST JsonReaderParamTest Durations test failed because the literals were ignored.
(Note: as an alternative, this could be done without losing NC_VAL by hashing NC_VAL and NC_STR to the same node_type in the tree traversal code.)
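The alternative in the note can be illustrated on the CPU. This is only a sketch of the idea, not the cuDF implementation (which runs on the GPU): the enum, the `NC_OTHER` member, `hash_category`, and `group_nodes_into_columns` are all hypothetical names, and a node is reduced to a `(field_name, category)` pair. The point is that the category is folded only when building the grouping key, so each node keeps its original NC_VAL or NC_STR tag.

```python
from collections import defaultdict
from enum import Enum


class NodeCat(Enum):
    """Hypothetical, reduced set of node categories for illustration."""
    NC_STR = 0
    NC_VAL = 1
    NC_OTHER = 2


def hash_category(cat: NodeCat) -> NodeCat:
    """Category used only for hashing/equality: NC_VAL folds into NC_STR."""
    return NodeCat.NC_STR if cat is NodeCat.NC_VAL else cat


def group_nodes_into_columns(nodes):
    """Group node indices by (field name, folded category)."""
    columns = defaultdict(list)
    for idx, (name, cat) in enumerate(nodes):
        columns[(name, hash_category(cat))].append(idx)
    return dict(columns)


# A duration-like field "d" that mixes a string value and a literal value.
nodes = [
    ("d", NodeCat.NC_STR),
    ("d", NodeCat.NC_VAL),
    ("x", NodeCat.NC_VAL),
]
cols = group_nodes_into_columns(nodes)
```

Here both `"d"` nodes land in the same group even though the second one is still tagged `NC_VAL` in `nodes`, so later stages could still tell literals from strings if they needed to.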
Benchmark Results
Dated 21-Sept-2022
nested_json_gpu_parser
[0] Quadro GV100
| string_size | Samples | CPU Time | Noise | GPU Time | Noise | Elem/s |
|---|---|---|---|---|---|---|
| 2^20 = 1048576 | 2501x | 5.975 ms | 7.89% | 5.968 ms | 7.88% | 175.689M |
| 2^21 = 2097152 | 672x | 6.677 ms | 4.13% | 6.670 ms | 4.13% | 314.402M |
| 2^22 = 4194304 | 592x | 8.323 ms | 4.69% | 8.316 ms | 4.69% | 504.346M |
| 2^23 = 8388608 | 1008x | 12.778 ms | 13.18% | 12.771 ms | 13.18% | 656.833M |
| 2^24 = 16777216 | 750x | 19.987 ms | 5.99% | 19.980 ms | 5.99% | 839.719M |
| 2^25 = 33554432 | 80x | 37.238 ms | 4.29% | 37.230 ms | 4.29% | 901.272M |
| 2^26 = 67108864 | 220x | 68.162 ms | 1.98% | 68.154 ms | 1.98% | 984.658M |
| 2^27 = 134217728 | 80x | 131.745 ms | 1.09% | 131.738 ms | 1.09% | 1.019G |
| 2^28 = 268435456 | 11x | 260.182 ms | 0.35% | 260.175 ms | 0.35% | 1.032G |
| 2^29 = 536870912 | 11x | 520.952 ms | 0.19% | 520.946 ms | 0.19% | 1.031G |
| 2^30 = 1073741824 | 11x | 1.057 s | 0.27% | 1.057 s | 0.27% | 1.016G |
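For a quick comparison with the 19-Sept-2022 results, the GPU times of the two runs can be divided row by row. A minimal sketch using four rows copied from the two tables; the variable names are illustrative only.

```python
# GPU times in milliseconds, keyed by the exponent of string_size (2^exp bytes),
# copied from the 19-Sept-2022 and 21-Sept-2022 tables.
gpu_ms_19_sept = {20: 7.565, 24: 30.015, 27: 195.654, 30: 1554.0}
gpu_ms_21_sept = {20: 5.968, 24: 19.980, 27: 131.738, 30: 1057.0}

# Speedup = old time / new time for the same input size.
speedups = {
    exp: gpu_ms_19_sept[exp] / gpu_ms_21_sept[exp]
    for exp in gpu_ms_19_sept
}

for exp, s in speedups.items():
    print(f"2^{exp}: {s:.2f}x faster than the 19-Sept-2022 run")
```

For these rows the speedup falls between about 1.25x and 1.5x, with the larger inputs near the upper end.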
Benchmark Results
Dated 27-Sept-2022
nested_json_gpu_parser
[0] Quadro GV100
| string_size | Samples | CPU Time | Noise | GPU Time | Noise | Elem/s | bytes_per_second | peak_memory_usage |
|---|---|---|---|---|---|---|---|---|
| 2^20 = 1048576 | 2192x | 5.784 ms | 11.10% | 5.777 ms | 11.10% | 181.518M | 181518461 | 17.149 MiB |
| 2^21 = 2097152 | 960x | 6.158 ms | 5.10% | 6.150 ms | 5.09% | 340.980M | 340979979 | 34.106 MiB |
| 2^22 = 4194304 | 192x | 7.798 ms | 15.82% | 7.791 ms | 15.82% | 538.363M | 538362570 | 68.020 MiB |
| 2^23 = 8388608 | 848x | 10.036 ms | 4.47% | 10.029 ms | 4.47% | 836.432M | 836432150 | 135.848 MiB |
| 2^24 = 16777216 | 80x | 16.555 ms | 9.49% | 16.547 ms | 9.49% | 1.014G | 1013889148 | 271.504 MiB |
| 2^25 = 33554432 | 552x | 27.182 ms | 7.52% | 27.175 ms | 7.52% | 1.235G | 1234774323 | 542.817 MiB |
| 2^26 = 67108864 | 80x | 48.934 ms | 3.19% | 48.927 ms | 3.19% | 1.372G | 1371621284 | 1.060 GiB |
| 2^27 = 134217728 | 166x | 90.397 ms | 1.64% | 90.390 ms | 1.64% | 1.485G | 1484872039 | 2.120 GiB |
| 2^28 = 268435456 | 87x | 172.521 ms | 0.99% | 172.514 ms | 0.99% | 1.556G | 1556019625 | 4.239 GiB |
| 2^29 = 536870912 | 43x | 351.175 ms | 0.77% | 351.170 ms | 0.77% | 1.529G | 1528807741 | 8.479 GiB |
| 2^30 = 1073741824 | 11x | 667.835 ms | 0.18% | 667.831 ms | 0.18% | 1.608G | 1607803624 | 16.957 GiB |
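The new peak_memory_usage column can be related to the input size. A minimal sketch, assuming the column uses binary units as printed (MiB and GiB); the rows are copied from the table above and the variable names are illustrative.

```python
MIB = 2**20
GIB = 2**30

# (string_size in bytes, peak_memory_usage in bytes), copied from the table above.
rows = [
    (2**20, 17.149 * MIB),
    (2**24, 271.504 * MIB),
    (2**27, 2.120 * GIB),
    (2**30, 16.957 * GIB),
]

# Peak memory expressed as a multiple of the input size.
ratios = [peak / size for size, peak in rows]

for (size, _), ratio in zip(rows, ratios):
    print(f"{size:>10} bytes: peak memory is {ratio:.2f}x the input size")
```

Across these rows the peak memory is roughly 17 times the input size, so under this reading a 1 GiB input needs about 17 GiB of device memory.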