serde/json: experimental asynchronous streaming parser
Backports Required
- [x] none - not a bug fix
- [ ] none - this is a backport
- [ ] none - issue does not exist in previous branches
- [ ] none - papercut/not impactful enough to backport
- [ ] v25.1.x
- [ ] v24.3.x
- [ ] v24.2.x
Release Notes
- none
:tada: Snyk checks have passed. No issues have been found so far.
:white_check_mark: security/snyk check is complete. No issues have been found. (View Details)
:white_check_mark: license/snyk check is complete. No issues have been found. (View Details)
:tada: Snyk checks have passed. No issues have been found so far.
:white_check_mark: security/snyk check is complete. No issues have been found. (View Details)
:white_check_mark: license/snyk check is complete. No issues have been found. (View Details)
:tada: Snyk checks have passed. No issues have been found so far.
:white_check_mark: security/snyk check is complete. No issues have been found. (View Details)
:white_check_mark: license/snyk check is complete. No issues have been found. (View Details)
CI test results
test results on build#65426
| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e36-91bb-4151-a04a-89f0792f5bd0 | FLAKY | 15/21 |
| rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e49-4fe2-4eb2-8555-d34d355d66ba | FAIL | 0/1 |
| rptest.tests.tiered_storage_pause_test.TestTieredStoragePause.test_safe_pause_resume.allow_gaps_topic_level=None.allow_gaps_cluster_level=True | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e36-91bc-447a-ac0e-afa5ff963584 | FLAKY | 20/21 |
| rptest.tests.upgrade_test.UpgradeBackToBackTest.test_upgrade_with_all_workloads.single_upgrade=False | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e49-4fe1-4d99-a59a-a0dcf8ac7cb9 | FLAKY | 12/21 |
| rptest.transactions.consumer_offsets_test.VerifyConsumerOffsetsThruUpgrades.test_consumer_group_offsets.versions_to_upgrade=1 | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e36-91bd-4185-be09-f6254efc44ab | FLAKY | 20/21 |
| rptest.transactions.producers_api_test.ProducersAdminAPITest.test_producers_state_api_during_load | ducktape | https://buildkite.com/redpanda/redpanda/builds/65426#01968e49-4fe1-4d99-a59a-a0dcf8ac7cb9 | FLAKY | 20/21 |
test results on build#65790
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason |
|---|---|---|---|---|---|---|---|
| AvailabilityTests | test_recovery_after_catastrophic_failure | ducktape | https://buildkite.com/redpanda/redpanda/builds/65790#0196bbeb-0108-435b-82e8-a8528e9952bc | FLAKY | 20/21 | upstream reliability is '99.16666666666667'. current run reliability is '95.23809523809523'. drift is 3.92857 and the allowed drift is set to 50. The test should PASS | |
| DatalakeTransactionTests | test_with_transactions | {"cloud_storage_type": 1, "compaction": false} | ducktape | https://buildkite.com/redpanda/redpanda/builds/65790#0196bbeb-0108-4f0d-bc8e-a526beeb58be | FLAKY | 20/21 | upstream reliability is '100.0'. current run reliability is '95.23809523809523'. drift is 4.7619 and the allowed drift is set to 50. The test should PASS |
| PartitionBalancerTest | test_fuzz_admin_ops | ducktape | https://buildkite.com/redpanda/redpanda/builds/65790#0196bbe1-1c0f-4d1a-81bc-90be291f327f | FLAKY | 18/21 | upstream reliability is '91.43968871595331'. current run reliability is '85.71428571428571'. drift is 5.7254 and the allowed drift is set to 50. The test should PASS | |
| RandomNodeOperationsTest | test_node_operations | {"cloud_storage_type": 1, "enable_failures": false, "mixed_versions": false, "with_chunked_compaction": true, "with_iceberg": true, "with_tiered_storage": false} | ducktape | https://buildkite.com/redpanda/redpanda/builds/65790#0196bbe1-1c0e-41e0-941b-68d342181163 | FLAKY | 20/21 | upstream reliability is '98.10426540284361'. current run reliability is '95.23809523809523'. drift is 2.86617 and the allowed drift is set to 50. The test should PASS |
test results on build#67813
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason |
|---|---|---|---|---|---|---|---|
| DatalakeDelayedEnablementTest | test_enabling_iceberg_in_existing_cluster | {"catalog_type": "rest_jdbc", "cloud_storage_type": 1} | ducktape | https://buildkite.com/redpanda/redpanda/builds/67813#0197a32e-2297-458a-b103-d89d064d5d55 | FLAKY | 12/21 | upstream reliability is '72.1419185282523'. current run reliability is '57.14285714285714'. drift is 14.99906 and the allowed drift is set to 50. The test should PASS |
| NodesDecommissioningTest | test_decommissioning_working_node | {"delete_topic": true, "tick_interval": 3600000} | ducktape | https://buildkite.com/redpanda/redpanda/builds/67813#0197a335-b082-4471-8887-68d5504a778d | FLAKY | 20/21 | upstream reliability is '100.0'. current run reliability is '95.23809523809523'. drift is 4.7619 and the allowed drift is set to 50. The test should PASS |
| NodesDecommissioningTest | test_recycle_all_nodes | {"auto_assign_node_id": false} | ducktape | https://buildkite.com/redpanda/redpanda/builds/67813#0197a335-b081-470f-8ffe-01dc5808f377 | FLAKY | 20/21 | upstream reliability is '100.0'. current run reliability is '95.23809523809523'. drift is 4.7619 and the allowed drift is set to 50. The test should PASS |
| TieredStorageTest | test_tiered_storage | {"cloud_storage_type_and_url_style": [1, "virtual_host"], "test_case": {"name": "(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}} | ducktape | https://buildkite.com/redpanda/redpanda/builds/67813#0197a335-b081-470f-8ffe-01dc5808f377 | FAIL | 0/1 | The test has failed across all retries |
sorry @dotnwat didn't mean to request a review from you :)
I'm taking over responsibility for this PR. We'd like to merge it soon so we can build some Iceberg with JSON schema capabilities on top of it. I've addressed much of the review feedback, and I think the updated PR is merge-worthy, but there are a few other issues that could be addressed. I'll outline them:
- Error handling: The parser does not provide good error messages. It just fails. It's important to improve the error handling so we provide at least a description and position for parsing errors in all cases.
- UTF-8 validation: We do not validate that the text is validly UTF-8 encoded. This is important, and should be straightforward to add.
- Parse any type at top-level: The updated JSON standard allows any type at the top level, but we accept only objects and arrays. In practice top-level values will be arrays or objects but we should try to match the standard.
- Depth limits: We do not bail on deeply nested structures. It's important to add this for safety.
- Number handling: There's a couple of alternate proposals for how to handle number parsing. It'd be nice to simplify, and the current approach has some known issues with high-precision parsing, but the current scheme works pretty well.
- Fuzzing or randomized testing: We should build some randomized testing, at least. Ideally, we would fuzz test the parser.
Retry command for Build#67813
please wait until all jobs are finished before running the slash command
/ci-repeat 1
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}}
/ci-repeat 1 tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}}
Tests all pass, but the test results job is borked, going to force merge.
@rockwotj @wdberkeley snagged these coverage reports and thought you might be interested in them.