Use orjson for json_dumps
What type of PR is this?
- [x] Refactor
- [x] Feature
- [ ] Bug Fix
- [ ] New Query Runner (Data Source)
- [ ] New Alert Destination
- [ ] Other
Description
Following the discussion in https://github.com/getredash/redash/pull/7339#issuecomment-2684176762, I updated utils.json_dumps to use orjson for improved serialization performance.
Key implementation details:
- orjson 3.10.15 is used, as it's the last version compatible with Python 3.8.
  - orjson 3.10.16 is currently out, but it drops Python 3.8 support.
- Pre-processing: Before calling orjson.dumps, data is pre-processed recursively using the existing custom JSONEncoder to maintain compatibility with Redash's current serialization specifications.
  - For instance, datetime serialization differs:
    - With the custom JSONEncoder: {"time": "2024-03-01T15:30:45.123"}
    - With orjson: {"time": "2024-03-01T15:30:45.123456"}
  - Note: Unlike the standard json module, orjson does not allow overriding serialization behavior for supported native types. The provided default function isn't called for these built-in supported types.
- Option Mapping:
  - Default options are set to orjson.OPT_NON_STR_KEYS | orjson.OPT_UTC_Z, aligning with ensure_ascii=False behavior.
  - The sort_keys parameter maps directly to OPT_SORT_KEYS.
- Testing: Added pytest cases to validate behavior aligned with the existing JSONEncoder specifications.
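The pre-processing and option mapping described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: preprocess and json_dumps_sketch are hypothetical names, and the conversion rules (millisecond precision for datetime, float for Decimal, str for UUID) are assumptions standing in for the existing custom JSONEncoder.

```python
import datetime
import decimal
import uuid


def preprocess(value):
    """Recursively convert values to plain JSON-friendly types (hypothetical helper)."""
    if isinstance(value, dict):
        return {key: preprocess(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [preprocess(item) for item in value]
    if isinstance(value, datetime.datetime):
        # Assumed rule: truncate to millisecond precision before orjson sees
        # the value, since orjson would otherwise emit all six digits.
        # Note this always emits three fractional digits, even for .000.
        return value.isoformat(timespec="milliseconds")
    if isinstance(value, decimal.Decimal):
        return float(value)
    if isinstance(value, uuid.UUID):
        return str(value)
    return value


def json_dumps_sketch(data, sort_keys=False):
    """Serialize with orjson after pre-processing (illustrative only)."""
    # Imported lazily so preprocess() can be tried without orjson installed.
    import orjson

    option = orjson.OPT_NON_STR_KEYS | orjson.OPT_UTC_Z
    if sort_keys:
        option |= orjson.OPT_SORT_KEYS
    # orjson.dumps returns bytes, so decode to get a str like json.dumps does.
    return orjson.dumps(preprocess(data), option=option).decode("utf-8")


row = {"time": datetime.datetime(2024, 3, 1, 15, 30, 45, 123456)}
print(preprocess(row))  # {'time': '2024-03-01T15:30:45.123'}
```

Converting datetime values to strings up front is what keeps the output stable: once a value is a plain str, orjson's native datetime handling never applies to it.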
How is this tested?
- [X] Unit tests (pytest, jest)
- [ ] E2E Tests (Cypress)
- [X] Manually
- [ ] N/A
For Athena and Trino, the result of select 1.0, cast('NaN' as double), cast('Infinity' as double), cast('-Infinity' as double) is 1.0, null, null, null as expected.
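For comparison, the standard json module emits the non-standard tokens NaN and Infinity for such values by default, whereas orjson serializes them as null. The same outcome can be reproduced with the standard library alone; null_non_finite below is a hypothetical helper used only for this illustration and is not part of this PR.

```python
import json
import math


def null_non_finite(value):
    """Replace NaN and +/-Infinity with None, recursing into lists and dicts."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, list):
        return [null_non_finite(item) for item in value]
    if isinstance(value, dict):
        return {key: null_non_finite(item) for key, item in value.items()}
    return value


row = [1.0, float("nan"), float("inf"), float("-inf")]
print(json.dumps(row))                   # [1.0, NaN, Infinity, -Infinity]
print(json.dumps(null_non_finite(row)))  # [1.0, null, null, null]
```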
Related Tickets & Documents
- Issue: https://github.com/getredash/redash/issues/6992
- Previous PRs:
  - https://github.com/getredash/redash/pull/7339
  - https://github.com/getredash/redash/pull/7348
Basic test works
SELECT 'NaN'::float AS not_a_number, 'Inf'::float AS inf, now() AS date
(venv) ~/project/private/redash git:[master]
ruff check --fix tests/test_utils.py
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
- 'ignore' -> 'lint.ignore'
- 'select' -> 'lint.select'
- 'mccabe' -> 'lint.mccabe'
- 'per-file-ignores' -> 'lint.per-file-ignores'
All checks passed!
(venv) ~/project/private/redash git:[master]
ruff --version
ruff 0.11.2
Is there a difference in the way key-value pairs are formatted?
- Options: {"dbname":"testdb1","host":"example.com"}
+ Options: {"dbname": "testdb1", "host": "example.com"}
https://github.com/getredash/redash/actions/runs/14329251848/job/40418005004?pr=7391
@eradman Yes. As I commented, orjson always uses compact separators. I'm going to fix the test.
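The two forms in the diff can be reproduced with the standard json module alone. This is only an illustration of the separator difference, not the changed test: the default separators give the spaced form, while compact separators give the form that matches orjson's output.

```python
import json

options = {"dbname": "testdb1", "host": "example.com"}

# Default separators are (", ", ": "), which insert a space after each
# comma and colon.
default_form = json.dumps(options)

# Compact separators drop those spaces.
compact_form = json.dumps(options, separators=(",", ":"))

print(default_form)  # {"dbname": "testdb1", "host": "example.com"}
print(compact_form)  # {"dbname":"testdb1","host":"example.com"}
```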
All previously failing tests now pass.