refactor: Unify all json.(loads|dumps) usage to utils.json

Open eyalezer opened this issue 1 year ago • 2 comments

SUMMARY

Second phase of the json migration to use the new utils.json module

The first phase, which created the utils.json module (https://github.com/apache/superset/pull/28522), is complete. This second phase consolidates all json usage in the codebase so that it goes through the new module.

During this phase:

  • Refactored every place where json was used directly so that it now goes through the json utils module.
  • Added and fixed tests as needed to keep them compatible with the change.
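
For readers unfamiliar with the pattern, here is a minimal sketch of what a centralized JSON wrapper can look like. It is not the actual contents of superset/utils/json.py: the stdlib backend, the JSONDecodeError class, and the chosen defaults are all assumptions made for illustration.

```python
import json as _backend  # stand-in backend (assumption); the real module may wrap another library
from typing import Any


class JSONDecodeError(ValueError):
    """Hypothetical single error type that callers of the wrapper can catch."""


def dumps(obj: Any, **kwargs: Any) -> str:
    # Defaults live in one place: stable key order and str() as a fallback encoder.
    kwargs.setdefault("sort_keys", True)
    kwargs.setdefault("default", str)
    return _backend.dumps(obj, **kwargs)


def loads(text: str, **kwargs: Any) -> Any:
    try:
        return _backend.loads(text, **kwargs)
    except ValueError as ex:  # json.JSONDecodeError is a subclass of ValueError
        raise JSONDecodeError(f"invalid JSON: {ex}") from ex
```

With a wrapper like this, call sites import the wrapper instead of json, so a change of backend or of error handling only touches one file.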

eyalezer avatar May 24 '24 16:05 eyalezer

Codecov Report

Attention: Patch coverage is 68.04734% with 54 lines in your changes missing coverage. Please review.

Project coverage is 83.47%. Comparing base (76d897e) to head (774d3d1). Report is 1094 commits behind head on master.

Files with missing lines                          Patch %   Lines
superset/extensions/pylint.py                     0.00%     23 Missing :warning:
superset/commands/dataset/export.py               25.00%    3 Missing :warning:
superset/commands/dashboard/export.py             33.33%    2 Missing :warning:
superset/views/chart/views.py                     33.33%    2 Missing :warning:
superset/charts/data/api.py                       75.00%    1 Missing :warning:
superset/commands/chart/export.py                 50.00%    1 Missing :warning:
superset/commands/chart/importers/v1/utils.py     50.00%    1 Missing :warning:
superset/commands/database/export.py              50.00%    1 Missing :warning:
superset/commands/database/validate.py            50.00%    1 Missing :warning:
superset/commands/query/export.py                 50.00%    1 Missing :warning:
... and 18 more
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28702       +/-   ##
===========================================
+ Coverage   60.48%   83.47%   +22.98%     
===========================================
  Files        1931      523     -1408     
  Lines       76236    37575    -38661     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31365    -14749     
+ Misses      28017     6210    -21807     
+ Partials     2105        0     -2105     
Flag         Coverage Δ
hive         49.01% <56.21%> (-0.16%) :arrow_down:
javascript   ?
mysql        77.12% <66.27%> (?)
postgres     77.23% <66.86%> (?)
presto       53.56% <58.57%> (-0.25%) :arrow_down:
python       83.47% <68.04%> (+19.98%) :arrow_up:
sqlite       76.68% <66.86%> (?)
unit         58.94% <53.84%> (+1.32%) :arrow_up:

Flags with carried forward coverage won't be shown.

codecov[bot] avatar May 24 '24 16:05 codecov[bot]

@mistercrunch - here's the second part of the refactor. As expected, this is a huge PR: 232 files changed.

eyalezer avatar May 24 '24 21:05 eyalezer

OMG glad to see this, happy to help fast-merge this since it'll conflict with everything else otherwise.

Oh, as a follow-up, or maybe something we want to bundle here: @betodealmeida mentioned that there's a fairly easy way for us to add a linting rule that prevents people from doing a plain import json and forces them to go through the wrappers in superset/utils/json.py

mistercrunch avatar May 28 '24 16:05 mistercrunch

i'll rebase it quickly so it doesn't pick up more conflicts...

regarding the linter, it's a great idea and it looks like it should be feasible, for example by adding a custom mypy plugin... but it needs more research first.

eyalezer avatar May 28 '24 17:05 eyalezer

@eyalezer I have this PR out, it's a pylint rule: https://github.com/apache/superset/pull/26803

We could do something similar for json.

betodealmeida avatar May 28 '24 17:05 betodealmeida

@betodealmeida - awesome, so it's even easier than i thought... i'll look into it now and test it

eyalezer avatar May 28 '24 17:05 eyalezer

@mistercrunch - Rebased before it's too late

@betodealmeida - Thanks for the reference

  • Added another commit with the pylint rule to flag any "import json" or "import simplejson" - tested and working as expected
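
To illustrate the kind of check such a rule performs, here is a small sketch built on the stdlib ast module. It is not the pylint checker added in this PR; the function name and the banned-module list are hypothetical, and a real pylint rule would hook into pylint's checker API instead.

```python
import ast

# Modules that should only be imported by the wrapper itself (assumed list).
BANNED_MODULES = {"json", "simplejson"}


def find_banned_json_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module) for each direct import of a banned module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative imports are ignored.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare the top-level package so "json.decoder" is caught too.
            if name.split(".")[0] in BANNED_MODULES:
                hits.append((node.lineno, name))
    return sorted(hits)
```

An import such as "from superset.utils import json" is not reported, because the module being imported from is superset.utils rather than json.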

eyalezer avatar May 28 '24 19:05 eyalezer

Amazing, this is a massive refactor that should make everything json-related much more manageable. Interestingly, Python's standard lib json IS simplejson (see here https://stackoverflow.com/questions/712791/what-are-the-differences-between-json-and-simplejson-python-modules): it was adopted into the stdlib from simplejson, but the standalone simplejson package is typically ahead. Also, having json all in one place allows us to consider things like https://pypi.org/project/ujson/ and to handle centrally the things that triggered this refactor (better utf-8 support + error handling).

mistercrunch avatar May 28 '24 21:05 mistercrunch

It's interesting that you brought it up. After I finished refactoring all the json.(loads|dumps) calls to use the utils.json module, one of the first things I did was check whether ujson gives superset any significant performance gain. It seems to work fine, but I haven't tested it thoroughly enough to measure how much faster it actually is. If anyone wants to give it a try, I still have the branch available here: https://github.com/eyalezer/superset/tree/ujson.
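
For anyone who wants a quick first measurement, a rough sketch along these lines could be a starting point. It is not taken from the ujson branch: the helper names are made up, the payload is whatever you pass in, and ujson is treated as optional so the script still runs without it.

```python
import json
import timeit

try:
    import ujson  # optional third-party package; skipped when not installed
except ImportError:
    ujson = None


def time_round_trip(backend, payload, repeats=200):
    """Seconds taken for `repeats` dumps + loads round trips with `backend`."""
    return timeit.timeit(lambda: backend.loads(backend.dumps(payload)), number=repeats)


def compare_backends(payload, repeats=200):
    # Maps backend name to elapsed seconds; "ujson" appears only if it imported.
    results = {"json": time_round_trip(json, payload, repeats)}
    if ujson is not None:
        results["ujson"] = time_round_trip(ujson, payload, repeats)
    return results
```

Numbers from a micro-benchmark like this only hint at the difference; real query results and dashboard payloads would give a more meaningful comparison.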

eyalezer avatar May 29 '24 03:05 eyalezer

Hey @eyalezer this is great. If you haven't, feel free to join the Apache Superset Slack. I'd be happy to help if you wish to contribute more to the project!

geido avatar May 29 '24 12:05 geido