Explicitly wrap non-serializable values in json dump
Minor adjustment to environment dumps in the json format style. The change now lets the dump specify if a given value cannot be serialized. On top of just looking cleaner imo, it also reduces the likelyhood of mistaking a non-serialized fallback value for a generic string. I don't think any existing unit tests need to be adjusted/added as a result of this, with the change being fairly self-contained & there not seeming to be anything testing this beforehand.
Before:
{
"ExampleKey": "ExampleValue",
"SomeString": "SomeString"
}
After:
{
"ExampleKey": "<<non-serializable: ExampleValue>>",
"SomeString": "SomeString"
}
Contributor Checklist:
- [ ] I have created a new test or updated the unit tests to cover the new/changed functionality.
- [x] I have updated
CHANGES.txt(and read theREADME.rst) - [x] I have updated the appropriate documentation
Please add a test.
Finally got around to looking at this. I think some kind of indicator is valuable, but I don't necessarily like the current format. We go from a standard Dump output of, for example:
'File': <SCons.Defaults.Variable_Method_Caller object at 0x7fe98c69e710>,
to
"File": "<<non-serializable: Variable_Method_Caller>>",
Seems like a format closer to the Python pprint output might be preferred - leave out the address, which you can't do anything with anyway, but maybe render as '<Variable_Method_Caller object>, with or without the extra tag non-serializable.
Meanwhile - two complaints about the json format output, which are not associated with this PR:
- since the
pprintDumpformat comes out sorted, shouldn't we also sort the json output (sort_keys=True)? - one of the important construction variables,
BUILDERS, collapses down to nothing in the json output, because it's not a plain dictionary:
"BUILDERS": "<<non-serializable: BuilderDict>>",
In the pprint output format, this is expanded like:
'BUILDERS': { 'CFile': <SCons.Builder.CompositeBuilder object at 0x7fe98c3e5710>,
'CXXFile': <SCons.Builder.CompositeBuilder object at 0x7fe98c3e59d0>,
'CopyAs': <SCons.Builder.BuilderBase object at 0x7fe98c3e5250>,
'CopyTo': <SCons.Builder.BuilderBase object at 0x7fe98c3e5190>,
...
It's actually often useful to see what Builders are configured in the current environment (missing builder leads to obtuse error messages, and a Dump() may be helpful to decode that). I wonder if we can come up with some way to handle this case specially.
I'll add json-format irritant #3: the CLVar class has a __str__ method that lets it render its contents, but json serialization doesn't do that either, so we get
"SHCCFLAGS": "<<non-serializable: CLVar>>",
instead of
'SHCCFLAGS': ['$CCFLAGS', '-fPIC'],
That above example just compares pprint to this PR's implementation, the live json output is even less helpful:
"File": "Variable_Method_Caller",
But yes, this PR doesn't "fix" the serialization issue, it just makes it more explicit. I'd absolutely prefer for SCons datatypes to play nicer with serialization natively, as that's be far preferable to a special-case for json dumping, but that's a whole other beast. There's clearly existing discrepancies in how that data is parsed, with pprint gathering more data than json, but even then it's more pprint finding access to more fallbacks
Here's a testcase diff for the current PR implementation:
diff --git a/SCons/EnvironmentTests.py b/SCons/EnvironmentTests.py
index c40439ae0..8a385c1bc 100644
--- a/SCons/EnvironmentTests.py
+++ b/SCons/EnvironmentTests.py
@@ -3177,11 +3177,12 @@ def generate(env):
def test_Dump(self) -> None:
"""Test the Dump() method"""
- env = self.TestEnvironment(FOO = 'foo')
+ env = self.TestEnvironment(FOO='foo', FOOFLAGS=CLVar('--bar --baz'))
assert env.Dump('FOO') == "'foo'", env.Dump('FOO')
assert len(env.Dump()) > 200, env.Dump() # no args version
assert env.Dump('FOO', 'json') == '"foo"' # JSON key version
+ self.assertEqual(env.Dump('FOOFLAGS', 'json'), '"<<non-serializable: CLVar>>"')
import json
env_dict = json.loads(env.Dump(format = 'json'))
assert env_dict['FOO'] == 'foo' # full JSON version
That above example just compares
pprintto this PR's implementation, the livejsonoutput is even less helpful:"File": "Variable_Method_Caller",But yes, this PR doesn't "fix" the serialization issue, it just makes it more explicit. I'd absolutely prefer for SCons datatypes to play nicer with serialization natively, as that's be far preferable to a special-case for
jsondumping, but that's a whole other beast. There's clearly existing discrepancies in how that data is parsed, withpprintgathering more data thanjson, but even then it's morepprintfinding access to more fallbacks
yes, I was agreeing that more explicit is better, because as-is things just get swallowed into something confusing. The odd thing about this output is that it's it's not really expected to be consumed - we'll never json.load it (except I see the unit test actually tries that for a very limited case). so there's a stange tension between "what is useful to a human reader" and "what is standard-for-json".
Today I Learned: the Python stdlib json encoder is actually designed to be extensible, so we could easily improve the presentation of the two problems I mention above, BuilderDict and CLVar. Docs:
To use a custom
JSONEncodersubclass (e.g. one that overrides thedefault()method to serialize additional types), specify it with theclskwarg; otherwiseJSONEncoderis used.
And in the library module, the default method has this docstring:
def default(self, o):
"""Implement this method in a subclass such that it returns
a serializable object for ``o``, or calls the base implementation
(to raise a ``TypeError``).
For example, to support arbitrary iterators, you could
implement default like this::
def default(self, o):
try:
iterable = iter(o)
except TypeError:
pass
else:
return list(iterable)
# Let the base class default method raise the TypeError
return JSONEncoder.default(self, o)
"""
If @bdbaddog is okay with the change as currently constituted, I'll move this stuff to an issue (enhancement request) to keep track of it.
@mwichmann -sounds good. I'll go ahead and merge this one then.