libucl
libucl copied to clipboard
Fix excessive escaping when using `ucl_object_fromstring()`
UCL_STRING_ESCAPE
might be useful for something (though I'm not sure what) but it's not good default behaviour. Strings are escaped during output to meet the needs of the output format, so adding escape sequences while building the object results in things being escaped twice.
This is clearly an example of breaking POLA policy. If one wants a behaviour to skip escaping, it is always possible to use ucl_object_fromstring_common
with UCL_STRING_RAW
flag. I don't want to switch the default behaviour of ucl_object_fromstring
as it is clearly a major breaking change.
I think it's far more astonishing that ucl_object_fromstring()
is unsafe to use (I was certainly surprised.) Ultimately it's up to you, but POLA doesn't mean "never fix major bugs."
I understand and share your frustration: the original decision of the default behaviour of ucl_object_fromstring
was definitely not right as it led to double escaping in emitter/parser chains. I would not say that this behaviour is unsafe
, as it is actually too safe
. The reason why I don't like the idea to merge it is that it might break some use cases when an API user expects that all data imported by this function is perfetly JSON safe
. With your change this assumption is broken. And whilst I also think that it is a right thing I don't think that it is a good reason to break the (potentially) existing code after all.
I called it "unsafe" because it destroys the integrity of the data that you hand it. We discovered this behaviour in a part of our program that reads in data from a JSON file, does some processing, then writes out the JSON file. Without knowing the number of times the file has gone through this process we cannot retrieve the original value.
Take as an example:
{
"foo": "\\\\n"
}
This value could be \\n
that has been processed once or \n
that has been processed twice. There's no way to tell just by looking at the data.
It's simple to avoid this behaviour once you know it exists, but I believe that fixing the broken interface is better than leaving it there as a trap for the unwary.
Please don't close this issue, I'm still thinking of a better choice here. I think you are right but I just need to sit and check the use cases more carefully. Thank you!
changing existing API is hard, but perhaps its possible to add an additional API that addresses this.
I think this should be really done