django-rest-framework-xml
django-rest-framework-xml copied to clipboard
Parse zero padded integers as strings
Thanks for your work in this library! We're considering using it for a project, but one of the things that we run in to is that our users have values in their XML that look (and parse) like integers, but are actually strings. These values look like <code>000123</code>
.
The parser incorrectly assumes these are integers, and in the process throws away information (the leading zeroes). The renderer works correctly though: the Python-structure {'code': '000123'}
is rendered correctly to <code>000123</code>
. So for one, the symmetry between parser and renderer is missing here.
This PR checks if values have a leading zero, and then does no processing at all. In the process I added some more tests to clarify existing behavior.
Checks have failed due to ERROR: InterpreterNotFound: python3.3
. I don't think there's anything wrong with my own tests.
ping @jpadilla
Hey @gertjanol thanks for putting this together. I'm kind of torn and not sure what the right solution would be. This as is would be a breaking change right now. Also, it would now parse "0"
as a string instead of an int, right?
At this point I'm inclining towards perhaps making a note of the existing behavior in the docs and just leaving it to users to implement their expected type conversions.
Myeah, you're right about this being a breaking change. I'm not sure how big of an issue that is though :).
The issue with '0
being parsed as a string is a valid one indeed. We could fix that by checking if the entire string is made up of 0
's, for instance. And in that case, cast it to an int.
How do you envision users implementing their own type conversion? Subclass this Parser? def _type_convert
is private/protected, so shouldn't be overriden (by convention).
@gertjanol I'd rather not make this type of assumption going on forward and instead enable users to override the parser however they see fit.
I'm actually inclined on just adding/documenting the necessary public API methods, making zero assumptions the current _type_convert()
method. Similar to how I think FormParser
must behave, and release new major version.
Hey @jpadilla. Alright, sounds like a plan! So then the forced cast to int
would also go, or do you want to keep that in there to retain backwards compatibility?
Another solution I thought of, that might be the most simple, is adding a setting for this specific use case. Something like this
# Should data like <code>000123</code> be treated as strings (thereby keeping the leading zeroes), or be cast to an integer (thereby discarding the zeroes).
XML_KEEP_LEADING_ZEROES = False
Default would be False
so current behavior doesn't change.
Drawback is that this might lead to other settings for every quirk that you can think of :).
@gertjanol that's right, by default this new method will just return whatever the original value was. In the release note we would have to document what the original behavior was so people depending on it can just copy/paste a parser with that behavior.
See #26