django-rest-framework-xml icon indicating copy to clipboard operation
django-rest-framework-xml copied to clipboard

Parse zero padded integers as strings

Open gertjanol opened this issue 7 years ago • 8 comments

Thanks for your work in this library! We're considering using it for a project, but one of the things that we run in to is that our users have values in their XML that look (and parse) like integers, but are actually strings. These values look like <code>000123</code>. The parser incorrectly assumes these are integers, and in the process throws away information (the leading zeroes). The renderer works correctly though: the Python-structure {'code': '000123'} is rendered correctly to <code>000123</code>. So for one, the symmetry between parser and renderer is missing here.

This PR checks if values have a leading zero, and then does no processing at all. In the process I added some more tests to clarify existing behavior.

gertjanol avatar Nov 27 '17 14:11 gertjanol

Checks have failed due to ERROR: InterpreterNotFound: python3.3. I don't think there's anything wrong with my own tests.

gertjanol avatar Nov 28 '17 08:11 gertjanol

ping @jpadilla

gertjanol avatar Dec 04 '17 08:12 gertjanol

Hey @gertjanol thanks for putting this together. I'm kind of torn and not sure what the right solution would be. This as is would be a breaking change right now. Also, it would now parse "0" as a string instead of an int, right?

At this point I'm inclining towards perhaps making a note of the existing behavior in the docs and just leaving it to users to implement their expected type conversions.

jpadilla avatar Dec 04 '17 14:12 jpadilla

Myeah, you're right about this being a breaking change. I'm not sure how big of an issue that is though :). The issue with '0 being parsed as a string is a valid one indeed. We could fix that by checking if the entire string is made up of 0's, for instance. And in that case, cast it to an int.

How do you envision users implementing their own type conversion? Subclass this Parser? def _type_convert is private/protected, so shouldn't be overriden (by convention).

gertjanol avatar Dec 04 '17 15:12 gertjanol

@gertjanol I'd rather not make this type of assumption going on forward and instead enable users to override the parser however they see fit.

I'm actually inclined on just adding/documenting the necessary public API methods, making zero assumptions the current _type_convert() method. Similar to how I think FormParser must behave, and release new major version.

jpadilla avatar Dec 07 '17 02:12 jpadilla

Hey @jpadilla. Alright, sounds like a plan! So then the forced cast to int would also go, or do you want to keep that in there to retain backwards compatibility?

Another solution I thought of, that might be the most simple, is adding a setting for this specific use case. Something like this

# Should data like <code>000123</code> be treated as strings (thereby keeping the leading zeroes), or be cast to an integer (thereby discarding the zeroes).
XML_KEEP_LEADING_ZEROES = False

Default would be False so current behavior doesn't change.

Drawback is that this might lead to other settings for every quirk that you can think of :).

gertjanol avatar Dec 07 '17 08:12 gertjanol

@gertjanol that's right, by default this new method will just return whatever the original value was. In the release note we would have to document what the original behavior was so people depending on it can just copy/paste a parser with that behavior.

jpadilla avatar Dec 08 '17 02:12 jpadilla

See #26

gertjanol avatar Dec 11 '17 15:12 gertjanol