
GsonParser converting longs to doubles

Open cmuchinsky opened this issue 6 years ago • 9 comments

The GsonParser converts longs to doubles in its numberHolder implementation. Instead of calling return jsonProvider.primitive(jsonReader.nextDouble()); perhaps something like this would work better:

final String value = jsonReader.nextString();
try {
    return jsonProvider.primitive(Long.parseLong(value));
}
catch (final NumberFormatException e) {
    return jsonProvider.primitive(Double.parseDouble(value));
}

cmuchinsky avatar Sep 07 '18 15:09 cmuchinsky

Did you test it? Are you sure no exception will be thrown when calling nextString() after a "NUMBER" token?

wanglingsong avatar Sep 08 '18 06:09 wanglingsong

Yes, using the attached snippet, it worked as expected. Per the JsonReader.nextString javadoc: "If the next token is a number, this method will return its string form."

cmuchinsky avatar Sep 10 '18 09:09 cmuchinsky

I think it would introduce too much overhead, since each number could end up being parsed two more times. If you really need the long type, I think you can implement a custom JsonProvider.
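To make the overhead concern concrete, here is a minimal sketch of one way to avoid the exception-driven fallback: scan the literal once for a fraction or exponent marker and only then pick a parser. The class NumberLiterals and the method parseNumber are hypothetical names used for illustration; they are not part of JsonSurfer or Gson, and whether this is actually cheaper in practice would need measuring.

```java
public class NumberLiterals {

    // Hypothetical helper: classify a JSON number literal as Long or Double.
    public static Number parseNumber(String literal) {
        // A JSON number is integral only if it has no fraction or exponent part.
        boolean integral = true;
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == '.' || c == 'e' || c == 'E') {
                integral = false;
                break;
            }
        }
        if (integral) {
            try {
                return Long.valueOf(Long.parseLong(literal));
            } catch (NumberFormatException e) {
                // Integral but outside the long range: fall through to double.
            }
        }
        return Double.valueOf(Double.parseDouble(literal));
    }

    public static void main(String[] args) {
        // Print the runtime type chosen for a few sample literals.
        for (String s : new String[] {"42", "-7", "42.0", "1e3"}) {
            System.out.println(s + " -> " + parseNumber(s).getClass().getSimpleName());
        }
    }
}
```

With this shape, the exception path is only taken for integral literals that overflow a long, which should be rare in typical data.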

wanglingsong avatar Sep 11 '18 07:09 wanglingsong

Doing this at the provider level could work, but I believe the core issue is in the parser, since that's where the long is forced into a double via the call to jsonReader.nextDouble(). By the time the value reaches the provider, it has already been turned into a double. The JsonReader class does keep track internally of whether the number is a long or a double, but unfortunately it doesn't expose that information to public consumers. I will check whether it's possible to extend JsonReader to gain access to the peeked member, which, if set to 15, indicates a long rather than a double.

cmuchinsky avatar Sep 11 '18 15:09 cmuchinsky

Unfortunately, it looks like JsonReader::peeked is package scoped and not protected.

cmuchinsky avatar Sep 11 '18 15:09 cmuchinsky

Actually, I'm curious about your use case. What benefit do you gain from such a conversion?

wanglingsong avatar Sep 11 '18 15:09 wanglingsong

The use case is that the JSON we parse and filter needs to retain its original formatting, so that when we do schema inference a value's type doesn't change from a long to a double.
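As an illustration of why this matters for type-sensitive consumers, the following sketch shows two effects of routing an integral value through a double: the rendered text changes shape, and large values can silently change. The class LongToDoubleDemo and its methods are hypothetical and exist only for this example.

```java
public class LongToDoubleDemo {

    // Render an integral literal the way it looks after being read as a double.
    public static String renderAsDouble(String literal) {
        return String.valueOf(Double.parseDouble(literal));
    }

    // Convert a long to a double and back, as a lossy parser path would.
    public static long throughDouble(long value) {
        return (long) (double) value;
    }

    public static void main(String[] args) {
        // An integer literal gains a fractional part once it is a double.
        System.out.println(renderAsDouble("42")); // prints 42.0

        // 2^53 + 1 is not representable as a double, so the value itself changes.
        long big = 9007199254740993L;
        System.out.println(throughDouble(big) == big); // prints false
    }
}
```

A schema inferred from the first output would likely see a floating-point field where the source document had an integer.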

cmuchinsky avatar Sep 11 '18 15:09 cmuchinsky

Given this limitation of Gson, maybe you can try another JsonSurfer implementation, e.g. JacksonSurfer.

wanglingsong avatar Sep 11 '18 15:09 wanglingsong

Will give it a look. Ideally I want an implementation that I can use in a streaming read and provider scenario: as the data is read and filtered with a JSON path, the output is fed to a provider that simply streams it out the other side. That way, if I hit a massive JSON document with a JSON path like $.*, it wouldn't blow up trying to assemble the entire document in memory.

cmuchinsky avatar Sep 11 '18 16:09 cmuchinsky