Dependency Management - JSON
At the moment, we depend on 4 different JSON libraries (including transitive dependencies):
- json-simple
- json.org
- Gson
- Jackson
For the sake of fewer dependencies, we should consider removing most of them.
The top candidates to keep are Jackson and Gson, as Jackson is heavily used by Tinkerpop, and Gson is heavily used by Jest and Hadoop. This requires further analysis of which API is easier to use and which performs better.
- Tinkerpop
- Jest
- Hadoop
Either way, this will require some refactoring to remove the other libraries.
Doesn't Tinkerpop use Jackson?
True, yet gremlin-spark uses Gson:
From: 'Google, Inc.' (http://www.google.com)
- Gson (http://code.google.com/p/google-gson/) com.google.code.gson:gson:jar:2.2.4
License: The Apache Software License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0.txt)
Also, when testing ETL processes with Gson vs. Jackson, Jackson seems to perform better, so we should possibly consider using it over Gson.
@rmagen amended the issue according to the comment.
I ran into a problem with Gson: it deserializes Integers as Doubles when parsing into untyped objects. See http://stackoverflow.com/a/17090933/3592312. In the meantime I'm using Jackson to parse the JSON, as suggested in the link, at least until we delve deeper into this issue.
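For anyone who wants to reproduce this, here is a minimal sketch of the difference. It assumes Gson and Jackson 2.x (`com.fasterxml.jackson.databind`) are both on the classpath; the class name `JsonNumberTypes`, the helper methods, and the sample JSON are made up for illustration and are not taken from our code.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.gson.Gson;

import java.io.IOException;
import java.util.Map;

// Hypothetical demo class, not part of the project.
public class JsonNumberTypes {

    // Gson, parsing into an untyped Map: every JSON number becomes a Double.
    @SuppressWarnings("unchecked")
    static Map<String, Object> parseWithGson(String json) {
        return new Gson().fromJson(json, Map.class);
    }

    // Jackson, parsing into an untyped Map: a small integral number becomes an Integer.
    @SuppressWarnings("unchecked")
    static Map<String, Object> parseWithJackson(String json) throws IOException {
        return new ObjectMapper().readValue(json, Map.class);
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"count\": 3}"; // sample input, made up for this sketch

        Object fromGson = parseWithGson(json).get("count");
        Object fromJackson = parseWithJackson(json).get("count");

        System.out.println("Gson:    " + fromGson.getClass().getSimpleName());
        System.out.println("Jackson: " + fromJackson.getClass().getSimpleName());
    }
}
```

Note that this only affects untyped targets such as `Map` or `Object`. When Gson deserializes into a class with an `int` field, the field is populated as an int, so the problem may be avoidable in places where we can use typed POJOs.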