jackson-databind

Suggestion: Support value deduplication for enumeration like values

Open CodingFabian opened this issue 3 years ago • 5 comments

There are many use cases where the JSON to be deserialized contains pseudo-enumeration values as strings. Take a look at these examples:

{ "type": "cat", "name": "Toby"} 

or

{ "categories": ["food", "vegetables", "green"], "name": "beans" }

If you build anything that keeps the deserialized Java objects in memory, you end up holding many copies of the same string. What you can then do manually is deduplicate the values after deserializing the object:

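  // Requires: import java.util.concurrent.ConcurrentHashMap;
  // Holds one canonical instance for every distinct value seen so far.
  private static final ConcurrentHashMap<String, String> values = new ConcurrentHashMap<>();

  public static String deduplicateValue(String value) {
    // Returns the first-seen instance for any equal string.
    return values.computeIfAbsent(value, v -> v);
  }

  public static void deduplicateAnimal(Animal a) {
    a.setType(deduplicateValue(a.getType()));
  }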
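
To make the effect of the manual approach visible, here is a minimal, self-contained sketch using only the JDK. The class name `DedupSketch` and the null check are my additions for illustration, not part of the snippet above; `new String(...)` is used only to force two distinct instances with equal content, roughly what a parser produces for repeated JSON values.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the name DedupSketch is an assumption for this sketch.
final class DedupSketch {
  // Holds one canonical instance for every distinct value seen so far.
  private static final ConcurrentHashMap<String, String> values = new ConcurrentHashMap<>();

  static String deduplicateValue(String value) {
    // ConcurrentHashMap rejects null keys, so null is passed through unchanged.
    return value == null ? null : values.computeIfAbsent(value, v -> v);
  }

  public static void main(String[] args) {
    // Two equal strings that are separate objects on the heap.
    String first = new String("cat");
    String second = new String("cat");
    System.out.println(first == second); // prints false: distinct objects

    // After deduplication both references point to the same canonical object.
    String a = deduplicateValue(first);
    String b = deduplicateValue(second);
    System.out.println(a == b); // prints true
  }
}
```

Note that the map keeps a strong reference to every value it has seen, so this only pays off when the set of distinct values really is limited.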

It would be really sweet if Jackson could be told that specific fields contain a (limited) number of enumeration-like values, so that it deduplicates those strings automatically.

We have processes where this saves gigabytes of memory and thus a lot of money too.

An alternative solution would be to use the string deduplication feature of G1GC (`-XX:+UseStringDeduplication`), but that might not be available. It also comes with some extra cost.

CodingFabian Jun 22 '21 08:06