format_parser icon indicating copy to clipboard operation
format_parser copied to clipboard

Return a more uniform structure in as_json

Open linkyndy opened this issue 4 years ago • 4 comments

At the moment, results are returned as a regular hash, with keys and values containing a mix of strings and symbols:

{"nature"=>:archive, "intrinsics"=>nil, "entries"=>[{:type=>:file, :size=>53335, :filename=>"foo"}, {:type=>:file, :size=>292210, :filename=>"bar"}, {:type=>:file, :size=>1965044, :filename=>"baz"}], "format"=>:zip}

It happened to me several times that I was getting the key or comparing the value using the wrong type. I think normalising the return value of as_json would be a nice improvement.

Currently, I am doing this to normalise the result hash:

JSON.load(JSON.dump(result))

Here are the options I see:

  • use strings or symbols for all key/values (since this is meant to serialise to JSON at some point, it might make sense to stick to strings)
  • use a hash with indifferent access; however, that likely adds a dependency and diverts from the idea of having a very strict contract and the least number of dependencies

linkyndy avatar May 13 '20 11:05 linkyndy

After trying yo use the new version with stringify_keys:true in our service, we realized it wasn't enough 😞 .

When the hash has non-strings as values, it raises errors like the following one, when trying to serialize it (when saving to Dynamodb):

Errors > JSON::GeneratorError#15
source sequence is illegal/malformed utf-8

So, the result of as_json should be ready to be serialized to JSON, with keys and values converted to String. This is how activemodel#as_json works.

A possible solution is having an option stringify_keys_and_values: true, which is not a good name (too long) or something similar, to be backwards compatible.

wdyt @linkyndy ?

fabioperrella avatar Jul 17 '20 13:07 fabioperrella

I don't believe that error is raised because there are symbols used in the hash. But I didn't dig into it, so you may be right.

One easy fix is to make the method return strings only, and bump the version to a major one 🙂 . Or, like you said, you can use an option to keep it backwards compatible (use_strings?).

linkyndy avatar Jul 20 '20 11:07 linkyndy

I always prefer to keep compatible when possible! 😄

use_strings seems good for me!

So, I will open another PR here and close https://github.com/WeTransfer/peek/pull/127

fabioperrella avatar Jul 20 '20 14:07 fabioperrella

I realized the problem with JSON.dump(result) (source sequence is illegal/malformed utf-8) happens when the result hash contains non utf-8 strings inside.

For example, when the filename is something like "foo\xA9s"

fabioperrella avatar Jul 21 '20 19:07 fabioperrella