Document how to configure `Oj.dump` to make it compatible with `Oj::Parser.usual`

Open gschlager opened this issue 1 year ago • 1 comments

By default, Oj.dump outputs JSON that can't be read by Oj::Parser.usual. For example, with the default settings, hashes with symbol keys and classes aren't parsed correctly.

Hashes

require "oj"

json = Oj.dump({a: 1, b: 2})
Oj::Parser.usual.parse(json)

Expected result

{:a=>1, :b=>2}

Actual result

{":a"=>1, ":b"=>2}

Objects

class Foo
  attr_accessor :name

  def initialize(name = nil)
    @name = name
  end
end

json = Oj.dump(Foo.new("bar"))
Oj::Parser.usual.parse(json)

Expected result

#<Foo:0x00007055bd5d40e8 @name="bar">

Actual result

{"^o"=>"Foo", "name"=>"bar"}

Could we document the best settings so that hashes and classes work? Or maybe even implement a mode and preconfigured parser for it?

The following settings made it work in my tests, but I'm not sure if that's the recommended solution.

Oj.default_options = { mode: :custom, create_id: "^", create_additions: true }

parser = Oj::Parser.new(:usual, cache_keys: true, symbol_keys: true, create_id: "^")

BTW: I tried configuring "^o" as create_id for the parser, but I'm getting an error.

irb(main):039> Oj::Parser.new(:usual, create_id: "^o")
(irb):39:in `new': The create_id values is limited to 2 bytes. (ArgumentError)

irb(main):040> "^o".bytesize
=> 2

Is this a bug, is there an off-by-one error in the error message, or am I doing something wrong?

Jul 31 '24 11:07 gschlager

Oj has been around for a long time. The first mode it had was the object mode. This allowed objects to be dumped and loaded. It was and still is the default mode mostly due to not wanting to break the code of early adopters. The page that describes the format used for the object mode is: https://github.com/ohler55/oj/blob/develop/pages/Encoding.md. This mode is supported by the Oj.load and Oj.parse functions which are also part of the original behavior.

Fast forward to the newer Oj::Parser, which was developed to provide not only better performance but, more importantly, a means of isolating one set of parse configurations from all others, so as to avoid conflicts between the default options and the options passed to specific calls. Since the lack of isolation revolved around the use of Rails and the JSON gem, I did not implement a delegate for the original object mode.

With that history in mind, calling Oj.dump without changing the mode will use object mode for encoding. The result can be used to reconstitute objects with the Oj.xxx calls but not with the Oj::Parser. As you discovered, the other option is to use the create_id. (You did uncover a bug there. The calculation for the limit of the create_id length is off. My tests only used a one-character create_id, so I didn't catch it.)

Jul 31 '24 15:07 ohler55