jason

Streaming support

Open liveforeverx opened this issue 7 years ago • 7 comments

Are there any plans for having streaming support?

liveforeverx avatar Feb 13 '18 13:02 liveforeverx

What exactly do you mean by streaming?

michalmuskala avatar Feb 14 '18 11:02 michalmuskala

I think they mean SAX-style support for reading, maybe?

OvermindDL1 avatar Feb 14 '18 16:02 OvermindDL1

@michalmuskala Like this: https://github.com/talentdeficit/jsx#incomplete-input

Something that allows decoding JSON partially, on the fly, without needing the whole document up front.

liveforeverx avatar Feb 16 '18 11:02 liveforeverx

Ah. Thanks. I think such a feature indeed makes sense.

I even did some work on implementing it in https://github.com/michalmuskala/jason/pull/3, but ultimately abandoned it before 1.0 due to complexity. I think it makes sense to implement it now. This should also allow decoding directly from iodata without converting to a single string first, which might help with performance in some cases.

The only problem with streaming decoding is the handling of numbers: you never know when they end (after reading 12, the next chunk could still start with 3). I think it would be reasonable to require top-level values to be strings, objects, or arrays.

michalmuskala avatar Feb 16 '18 11:02 michalmuskala

Also, how about streaming encoding, when your top-level input term can be guaranteed to be Enumerable?

Basically, an API function to do this (with perhaps lower overhead):

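# Lazily encode each element and wrap the results in a JSON array
Stream.concat([
  ["[\n"],
  # a function capture needs an explicit arity (/1)
  my_huge_stream |> Stream.map(&Jason.encode_to_iodata!/1) |> Stream.intersperse(",\n"),
  ["]"]
])
# write each iodata chunk to the file as it is produced
|> Enum.into(File.stream!("foo"))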
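A minimal sketch of what such an API function could look like, assuming Jason is available as a dependency. The module and function names (StreamEncodeSketch, array_to_iodata_stream) are hypothetical and not part of Jason; the encoder is passed as an argument so the wrapping logic can be tried without the library.

```elixir
defmodule StreamEncodeSketch do
  # Hypothetical helper, not part of Jason: lazily wraps an enumerable of
  # terms in a JSON array, encoding one element at a time.
  # `encoder` is any 1-arity function returning iodata; the default,
  # Jason.encode_to_iodata!/1, assumes Jason is a dependency.
  def array_to_iodata_stream(enumerable, encoder \\ &Jason.encode_to_iodata!/1) do
    Stream.concat([
      ["["],
      enumerable |> Stream.map(encoder) |> Stream.intersperse(","),
      ["]"]
    ])
  end
end
```

The result is itself a lazy stream of iodata chunks, so it can be piped into Enum.into(File.stream!("foo")) in the same way as the pipeline above.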

tsutsu avatar Mar 15 '18 21:03 tsutsu

Any news on this? I'm willing to pitch in if it helps move things on, but I've no idea where to start.

jjl avatar May 17 '18 18:05 jjl

As mentioned in #34, if your JSON is line-delimited (and in most "streaming" cases it actually is), the problem is trivial: split on newlines and feed each part separately to Jason.decode/2.
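A rough sketch of the line-delimited case, assuming Jason is a dependency. The module name LineDelimitedSketch and the file name in the usage note are made up for illustration, and the raising variant Jason.decode!/1 is used here instead of Jason.decode/2 only to keep the pipeline short.

```elixir
defmodule LineDelimitedSketch do
  # Hypothetical helper, not part of Jason: lazily decodes newline-delimited
  # JSON, one line at a time, skipping blank lines.
  # `decoder` defaults to Jason.decode!/1 (assumes Jason is a dependency).
  def decode_lines(lines, decoder \\ &Jason.decode!/1) do
    lines
    |> Stream.map(&String.trim/1)
    |> Stream.reject(&(&1 == ""))
    |> Stream.map(decoder)
  end
end
```

With a file, this could be used as File.stream!("events.ndjson") |> LineDelimitedSketch.decode_lines() |> Enum.each(&IO.inspect/1), since File.stream! yields one line per element by default.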

A full streaming implementation is going to be rather complex and will require significant changes to the parser. The simplest approach would be to return a continuation at each point in the parser where it currently fails with the "unexpected eof" error. The user could later call the continuation with more data.
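To illustrate the continuation idea only, here is a toy sketch that parses a double-quoted string without escape sequences. It is not Jason's parser; the module name and the {:more, continuation} return shape are assumptions made for this example.

```elixir
defmodule ContinuationSketch do
  # Toy parser for a double-quoted string without escape sequences.
  # Returns {:ok, value, rest}, {:error, reason}, or {:more, continuation},
  # where the continuation is a 1-arity function expecting the next chunk.
  def parse_string(<<?", rest::binary>>), do: collect(rest, [])
  # Nothing to read yet: ask for more input instead of failing on eof.
  def parse_string(<<>>), do: {:more, &parse_string/1}
  def parse_string(_other), do: {:error, :unexpected_byte}

  # Closing quote found: return the collected bytes and the unread remainder.
  defp collect(<<?", rest::binary>>, acc), do: {:ok, IO.iodata_to_binary(acc), rest}
  # Input ran out mid-string: capture the accumulator in a closure.
  defp collect(<<>>, acc), do: {:more, fn more -> collect(more, acc) end}
  defp collect(<<byte, rest::binary>>, acc), do: collect(rest, [acc, byte])
end
```

A caller feeds chunks until it gets a final result: parse_string("\"he") returns {:more, k}, and calling k with the next chunk either finishes the value or returns another continuation.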

There was some initial work in https://github.com/michalmuskala/jason/pull/3, but I abandoned it in favour of releasing 1.0 faster. Unfortunately, I probably won't have time to work on this soon.

michalmuskala avatar May 18 '18 11:05 michalmuskala