jason
Streaming support
Are there any plans to add streaming support?
What exactly do you mean by streaming?
I think they mean SAX-style support for reading, maybe?
@michalmuskala Like this: https://github.com/talentdeficit/jsx#incomplete-input
Something that allows decoding JSON partially, on the fly, without needing the whole document up front.
Ah. Thanks. I think such a feature indeed makes sense.
I even did some work on implementing it in https://github.com/michalmuskala/jason/pull/3, but ultimately abandoned it before 1.0 due to complexity. I think it makes sense to implement it now, though. This should also allow decoding directly from iodata without converting to a single string first, which might help with performance in some cases.
The only problem with streaming decoding is the handling of numbers - you never know when they end. I think it would be reasonable, though, to require top-level values to be strings, objects or arrays.
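To illustrate the number problem, here is a sketch of why a bare top-level number is ambiguous for a chunked decoder (the chunks below are illustrative, not a real Jason API):

```elixir
# Suppose a streaming decoder has consumed only this chunk at the top level:
chunk = "12"
# It cannot tell whether the value is the number 12, or a prefix of 123,
# 12.5, 12e3, etc. - only a delimiter or true end-of-input settles it,
# and a stream never signals "end" mid-chunk.
#
# Strings, objects and arrays carry explicit terminators, so there is no
# such ambiguity: the decoder knows it is done when it sees them.
Jason.decode!(~s("12"))   # string - terminated by the closing quote
Jason.decode!("[12]")     # array  - terminated by ]
Jason.decode!(~s({"n":12})) # object - terminated by }
```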
Also, how about streaming encoding, when your top-level input term can be guaranteed to be an Enumerable?
Basically, an API function to do this (with perhaps lower overhead):
```elixir
Stream.concat([
  ["[\n"],
  my_huge_stream |> Stream.map(&Jason.encode_to_iodata!/1) |> Stream.intersperse(",\n"),
  ["]"]
])
|> Enum.into(File.stream!("foo"))
```
Any news on this? I'm willing to pitch in if it helps move things along, but I have no idea where to start.
As mentioned in #34, if your JSON is line-delimited (and in most "streaming" cases it actually is), the problem is trivial: split on newlines and feed each part separately to Jason.decode/2.
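For example, a newline-delimited (NDJSON) file can be decoded lazily with the existing API - the filename here is just an illustration, assuming one complete JSON document per line:

```elixir
# Decode an NDJSON file one document at a time, using only Jason.decode!/1.
# File.stream!/1 yields the file line by line, so memory use stays bounded.
"events.ndjson"
|> File.stream!()
|> Stream.map(&String.trim/1)
|> Stream.reject(&(&1 == ""))
|> Stream.map(&Jason.decode!/1)
|> Enum.each(&IO.inspect/1)
```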
Full streaming implementation is going to be rather complex and will require significant changes to the parser - the simplest approach would be to return a continuation at each point in the parser where it errors with the "unexpected eof" error. Later the user could call the continuation with more data.
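A continuation-based API along those lines might look something like this - purely a sketch; `decode_partial/1` and the `{:continuation, fun}` return shape are hypothetical, not part of Jason:

```elixir
# Hypothetical usage of a continuation-returning parser: wherever the
# parser would error with "unexpected eof", it instead returns a function
# that accepts more input and resumes from the same parse state.
case Jason.decode_partial(chunk1) do
  {:ok, term} ->
    term

  {:continuation, cont} ->
    # Input was incomplete; call the continuation with the next chunk.
    case cont.(chunk2) do
      {:ok, term} -> term
      {:continuation, _cont} -> raise "input still incomplete"
    end
end
```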
There was some initial work in https://github.com/michalmuskala/jason/pull/3, but I abandoned it in favour of releasing 1.0 faster. Unfortunately, I probably won't have time to work on this soon.