aeson
aeson copied to clipboard
Rework GFromJSON instances to enable elimination of Generics
The Parser monad signature makes inlining very difficult because GHC only inlines saturated applications, which is hard to achieve with Parser.
We work around this by defining an alternate method in GFromJSON which manipulates the semantically equivalent IResult monad (Parser is technically more expressive but not for the terms appearing in GFromJSON). The existing method is left for API compatibility.
The resulting code achieves rougly the same compilation speed but reaches large performance improvements, for a generated code size not far from the manually written instances in benchmarks.
The object file for Twitter.Generic grows to 332kB (from 285kB), but the Twitter.Manual module is 300kB large. The inlining level of typeMismatch has been reduced to curb code blow-up.
Due to inlining levels differences, the Generic fromJSON is now often quicker than its TH or Manual counterpart in benchmarks.
Benchmark results (Core i5 Haswell)
| TH | Generic (before) | Generic (after) | Manual | |
|---|---|---|---|---|
| D/fromJSON | 1.720 µs | 5.076 µs | 1.505 µs | - |
| BigRecord/fromJSON | 3.720 µs | 8.450 µs | 2.964 µs | - |
| BigProduct/fromJSON | 2.412 µs | 3.427 µs | 1.812 µs | - |
| BigSum/fromJSON | 173 ns | 1230 ns | 171 ns | - |
| twitter100 | 1.611 ms | 2.063 ms | 1.334 ms | 1.516 ms |
| jp100 | 1.844 ms | 2.18 ms | 1.481 ms | 1.677 ms |
That's very cool!
It's unfortunate that gParseJSON is exported in the toplevel modules Data.Aeson/Data.Aeson.Types. Users should really not depend on these Generics internals. It would be a good idea to hide this at some point, if not now, so we can just remove gParseJSON.
What is the process for contributions ?
Just wait for bergmark to look at the PR.
Thanks for the contribution! And sorry for the delay. I'll try to review this and #652 tonight.
This looks great to me! As for #652 I was surprised that it didn't make compilation speed a lot worse. If someone else could do a review as well, I'd appreciate it.
It may be that this will cause issues with older GHCs, as with #652, but if so we can consider dropping some support.
Re gparseJson being exported - I wonder if anyone uses it? I can create a separate ticket for that as this change doesn't seem to require a major bump.
This patch may increase compilation time of records on GHC 7.10 and 8.0 (see #652)
This may be alleviated by adding {-# NOINLINE gParseJSON #-} whenever gParseJSON' is already defined (since the other one becomes unused). Tests on Text.Pandoc.Definition from pandoc-types show 500MB of resident memory reduction out of 1.5-2GB.
I'm not sure I understand what you mean by "whenever gParseJSON is already defined", Is that something we could do as a part of this PR?
The result of Generics here seem even better than the TH ones. What is blocking this PR?
Sorry for the delay. I had a question in my last comment, and i just internalized this PR as "needs more work" but really I think we should merge this. I'll try to look into the merge conflict some time unless someone beats me to it.
This conflicts, and I'm worried that class now has two-fields. It's most likely affects optimisation as it's no more newtype-like.
After a bit of thought, let us leave this from 1.5 (It's indeed could be fixed in minor version, as we don't export internals).
I'm worried that class now has two-fields. It's most likely affects optimisation as it's no more newtype-like.
The OP demonstrates a significant runtime performance increase though.
This PR isn't very complicated, might be a good idea to solve conflicts and revisit it. I might do that at some point (and simultaneously look into compile time performance of generic deriving of instances too as it's currently slow).