conduit Performance issue with lifted version of `Control.Monad.Reader.local`?

There is a significant difference in time and space usage when running code that uses `local` in `ConduitM _ _ (ReaderT r m)` vs. `ReaderT r m`.
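
For readers without the repository at hand, the comparison can be sketched roughly as follows. This is not the code from the linked repository: the names `plainLoop` and `conduitLoop`, the depth of 100000, and the argument handling are assumptions made for illustration, and the sketch may not reproduce the exact numbers discussed in this thread. It assumes the `conduit` (1.3 or later, where `ConduitT` is the current name for `ConduitM`) and `mtl` packages.

```haskell
module Main (main) where

import Control.Monad.Reader (ReaderT, ask, local, runReaderT)
import Data.Conduit (ConduitT, runConduit)
import Data.Void (Void)
import System.Environment (getArgs)

-- Hypothetical workload: nest `local` n times, then read the environment.
plainLoop :: Int -> ReaderT Int IO Int
plainLoop 0 = ask
plainLoop n = local (+ 1) (plainLoop (n - 1))

-- The same workload, but `local` is the version lifted through ConduitT.
conduitLoop :: Int -> ConduitT () Void (ReaderT Int IO) Int
conduitLoop 0 = ask
conduitLoop n = local (+ 1) (conduitLoop (n - 1))

main :: IO ()
main = do
  args <- getArgs
  let depth = 100000 -- assumed value, chosen only for illustration
  result <- case args of
    ["plain"]   -> runReaderT (plainLoop depth) 0
    ["conduit"] -> runReaderT (runConduit (conduitLoop depth)) 0
    _           -> fail "usage: reader (plain|conduit)"
  print result
```

Both variants compute the same value; the only difference is which `MonadReader` instance supplies `local`.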

Steps to reproduce:

1. Clone https://github.com/sol/conduit-issue and build it with `-O0`.
2. Run `reader +RTS -s -K1K -RTS plain` for reference and see it succeed.
3. Run `reader +RTS -s -K1K -RTS conduit` and see it fail.

`reader +RTS -hT -i0.05 -RTS conduit && hp2ps -c reader.hp && convert -rotate -90 reader.ps reader.png` gives some insight into how memory is used:

[image: reader]

I discovered this while working on https://github.com/snoyberg/yaml/pull/148. I'm not sure whether this is something that is inherent or if it can be fixed.

I'm also not sure whether it's worth tackling at all.

Aug 28 '18 08:08 sol

CCing @ndmitchell

Aug 28 '18 10:08 sol

It doesn't mean much to look at these stats with optimizations disabled. What happens if you turn on -O1?

Aug 28 '18 13:08 snoyberg

With -O1 the program runs faster, which is why I changed the sampling interval to -i0.03:

[image: reader]

Still has the final spike in stack usage.

Aug 28 '18 14:08 sol

BTW, this is what I see if I change the final ask to return () (read: we never look at the reader value):

[image: reader]

Aug 28 '18 14:08 sol

memory usage, not stack usage.

I don't view that spike at the end as particularly significant - if you removed it you save 20% max memory? The green THUNK at the start is what I'd target, as it feels wrong - like a space leak, but perhaps not one of the ones that gets caught by the -K trick.

Aug 28 '18 14:08 ndmitchell

> I don't view that spike at the end as particularly significant

Hmm, with "spike" I was really only referring to the spike in stack usage towards the end (the light blue segment in the -O1 case), which I assume is due to evaluating that purple THUNK_2_0 thingy.

Stack usage peaks around 150M, which can also be confirmed with:

$ reader +RTS -s -K140M -RTS conduit -- fails
$ reader +RTS -s -K150M -RTS conduit -- succeeds

So my comment was really just about the symptom (high stack usage), rather than the underlying problem (thunking).

> The green THUNK at the start is what I'd target, as it feels wrong

I will probably not have much time to dig much deeper into this. I'm also still not sure how relevant this whole thing is. Yes, it feels wrong and it had a measurable impact on how I used it in yaml. But, I don't know how efficient it can be made; and I don't know how much this matters in practice (how often do people actually do recursive calls to local)?

What I do know is that the non-conduit version is much, much more efficient, especially with -O1.
This is the code that produces the thunks: https://github.com/snoyberg/conduit/blob/conduit-1.3.0.3/conduit/src/Data/Conduit/Internal/Conduit.hs#L163-L169
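
To make the shape of such a lifted `local` concrete, here is a simplified model. It is an illustrative sketch, not conduit's actual code: the `Pipe` type below is a hypothetical two-constructor stand-in, and `localPipe`, `askPipe`, and `nested` are names invented for this example. The idea it shows is that lifting `local` through a pipe-like structure means wrapping each monadic step in `local f` and then mapping the same traversal over whatever that step returns, so every nested call adds another layer of wrapping. It needs only `mtl`.

```haskell
module Main (main) where

import Control.Monad.Reader (MonadReader, ReaderT, ask, local, runReaderT)

-- Hypothetical stand-in for a pipe: either finished, or one monadic step
-- that yields the rest of the pipe. This is not conduit's real type.
data Pipe m a
  = Done a
  | PipeM (m (Pipe m a))

-- Run all monadic steps in order.
runPipe :: Monad m => Pipe m a -> m a
runPipe (Done a)  = return a
runPipe (PipeM m) = m >>= runPipe

-- Lifted `local`: wrap the monadic step in `local f`, then apply the same
-- traversal (lazily, via fmap) to the pipe that the step returns.
localPipe :: MonadReader r m => (r -> r) -> Pipe m a -> Pipe m a
localPipe f = go
  where
    go (Done a)  = Done a
    go (PipeM m) = PipeM (fmap go (local f m))

-- Lifted `ask`: a single monadic step that reads the environment.
askPipe :: MonadReader r m => Pipe m r
askPipe = PipeM (fmap Done ask)

-- Recursive calls to the lifted `local`, ending in `ask`.
nested :: Int -> Pipe (ReaderT Int IO) Int
nested 0 = askPipe
nested n = localPipe (+ 1) (nested (n - 1))

main :: IO ()
main = runReaderT (runPipe (nested 1000)) 0 >>= print
```

Whether the `fmap go` layers in this model correspond exactly to the thunks seen in the heap profile is an assumption; the model is only meant to show why the lifted version does more work per call than `local` on `ReaderT` directly.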

> perhaps not one of the ones that gets caught by the -K trick

I was naively assuming that it does. Stack usage seems to be in the couple of megabytes during the whole run of the program. Or do you mean that the high stack usage is not related to thunk evaluation, but something else?

Aug 28 '18 15:08 sol

Can you try liberally sprinkling INLINEs on the typeclass instances?
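
For reference, this is roughly what INLINE pragmas on instance methods look like. The `Env` type and its helpers are hypothetical and exist only to show the pragma placement using nothing beyond `base`; whether doing the same on conduit's instances changes the behaviour reported here is not established in this thread.

```haskell
module Main (main) where

-- Hypothetical hand-rolled reader type, used only to show pragma placement.
newtype Env r a = Env { runEnv :: r -> a }

instance Functor (Env r) where
  fmap f (Env g) = Env (f . g)
  {-# INLINE fmap #-}

instance Applicative (Env r) where
  pure a = Env (const a)
  {-# INLINE pure #-}
  Env f <*> Env g = Env (\r -> f r (g r))
  {-# INLINE (<*>) #-}

instance Monad (Env r) where
  Env g >>= k = Env (\r -> runEnv (k (g r)) r)
  {-# INLINE (>>=) #-}

-- Read the environment.
askEnv :: Env r r
askEnv = Env id
{-# INLINE askEnv #-}

-- Run a computation in a modified environment.
localEnv :: (r -> r) -> Env r a -> Env r a
localEnv f (Env g) = Env (g . f)
{-# INLINE localEnv #-}

main :: IO ()
main = print (runEnv (localEnv (+ 1) askEnv) (0 :: Int))
```

The pragma asks GHC to inline the method at call sites where the instance is known; it has no effect at -O0, since the optimizer that performs inlining is disabled there.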

Aug 28 '18 15:08 snoyberg

Stack usage of 150 MB is really nuts. Afraid I have no idea what's going on though...

Aug 30 '18 22:08 ndmitchell
