conduit
conduit copied to clipboard
Subtle cleanup issue with zipConduitApp
My understanding is that a Pipe
should not generally be abandoned except on Done result
or by running the final
cleanup associated with a HaveOutput pipe final output
.
Part of the tricky bit of the zipConduitApp
combinator is combining two pipes into one in such a way that this is maintained. That is, maintaining the invariant that, if the user does not abandon the combined pipe except as a above, then neither of the two combined pipes will be abandoned except as above also.
As implemented, this is done by keeping track of the last final
operation associated with each of the pipes, respectively finalX
and finalY
, and providing a combined final = finalX >> finalY
operation for each HaveOutput
generated by the combined pipe.
The problem I see with this is finalX
and finalY
are return ()
until the respective pipes generate a HaveOutput
. This will cause problems for some pipes. For example, imagine the right pipe does the following
- opens a file (
PipeM
), - yields some output (
HaveOutput
with a close the file finalizer), - ...
while the left pipe does the following
- yields some output (
HaveOutput
with a do nothing finalizer), - ...
The combined pipe will then
- opens the right right pipe's file (
PipeM
), - yield the left pipe's output (
HaveOutput
with a do nothing finalizer), - ...
The issue here is if the downstream pipe chooses to abort at (2) then the right pipe will not get its cleanup done. This actually extends past the initial condition being set as well. For example, it is possible to imagine a pipe doing a HaveOutput
followed by a PipeM
operation that acquires further resources and thus invalidates the final
operation associated with the earlier HaveOutput
.
The only solution I can see is that the combined pipe cannot pass along a HaveOutput
(give the user an option to abort with a combined final
) unless the combined pipes both have not been allowed to make further progress themselves (i.e., their last action was HaveOutput
or Done
).
Cheers! -Tyson
PS: I believe there are also issues with the way the Done
state is handled. A combined pipe going into the Done
state does not currently drop the last HaveOutput
final
operation associated with it.
If a user then chooses to abort on a combined Output
with one of the combined pipes being in the Done
state, the combined final
operation will run the final
associated with the Done
pipe's last HaveOutput
. This could leading to such things as trying to close the same file twice.
Can you provide a test case demonstrating the problem?
On Fri, Jan 27, 2017 at 7:22 PM, Tyson Whitehead [email protected] wrote:
My understanding is that a Pipe http:///snoyberg/conduit/blob/be803218b5b2acaad2eb720ca3a27a4d0734fba8/conduit/Data/Conduit/Internal/Pipe.hs#L89 should not generally be abandoned except on Done result or by running the final cleanup associated with a HaveOutput pipe final output.
Part of the tricky bit of the zipConduitApp http:///snoyberg/conduit/blob/be803218b5b2acaad2eb720ca3a27a4d0734fba8/conduit/Data/Conduit/Internal/Conduit.hs#L536 combinator is combining two pipes into one in such a way that this is maintained. That is, maintaining the invariant that, if the user does not abandon the combined pipe except as a above, then neither of the two combined pipes will be abandoned except as above also.
As implemented, this is done by keeping track of the last final operation associated with each of the pipes, respectively finalX and finalY, and providing a combined final = finalX >> finalY operation for each HaveOutput generated by the combined pipe.
The problem I see with this is finalX and finalY are return () until the respective pipes generate a HaveOutput. This will cause problems for some pipes. For example, imagine the right pipe does the following
- opens a file (PipeM),
- yields some output (HaveOutput with a close the file finalizer),
- ...
while the left pipe does the following
- yields some output (HaveOutput with a do nothing finalizer),
- ...
The combined pipe will then
- opens the right right pipe's file (PipeM),
- yield the left pipe's output (HaveOutput with a do nothing finalizer),
- ...
The issue here is if the downstream pipe chooses to abort at (2) then the right pipe will not get its cleanup done. This actually extends past the initial condition being set as well. For example, it is possible to imagine a pipe doing a HaveOutput followed by a PipeM operation that acquires further resources and thus invalidates the final operation associated with the earlier HaveOutput.
The only solution I can see is that the combined pipe cannot pass along a HaveOutput (give the user an option to abort with a combined final) unless the combined pipes both have not been allowed to make further progress themselves (i.e., their last action was HaveOutput or Done).
Cheers! -Tyson
PS: I believe there are also issues with the way the Done state is handled. A combined pipe going into the Done state does not currently drop the last HaveOutput final operation associated with it.
If a user then chooses to abort on a combined Output with one of the combined pipes being in the Done state, the combined final operation will run the final associated with the Done pipe's last HaveOutput. This could leading to such things as trying to close the same file twice.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/snoyberg/conduit/issues/293, or mute the thread https://github.com/notifications/unsubscribe-auth/AADBB97OFPi1zQXdNiPkptlNH7j2HDjkks5rWifugaJpZM4LwAg_ .
I was just independently studying the zipConduitApp
code and noticed the confusing handling of finalX
and finalY
which might explain the problem:
go _ finalY (HaveOutput x finalX o) y = HaveOutput
(go finalX finalY x y)
(finalX >> finalY)
o
This executes both finalX >> finalY
as the finalizer, but also passes them both on. If the next item is another HaveOutput x
, then finalY
will be executed again. If it's a HaveOutput y
, then finalX
will be. It's not completely clear to me why the finalX
and finalY
arguments are needed here at all. It seems like this should just be:
go (HaveOutput x finalX o) y = HaveOutput
(go x y)
finalX
o
Finalizers are never lost this way, and each one appears only once. But maybe there's some more complicated case I don't understand this is meant to handle.
Sorry for not getting back to you yet. I will write up an example as you requested. Just have to meet an end-of-month deadlines first. :smile: