Terraform should exit gracefully on SIGPIPE
Terraform Version
v1.3.0
Terraform Configuration Files
n/a
Debug Output
n/a
Expected Behavior
Terraform should flush and unlock state and exit gracefully when a SIGPIPE is received, as it does for SIGINT and SIGTERM.
Actual Behavior
Terraform exits abruptly, leaving state unflushed and locked and the infrastructure potentially out of sync with the state.
Steps to Reproduce
Run terraform apply and pipe the output to another program, such as less. While resources are being read (before you are asked to confirm), exit the program reading from the pipe. Terraform will die and exit uncleanly.
Additional Context
I'm not an expert in this area by any measure, but given the bad outcomes that can result from Terraform exiting uncleanly I think it would be nice to handle this (admittedly niche) case.
References
No response
Thanks for reporting this, @twbecker.
Indeed, it doesn't seem ideal for Terraform to just immediately exit if it's writing to a pipe whose read end has closed.
I suspect that resolving this will be easier said than done, though: if we make the Terraform process ignore SIGPIPE then it won't immediately quit but any subsequent write to stdout/stderr (assuming both are connected to the closed pipe) will then fail with EPIPE. The consequences of that will depend on exactly which codepath is the next one to write output: some writers to stdout/stderr are probably just ignoring write errors and continuing, but we can't assume that is true for all writers without some careful review.
With that said, I will note that the specific situation you described is not as problematic as it might first appear. Until the plan is confirmed (e.g. by typing yes or by overriding the prompt using -auto-approve) Terraform hasn't yet made any changes to any real infrastructure, so the worst case for an unclean exit is leaving the state lock active and requiring terraform force-unlock to recover.
If Terraform receives SIGPIPE while it's working on the apply phase then it could be worse, of course: in that case, Terraform is likely to have made changes to infrastructure that are not yet persisted in a new state snapshot, and so Terraform will lose track of those changes.
So with all of that said: I agree that this seems good to fix, but I'm also not sure it's an easy fix, and therefore we'll need to spend some time researching the practical implications of ignoring SIGPIPE before we do it. I will say that my instinct is that ignoring SIGPIPE and then immediately returning an error from an EPIPE is likely to still get a better result than immediately exiting, but we do need to make sure first that Terraform will handle it sensibly and not, for example, get wedged and never exit in some cases.
Thanks again!
Thanks for the response. I will admit to simplifying the steps to reproduce in this report relative to what we're actually hitting. So yes, you're right that exiting uncleanly before the apply actually starts is safe, but we were seeing it during apply, where it is decidedly not. I understand what you're saying about the EPIPE potentially happening on a number of codepaths, though. Would it not be possible to configure the same handler for SIGPIPE that is already used for SIGTERM, rather than ignoring it completely?
Indeed, when I said "ignore SIGPIPE" I meant specifically to not exit in response to it. Terraform could potentially take other actions, but the key thing is that if we continue executing after receiving that signal then we are likely to hit an EPIPE error on write shortly afterwards. That isn't possible with the current implementation, so we can't be sure how different parts of Terraform will treat it. In particular, we don't know whether it will always lead to a graceful exit or whether it will just cause a different kind of ungraceful exit, due to a new error situation that Terraform hasn't previously needed to deal with.
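One possible way to contain that new error situation, sketched here in Go purely as an illustration, is to wrap the output stream so that the first EPIPE triggers a shutdown callback and later writes are silently discarded. Individual writers then never see the error. The pipeGuard type, its onBroken callback, and the closedPipe helper are all hypothetical and do not come from Terraform's codebase.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"sync"
	"syscall"
)

// pipeGuard (hypothetical) wraps a writer. On the first EPIPE it calls
// onBroken once; from then on it discards output and reports success.
// onBroken runs with the lock held, so it must not write through the guard.
type pipeGuard struct {
	mu       sync.Mutex
	w        io.Writer
	onBroken func()
	broken   bool
}

func (g *pipeGuard) Write(p []byte) (int, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.broken {
		return len(p), nil // nobody is reading; drop the output
	}
	n, err := g.w.Write(p)
	if errors.Is(err, syscall.EPIPE) {
		g.broken = true
		g.onBroken()
		return len(p), nil
	}
	return n, err
}

// closedPipe returns the write end of a pipe whose read end is closed.
func closedPipe() (*os.File, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return nil, err
	}
	if err := r.Close(); err != nil {
		w.Close()
		return nil, err
	}
	return w, nil
}

func main() {
	w, err := closedPipe()
	if err != nil {
		fmt.Fprintln(os.Stderr, "pipe setup failed:", err)
		return
	}
	defer w.Close()

	out := &pipeGuard{w: w, onBroken: func() {
		// Assumption: a real program would request a graceful stop here.
		fmt.Fprintln(os.Stderr, "output pipe closed; requesting shutdown")
	}}
	fmt.Fprintln(out, "first write hits EPIPE")
	fmt.Fprintln(out, "second write is dropped quietly")
}
```

Swallowing the error is a design choice with a cost: output is lost without notice, so this only makes sense if the callback reliably leads to a clean exit.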