rack Add rack.response_finished to Rack::Lint

This updates Rack::Lint to validate that rack.response_finished is an array of callables when present in the env. e.g. procs, lambdas, or objects that respond to call.

This validates that:

rack.response_finished is an array
The contents of the array all respond to call

Part of https://github.com/rack/rack/issues/1777

Feb 04 '22 16:02 BlakeWilliams

Rack::Lint seems to validate that middleware implements the spec correctly but rack.response_finished is completely dependent on the server implementation. Is there a standard way to validate that a server implements specific behavior?

Feb 04 '22 17:02 BlakeWilliams

Rack::Lint seems to validate that middleware implements the spec correctly but rack.response_finished is completely dependent on the server implementation. Is there a standard way to validate that a server implements specific behavior?

No. Rack::Lint doesn't validate that a server handles responses correctly, it only validates inputs (env) and outputs ([status, headers, body]). This implementation looks fine to me. I think we should keep the rack.after_reply name, as Unicorn and Puma already implement it. If we are adding something to SPEC that is already supported and de facto standardized by multiple implementations, we should use the existing name, even if we think our name is better.

Mar 15 '22 20:03 jeremyevans

Is rack.after_reply compatible with the proposed spec?

Mar 15 '22 21:03 ioquatix

Is rack.after_reply compatible with the proposed spec?

Other than the key, it seems to be compatible:

Unicorn: env["rack.after_reply"].each(&:call)
Puma: after_reply.each { |o| o.call }

There is no advantage in adding a different name for behavior already de facto standardized in existing webservers.

Mar 16 '22 15:03 jeremyevans

@jeremyevans I understand your point.

However, since we are NOW formalising it, we could change the name. I don't think after_reply is very meaningful. Whether the value of using an existing defacto-standard outweighs the name, I'd personally vote for the latter. But I understand the former. It's just that I care more about the future than the past.

Mar 16 '22 16:03 ioquatix

Unless we can identify significant problems, I believe we should adopt existing de facto standards as already implemented. @tenderlove agrees (https://github.com/rack/rack/pull/1701#issuecomment-666525691):

If popular web servers all do it, then it's really the defacto spec. Rack's job is to define the interface, and if all web servers agree this is a fine interface then I think we're good to go?

Here the question is just about what env key to use, which is purely a bikeshed issue. The bikeshed is already painted, there is no point in painting it a different color, even if the different color looks nicer.

Mar 16 '22 16:03 jeremyevans

@jeremyevans that's not what @tenderlove said: https://github.com/rack/rack/issues/1777#issuecomment-1023768075

Mar 17 '22 00:03 ioquatix

I would like to add, we CAN validate that these are called by the server, by wrapping them - the same way we validate the body was either called or enumerated IIRC.

Mar 17 '22 00:03 ioquatix

@jeremyevans that's not what @tenderlove said: #1777 (comment)

What I quoted is what @tenderlove said in issue #1701. It is a general statement about how we should treat a de facto spec. It looks like what @tenderlove states in #1777 is in favor of this particular case, even though this case goes against his advice in #1701. I will leave it to @tenderlove to explain why the general statement does not apply to this particular case.

I would like to add, we CAN validate that these are called by the server, by wrapping them - the same way we validate the body was either called or enumerated IIRC.

I suppose we could do that, but I don't see the advantage. Lint is not about validating how servers handle responses, merely that the response provided to the server is in the correct format.

Even assuming we wanted Lint to do this, when could you actually validate them? You could use a timeout that fails if it isn't called. How long would such a timeout be? Would it start when the body was closed? Not sure how such a timeout can reliably raise an appropriate exception, and I think it's a fools errand to try such an approach

Mar 17 '22 00:03 jeremyevans

@jeremyevans I think these are all really good questions. Does the design being hard to test indicate a more fundamental issue?

All the issues you mention seem like reasonable things... like, when does this execute? After 1 year? When the server shuts down? Should we allow it to be deferred? Should it run on a background thread? Does this indicate design issues we need to address?

Apr 04 '22 22:04 ioquatix

All the issues you mention seem like reasonable things... like, when does this execute? After 1 year? When the server shuts down? Should we allow it to be deferred? Should it run on a background thread? Does this indicate design issues we need to address?

I can attempt to answer some of these questions in the spec. I'll try adding some details about how/when these callables should be called under rack.response_finished in SPEC.rdoc, but happy to add it elsewhere if there's a better place.

I'm also happy to revert the name back to rack.after_reply if that's what we want to do. There's a good amount of value keeping the old name since existing implementations will continue to work and servers won't have to go through a full deprecation process due to the name change. Happy to defer to y'all though.

Apr 05 '22 17:04 BlakeWilliams

@tenderlove I think your guidance is needed here to decide whether we should officially bless the de facto standard rack.after_reply (implemented by Puma and Unicorn), or standardize a different (perhaps more accurate) rack.response_finished. The semantics will be the same in both cases, so this just depends on whether the benefit of a potentially more accurate name outweighs the cost of forcing changes in Puma, Unicorn, and apps that currently use rack.after_reply with Puma or Unicorn.

Apr 05 '22 19:04 jeremyevans

I'm fine with either name, but my issues would be:

It means we are stuck adding a spec for whatever servers currently implement (and may be unwilling to change) so we don't have any flexibility to actually define what we want rather than what we already have (which may or may not be good). If we define "after_reply" we need to make sure linting is good enough to ensure they are compliant.
"response" is a more canonical HTTP terminology/concept than "reply" so I think it has a clearer meaning.
"response_finished" to me means feels like a more precise definition of "the point exactly after response has completed sending to the client". "after_reply" seems less well defined but maybe I'm just being too fussy.

The way I see it, this is a more efficient handler for Rack::BodyProxy. The typical usage I imagine would be things like closing database handles or indicating other resources can be freed.

It's not semantically more complex than what we have, but it exposes a 2nd way to achieve the same thing which users should be concerned about. Every env should have this rack.after_reply/rack.response_finished. That means any kind of internal/virtual request (e.g. an app which does several internal requests or maybe does a HEAD request to determine cachability) should add this and execute those callbacks as required... Or is it acceptable to share the rack.after_reply/rack.response_finished to internal requests?

Considering Rack::BodyProxy... in other words, is it acceptable to make a request and not read/close the body? I think it's probably unsafe. So we need to be really clear on how this should factor into the overall life cycle of Rack requests and the expectations of applications (if we make this a mandatory feature/requirement). It also obviously also impacts rack-test type gems which do mock requests.

Finally my last concern is one of responsibility. env provides keys with mutable values. env is considered "request" or "input" state. But we are using it to provide "output" state. It seems like a hack. Maybe it's impossible to fix with rack or maybe my interpretation/definition is wrong. But do we want to encourage this kind of design where env can be stateful. Rack hijack is an example of this and the original interface was pretty messed up since it actually updates env with rack.hijack_io. However in the case of virtual/internal requests, the env object might be a different hash instance than the one the server originally created, leading to confusion. The more we go down the path of env is the request "state", the more we encourage this kind of design, the harder it will be to do anything different. I'm not sure it's a problem, it might just be the state we are in, and I think it's worth commenting that by accepting this PR, we are encouraging more "output" state in "env" which I'm not entirely comfortable with from a complexity POV.

Apr 05 '22 21:04 ioquatix

I like the name rack.response_finished. It makes sense that the callbacks are an array. I'm not really in favor of being able to skip callbacks, and if a callback raises an exception I'd still like the rest of the callbacks to be called.

The usecase I think of is closing database connections after a request is finished. If some callback said everything should be skipped, it might mean leaked connections, and that wouldn't be a good thing.

Open ended question: do we want to do anything with the return value of the callbacks, and do we want to pass anything to them? Or should we just punt that for later?

@BlakeWilliams I think this is a great feature and I'd like to see it in Rack 3.0 (If you have the time of course!!)

Jun 25 '22 00:06 tenderlove

Open ended question: do we want to do anything with the return value of the callbacks, and do we want to pass anything to them?

I doubt there's much to be done with the return value, but passing env might allow some users to avoid needing a per-request allocation, instead of having to rely on closures? 🤔

Jun 25 '22 13:06 matthewd

I doubt there's much to be done with the return value, but passing env might allow some users to avoid needing a per-request allocation, instead of having to rely on closures? thinking

I think this is a good idea, but I recommend it be called with both env and the response array. We could even add that to Lint, checking that if it is called, those are what it is called with. Since that is different from how rack.after_reply works in Puma and Unicorn, I fully support using rack.response_finished as the key in that case.

Jun 25 '22 17:06 jeremyevans

I doubt there's much to be done with the return value

Ya, probably not. I guess I was thinking it might be of use if callbacks knew the status of previous ones. But it's probably best not to share that info.

I think this is a good idea, but I recommend it be called with both env and the response array. We could even add that to Lint, checking that if it is called, those are what it is called with. Since that is different from how rack.after_reply works in Puma and Unicorn, I fully support using rack.response_finished as the key in that case.

Passing the env hash makes sense. I like the idea of passing the response array because the callbacks would probably want to know the response code and possibly the headers. The only thing I'm worried about is passing a body which may have already been iterated over and is no good (but maybe that doesn't matter). I'd guess the callbacks wouldn't be concerned with what's inside the body, so it makes me wonder why pass it in the first place (besides that it's easy to implement).

Jun 25 '22 23:06 tenderlove

@BlakeWilliams I think this is a great feature and I'd like to see it in Rack 3.0 (If you have the time of course!!)

Rad, I'll make some time for it this week and try to get some changes pushed up shortly. I'm 👍 on passing the env to the handlers, especially now that we're explicitly diverging from the current implementations in puma/unicorn.

I had similar thoughts about passing body so I'll start by just passing env to the handlers, but happy to update the PR to pass the response array too if we decide that's the way we want to go.

Jun 27 '22 15:06 BlakeWilliams

I spent some time thinking about this proposal over the weekend. One part of this that I want us to consider is how virtual requests work. I've used these in the past and I won't be the only one.

Essentially, it looks like multiplexing within a Rack application - the ability to inject one or more new request into the rack middleware. These requests may append to the rack.response_finished callback array.

For such a reason, I believe this should be a non-optional feature of Rack 3 as it will make it easier for us to have a guarantee for users that such a thing exists which makes usage easier (e.g. in virtual requests). The other thing to consider is whether we default this to an empty array, or return it in the response headers, etc.

This also raises the question about arguments - whether env should be the original request env or the final request env. We should be clear about when these callbacks should be invoked and how they work in relation to other in-flight sub-requests. My current expectation is they should be called after the body is read, but we might want to say "Even if a request is cancelled, (i.e. the client terminates the request before the full response is written), the callback will still be invoked.

The reason why this matters is because multiple requests on the same thread, with ActiveRecord, adding multiple response_finished entries might cause chaos. So we should consider carefully how this fits together.

Jun 28 '22 00:06 ioquatix

I just pushed an update to the spec and linter to support ENV being passed as an argument, as well as explicitly supporting objects that respond to #call.

Essentially, it looks like multiplexing within a Rack application - the ability to inject one or more new request into the rack middleware. These requests may append to the rack.response_finished callback array.

What does this look like in a web server? Here's the non-multiplexed version might (very roughly) look like once implemented:

# unicorn code, but simplified a lot
def process_client(client)
  env = @request.read(client)
  env["rack.response_finished"] = []

  status, headers, body = @app.call(env)
  # handle closing body/client, etc
rescue => e
  handle_error(client, e)
ensure
  env["rack.response_finished"].map { |c| begin; c.call(env); rescue; end }
end

Looking at the above code and thinking about it in terms of multiplexing, I have a few questions:

Should virtual/multiplexed requests know they're multiplexed or be treated different? If so:
- Do they need a way to check if they're part of a virtual request or a real request?
- Do we need to pass the env["rack.response_finished"] value from one request to the next? (this has concurrency concerns)
- Is it reasonable to expect applications to require code changes to support multiplexing? (e.g. not duplicating rack.response_finished array entries like closing the database connection twice)

The concern about virtual requests/multiplexing is really valid, but I wonder if rack.response_finished is the correct abstraction for that specific case, or if there's another abstraction required, like rack.multiplexed_response_finished. I imagine that rack.response_finished being defined as "run once per env/@app.call" (with better terminology ofc) would still be valuable for both cases.

This also raises the question about arguments - whether env should be the original request env or the final request env.

I think some of the questions about would help inform us here, assuming this is the right abstraction for the multiplexing scenario. I could also see the spec defining it somewhat like "The callable will be passed the env hash it was originally defined on".

My current expectation is they should be called after the body is read, but we might want to say "Even if a request is cancelled, (i.e. the client terminates the request before the full response is written), the callback will still be invoked.

I think that's good to call out in the spec, I'll add that. 👍

Jun 28 '22 15:06 BlakeWilliams

Where we ran into this problem before is apps which create virtual requests, and then call rack.hijack and expect env['rack.hijack_io'] to be defined. But there is no way for the server to know what env should be updated, so in practice updating env was a terrible idea since env can be cloned/duped/completely different from what the server initially created.

e.g.

class MyWebServer
  def handle_request(client, request)
    env = make_rack_env(client, request)
    response = @app.call(env)
    
    env['rack.response_finished'].each do |callback|
      callback.call(env) # and maybe response
    end
  end
end

class MyApp
  def call(env)
    localizations.each do |locale|
      response = @app.call(make_fake_env(env, locale))
      if response[0] != 404
        return response
      end
    end
  end
end

This is a very simple example, but shows some hypothetical virtual requests. It seems like we need to either state: (1) Your callbacks might not be called or (2) A user is responsible for calling callbacks in some cases.

Not sure I really have any deep insight into the correct way forward here.

Jun 29 '22 03:06 ioquatix

I think this setup is safe, unlike the hijack_io one, because either:

the middleware has passed along the server-supplied response_finished array (and the server will handle any callbacks there added), or
the middleware has deliberately supplied its own response_finished (in which case it's taking on the obligation to comply with the optional spec: as a "server" [which it is acting as] it's optional that you define the key, but if you've defined the key, it's not optional that you process its contents), or
the middleware is supplying an env without any response_finished (in which case the app believes/knows it's running inside a server that doesn't support the feature, and will do its thing in-line via a BodyProxy or whatever*).

The unique trouble with hijack was that the server was trying to retroactively inject a new key into the env itself, which breaks as a mechanism the moment any middleware passes along an env that is not the exact Hash instance it originally received.

The one notable consequence of the above reasoning is that the application cannot be certain that a response_finished will be run before the next request occurs on the current thread/fiber/whatever: if a multiplexing middleware passes response_finished through to the outer server, then all subrequests' callbacks will occur together at the end of the main request. I think I would claim that is the better result, though. If something must occur at an exact time (such as AR connection cleanup) you should run it yourself; to me response_finished is a better fit for something that should happen (to the best of the server's ability, processes do sometimes die, etc) at some point in the near future when the client is not waiting. In the multiplex case, that last criteria is only satisfied after all the subrequests have been handled, so it follows that that's when all the callbacks will occur. (If something AR-cleanup-shaped was slow enough to want to defer, an application could choose to use in-line execution to detach the connection from the current thread, but then leave it in a separate pool to be fully slow-cleaned later by the callback.)

* as a follow-up to this spec change, Rack::BodyProxy should probably grow an API that means "run this block after the response, using response_finished if available but otherwise via BodyProxy"

Jun 29 '22 11:06 matthewd

@matthewd those are excellent points.

Jun 30 '22 00:06 ioquatix

I just pushed up a few more updates and think this is ready for another round of reviews.

I think most of the feedback has been addressed (unless I missed something, feel free to call it out!) but there's one more remaining question about naming here and wether or not to append _callbacks or some other word that helps clarify that it's an array of callables.

Jul 11 '22 15:07 BlakeWilliams

I was thinking about this a bit more and I do think we should make handling this field mandatory Rack 3 feature because it ultimately makes it much easier to deal with.

However, unlike rack.hijack or other metadata, I can't see why there is value having this as part of the request. Shouldn't it be part of the response headers?

What's the justification for having this as part of env and not part of response[1] (response headers). I guess before it's because we needed to signal the feature was available. If it's part of env, it means the array must always be allocated even if never used.

But if it's a required feature, does it make more sense to be a response metadata?

Jul 12 '22 03:07 ioquatix

Hmmm. Putting anything that's not an actual header value into the response headers feels like a pretty big / far-reaching change, both conceptually and in impact to anyone who might be reading that hash.

Jul 12 '22 06:07 matthewd

I was under the impression we already put rack specific metadata in the response headers. Let me check...

edit: Yeah, you can put rack.hijack into the response headers - a proc which is called to generate the output. So this is typical.

I'd be all for "response headers" are all HTTP headers, but that's something beyond the scope of this issue, in a way, we'd need to disallow rack.hijack (which I think is the only actual usage?).

Jul 12 '22 11:07 ioquatix

I'm going to manually merge this.

Jul 29 '22 08:07 ioquatix

@ioquatix I saw https://github.com/rack/rack/pull/1932 but was curious, how did we land on the name rack.closed over rack.response.finished?

Jul 29 '22 13:07 BlakeWilliams

I also don't understand why this was closed and #1932 was opened. As far as I can see, the name rack.closed was never discussed prior to this being closed. I think rack.after_reply, rack.response_finished, rack.response.finished, and rack.response.finished_callbacks are all better names than rack.closed. Are there any other differences between this pull request and #1932 other than the name?

Jul 29 '22 14:07 jeremyevans