plz.el icon indicating copy to clipboard operation
plz.el copied to clipboard

Support streaming response

Open r0man opened this issue 2 years ago • 70 comments

Hi @alphapapa and @ahyatt,

This changed adds support for streaming responses by adding a process filter and a callback function. The process filter is based on the default default process filter as described in the manual. It inserts the response into the process buffer. Additionally, if a callback function is provided by the during keyword, it is called every time a part of the response body is received.

I'm interested in writing a curl based provider for the llm library and saw the following issue here: https://github.com/alphapapa/plz.el/issues/42

Would you like to collaborate in resolving this issue?

Thanks, Roman.

r0man avatar Dec 03 '23 23:12 r0man

This is great, thank you for adding this!

ahyatt avatar Dec 04 '23 04:12 ahyatt

Hi @r0man,

This looks like a very good start. Two issues that I can think of quickly are:

  1. Have you done the FSF copyright assignment? plz is in GNU ELPA, so it's required for major contributions like this.
  2. There's a use case that I'm not sure this code would easily support, that of wanting to discard incoming data once a complete object of some kind (like a JSON entity) is received; i.e. it wouldn't be desirable for the object to remain in the process buffer after it's been processed by another function. IMO this is the hard part of this feature's design: we want to make it easy to follow that pattern without requiring the user to write a full process filter.

Thanks.

alphapapa avatar Dec 04 '23 14:12 alphapapa

Hi @alphapapa and @ahyatt,

  • I will extend the doc string more once we settled with a strategy.

  • I just started the FSF copyright assignment process.

  • Regarding discarding of already consumed data, yes this is not supported with the current code. It just leaves everything in the buffer, and there is nothing done to make sure a full JOSN object is passed to the callback, which I agree isn't great :)

What do you think of the following:

Streaming of any data format

To support streaming of any data format, I think one strategy could be to call the during callback with the partial data currently received and delete everything that has previously been in the buffer. Thus moving the job to construct the response to the client. Since we don't have the full response anymore, we could call the then callback with nil as an argument at the end of the stream, indicating the connection has been closed.

Streaming of SSE (server sent events)

There's a standard called SSE, which the OpenAI API uses for example. It is basically text separated by 2 newlines. We could write a process filter that works with this format. We pass complete lines to the during callback, clear the buffer when we receive 2 newlines, and at the end of the stream call the then callback indicating the connection has been closed. Server sent events don't have to be in JSON format, but we could support them by decoding the lines, maybe leveraging the existing :as option with json-read or something similar.

[1] https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format

Line based

The /stream endpoint of httpbin.org seems to be line based, instead of following the SSE standard. So maybe we should support this one as well?

Other data format

Do you have any other data format in mind that I did not mention?

Wdyt?

r0man avatar Dec 04 '23 19:12 r0man

Perhaps it might be possible to two new arguments: a callback (you already did this) plus a predicate that says when the callback is to be used (a message called with the latest information, returning non-nil if the callback should called).

ahyatt avatar Dec 05 '23 04:12 ahyatt

This is timely in a few ways: the hyperdrive.el project (presented at EmacsConf last Sunday) uses plz and has sponsored some of the work on it. One of the features we are working toward is to use a JSONRPC-style protocol, similar to the LSP one used in eglot. There are a number of different such protocols, like EventSource/SSE, JSONRPC, LSP-style JSONRPC, etc.

Notably, LSP uses Content-Length headers so the receiver can count the number of bytes received and know when a full object has been received and is ready to parse--this is different than standard JSONRPC, which required us to override some of the code in the jsonrpc library when testing. The JSONRPC server being used by hyperdrive does not (yet?) support such a header, and we want to avoid trying to parse incomplete JSON objects for performance reasons.

I've been discussing some of this with João Távora on emacs-devel. Over at hyperdrive, we're not yet sure what solution we're going to pursue. And as hyperdrive moves toward that, it will likely move away from using HTTP altogether.

But it would be good to keep all of these things in mind as we try to design a solution for plz to support streaming responses.

Given how many variables there are, I'd suggest that we not merge anything into master and make a release with such a feature until we've prototyped and tested it extensively. I'd like to avoid having to change the API later to fix design mistakes made in the beginning. Also, we should probably review similar network libraries in other FOSS to see what we can learn (e.g. Python, JS, etc). Then we can decide on the best approach for Elisp and how to code it for Emacs's internals.

Ideally, I think we would have test cases to support each of the three formats we've mentioned: EventSource/SSE, JSONRPC, and LSP-style JSONRPC (perhaps also considering the JSON Lines format, which has some overlap). Then we could more easily see what the code would look like and try to make it easy to use.

@r0man Since your interest lies with llm, maybe you could write a test in llm that uses this in-development feature of plz that could help us learn how to best design it. Then after you get the FSF CA done, maybe a similar test could be added to plz's own tests.

Thanks to all for your interest in helping with this feature.

alphapapa avatar Dec 05 '23 16:12 alphapapa

If this could be a new branch, that would help. I could also experiment with using plz in llm, as presumably @r0man could also do their separate experimentation. This is something I probably will have time to play around with later this month.

ahyatt avatar Dec 07 '23 06:12 ahyatt

@alphapapa and @ahyatt Sounds good to me. I can take a look at the llm library and how I could use plz with it. In fact I already did some experiments on using plz in llm. I basically added a llm-request-plz.el file, that has the same interface as the Emacs native llm-request.el. I just stopped when I realized that I need streaming support for it.

@alphapapa What's not clear to me from your message is, should I only focus on the llm stuff and you are looking into the streaming support in plz. Or do you want me to continue on plz.el with what I described above?

r0man avatar Dec 07 '23 12:12 r0man

@r0man As long as the FSF copyright assignment is or eventually gets signed by you, I'm happy for you to experiment with code and APIs to add streaming support in your branches of plz (otherwise I couldn't use the code, and it could be awkward to develop alternative code that does the same thing). I think what is needed here is a somewhat broad experimental phase so we better understand the problems and the tradeoffs of various solutions.

alphapapa avatar Dec 07 '23 17:12 alphapapa

Hi @alphapapa and @ahyatt,

priorities changed a bit and I might (nothing promised, priorities change quickly) look into this again. I'm also nearing the completion of the FSF copyright assignment process.

I was still thinking howto extend plz.el to support streaming. One suspicion I have is that it involves a process filter and the handling of the response needs to be tweaked. While thinking about this, looking around the code, and my previous experience working on clj-http and cljs-http I would suggest the following:

Right now the user specifies a function in plz.el which parses the response. This is up-front. If you specify you want JSON via passing in the json-parse function, the response is parsed as JSON. Regardless if it actually is JSON. In an ideal world if you send an Accept header ask for application/json you should also get it back. But I also have seen many cases where this breaks along the way. In the OpenAI API which I would like to use, the response format is triggered by a parameter in the request. If you set stream to true in the body of the request, it sends a response in the text/event-stream format, otherwise in application/json. So you need to coordinated this in 2 places (if you even send an accept header).

An Accpet header can also contain multiple mime times with different priorities and the server might do continent negotiation, and you might not know what you get in advance.

So, my proposal would be to change the response handling code to be more flexible, and actually look at the Content-Type header when parsing the response. The idea is we have a process filter that reads the status code, the headers, and then decides which function to call (maybe multiple times in the case of streaming responses) depending on the Content-Type. If the user passes in a function we use that one, otherwise we look up an handler in an alist from content type to handler function. plz.el could provide a default content-type to handler alist, which could be overridden by the user. All this with keeping the current interface intact.

Another thought I had was about middleware. I'm not sure if you are familiar with how clj-http works, but they have this concept of middleware. There is a request map and there is a response map, and both are getting passed through mutiple middleware functions that have all the same signature. Each middleware can modify the request before actually sending it, and also modify the response before it actually gets passed back to the user. There is a middleware to encode input, another to decode output, yet another that handles compression and so on. At the end all the middleware gets composed together and you get a "batteries included" HTTP client. You can see an example of it here:

https://github.com/dakrone/clj-http/blob/3.x/src/clj_http/client.clj#L1125

The nice thing about this design is that you can mix and match those middleware functions. If the default client does not fit your needs you can build your own one with the middleware you need and maybe add additional.

Maybe that ship has already sailed, but I wanted to mention this anyway.

Wdyt? (mostly about the first part)

r0man avatar Feb 22 '24 08:02 r0man

I'm super happy that you can hopefully devote more time on this. I like the ideas in the clj-http client - such things could be useful to implement lower-level logging.

I agree with @alphapapa's proposed approach above: implement something you think is reasonable - and I'll try it out as a backend for the llm package and let you know what I think.

ahyatt avatar Feb 22 '24 16:02 ahyatt

Hi @r0man,

priorities changed a bit and I might (nothing promised, priorities change quickly) look into this again. I'm also nearing the completion of the FSF copyright assignment process.

No problem. Thanks for the update.

So, my proposal would be to change the response handling code to be more flexible, and actually look at the Content-Type header when parsing the response. The idea is we have a process filter that reads the status code, the headers, and then decides which function to call (maybe multiple times in the case of streaming responses) depending on the Content-Type. If the user passes in a function we use that one, otherwise we look up an handler in an alist from content type to handler function. plz.el could provide a default content-type to handler alist, which could be overridden by the user. All this with keeping the current interface intact.

For non-streaming responses, this should already be doable by using :as 'response, and then the user code can do whatever it likes with regard to examining the response and handling its contents.

For streaming ones, yes, it will be necessary to use a bespoke process filter.

I'd like to keep the API unchanged as much as possible, so my suggestion would be to start by copying the plz function to one named plz-stream, and then experiment with the code and APIs to make it work usefully. Once that's done and seems to work reliably, we could consider whether its features could be integrated into the main plz function (e.g. using an :as 'stream argument, maybe with :through FILTER-FN, or :as '(stream :through FILTER-FN), or something like that).

Thanks for the information about clj-http and its middleware API. That sounds like a flexible design. I'm not sure that I'd want to change plz to work that way, as it seems likely to be overkill for our needs. As well, user applications should be able to build such a layer on top of plz if necessary.

To that end, if an application did have a need like that, I'd suggest that we consider implementing a layer on top of plz that would work with a plz-request struct that could be passed through a chain of modifying functions and finally to plz itself. Or it might not even need a new struct; maybe just a list of the args applied to plz would work, ala plz-queue.

@ahyatt As you may have noticed, I've so far tried not to commit much logging code to plz; whenever I've used some during debugging, I've removed it before committing. That's mainly to try to keep the code as clean as possible. But if you encounter some situations where you need better logging support, please let me know and maybe we can come up with a good general solution.

Let me know what you think. Thanks.

alphapapa avatar Feb 23 '24 02:02 alphapapa

@alphapapa Alright, thanks for your answer. I will give your suggestion a try.

r0man avatar Feb 23 '24 10:02 r0man

Hi @alphapapa and @ahyatt,

I updated this PR. This is what changed:

  • I kept the API mostly the same, I removed the DURING callback which I initially introduced, I think we don't need it.
  • I added the STREAM keyword parameter. When non-nil, a process filter is added. This filter adds a chunk of the response to the process buffer, calls the existing :plz-then handler, and deletes the chunk from the buffer again. This is done repeatedly until the request is complete.
  • The THEN callback is called multiple times with the current chunk of the response (neither complete, nor cumulative, and no guarantee that it's a complete line, or even a valid JSON object).
  • The type of object passed to THEN depends on the AS parameter, it might be a chunk of the body of the request, the chunk in the body slot of a plz-response struct, etc.
  • As usual, the ELSE and FINALLY callbacks are called for errors, and when the request is complete.
  • I haven't done much about when AS is set to file, since the current behavior seems to be that it is an error if the file already exists. We could append the chunks to the file if STREAM is non-nil?

I think this is the least invasive change I could come up with.

@ahyatt is this something that would work in the llm library? I have a version of the llm-request.el file that uses plz.el for my custom provider I use at work and I believe it would be sufficient. We still would need to interpret the server sent event protocol by piecing together the chunks there though.

Yesterday I was also experimenting with some approaches that would add support for server sent events somehow into plz.el. Basically what @alphapapa suggested in his first message as the second point/issue. I could not come up with something satisfying yet. There are multiple places (the process filter, the :plz-then callback) that would need to be kept in sync, customized and play well together. The protocols are also quite different as I understand now. Server sent events for example are not "just" line based JSON blobs, but lines with prefix, then a JSON blob. JSON RPC is different as well. I' starting to think the code parsing all those different protocols would be too complex to bake it into plz.el, but instead should be built on top of plz.el, as separate libraries.

Wdyt?

r0man avatar Feb 24 '24 12:02 r0man

@r0man That looks pretty good in general. I'm really glad to see the tests included. A few suggestions:

  1. I think we should try to keep the THEN function as the callback for when the request completes successfully. That would likely still be needed for streaming responses, and it would avoid changing the meaning of that argument.
  2. And instead of :stream t, I'd suggest we use the AS argument, e.g. :as '(stream FILTER-FN) (or possibly :as '(stream :through FILTER-FN), which would potentially allow other options to be included in the future). And in that case, the THEN function would be documented to be called without arguments after the request completes.

Thanks for working on this.

alphapapa avatar Feb 25 '24 07:02 alphapapa

Hi @alphapapa,

thanks for your suggestions. Before I pushed this, I actually used the :as keyword as you suggested, but then changed it back, because I realized that I might still want the same functionality for streaming which is controlled by :as. For example, receive the chunks in a plz-response struct instead of just the body, so I can take a look at the headers, influence the encoding via :as :binary. That were at least the use cases I had in mind. I'm open to change it, but I have some doubts/questions:

Regarding your suggestions:

  1. Are we really changing the meaning? Right now :then is called on a successful response, just multiple times with a chunk instead of the full response. On errors the :else callback is called (as usual) and when the request completes the :finally callback is calls (also as usual). In 2. you say we should document the THEN callback to be called without arguments when the request completes. Aren't we changing the meaning with this? If we do so, what is then the difference between the :then and the :finally callback? Wouldn't both be called with no argument when the request completes? To me that seems a bit confusing, but maybe I miss something?

  2. What would FILTER-FN be here? Do we ask the user to pass in a process filter, like the one I use (plz--stream-process-filter)? Since you called it FILTER-FN I assume you mean a process filter, so should there be a default one? Still assuming FILTER-FN is a proccess filter, do you have a suggestion/preference (name and signature) for the callback function that receives the chunks? And what should the chunks be (just the chunk itself, or wrapped in a plz-response struct, so the user has access to the headers?

Your suggestions raised those questions, which I belive are kind of solved in the current design. I took here an example of one of my tests. This is how my current suggestion could be used:

(plz 'get (url "/stream/100000")
  :as 'response
  :stream t
  :else (lambda (error)
          (message "Request failed: %s" error))
  :finally (lambda ()
             (message "Request complete"))
  :then (lambda (response)
          (let ((status (plz-response-status response))
                (headers (plz-response-headers response))
                (chunk (plz-response-body response)))
            (message "Partial response: %s %s %s" status headers chunk))))

Could you please show me this example with your suggestions applied, please? I would be interested in how the user could inspect the headers before/while consuming chunks, and how you imagine the callbacks. I think that would help me implementing your suggestions.

Thanks

r0man avatar Feb 25 '24 08:02 r0man

Hi @r0man,

Well, we shouldn't be making a response object and testing its properties for every chunk received. If the response code or headers need to be tested, that should be done once, before processing the first chunk; and then the filter function should be called for each chunk, and its job should be only to handle the chunk's data.

The difference between THEN and FINALLY is that THEN is called only for a successful request, while FINALLY is called regardless of whether a request is successful.

I don't think the THEN function must necessarily be called with an argument; it already can be called with different values, depending on the AS argument. It would seem to me to make sense to call it without an argument for a streaming response, because the response body is to be handled by the filter function. (If the user wants to assemble the data from a stream and finally do something with it, that can be handled by making the THEN function a closure, etc.)

That leaves us with the question of how to inspect the response object for a streaming request when necessary. So, building on what I just wrote, and changing it slightly, I'd suggest this: for a streaming request, the THEN function should be called with a response object after receiving the headers. I think it should be possible for that function to then cancel the request if necessary, e.g. by deleting the request process. After the THEN function returns, the filter function should start being called for each chunk. And after the request completes, the FINALLY function should be called.

As for the filter function: I don't think we need to provide a default one (i.e. other than Emacs's own default), because the point of making a streaming request is for the user to be able to handle the data chunks with a custom filter.

I would suggest that we consider offering an API whereby the filter function is called in the process buffer with it narrowed to the current chunk of data; as opposed to the default of calling the filter with the data as a string, that could potentially avoid allocating a string just to be thrown away after the user does something with it. It could also potentially allow us to leave the whole response body intact in the buffer while still offering streaming chunks to a filter, if the user so desired.

In terms of your example, I'm imagining something like this:

(plz 'get (url "/stream/100000")
  :as `(stream
        :through ,(lambda ()
                    (message "Chunk received:%S" (buffer-string))))
  :then (lambda (response)
          ;; TODO: Add `process' slot to `response' struct.
          (pcase-let (((plz-response status headers process) response))
            (pcase status
              (200 nil)
              (_ ;; Cancel the request for some reason (which will
                 ;; trigger the ELSE function).
               (delete-process process)))))
  :else (lambda (error)
          (message "Request failed: %s" error))
  :finally (lambda ()
             (message "Request complete")))

What do you think? Thanks.

alphapapa avatar Feb 25 '24 08:02 alphapapa

Hi @alphapapa,

thanks again for your reply. This example clarifies my open questions. Let's call the function we pass to :through CHUNK-FN. Because FILTER-FN might indicate it's a process filter (at least that was what I was thinking). In your example I believe we still install a process filter, which is similar to what I do in plz--stream-process-filter. And the job of the process filter is to insert a status, headers and chunks into the process buffer, call the THEN function when the headers are there, and then continuously call the CHUNK-FN (which then can grab the chunk via buffer-string from the narrowed process buffer.

Leaves one last open question. Who is going to delete the chunks? The process filter after calling the CHUNK-FN, or the user provided CHUNK-FN itself? I would lean towards the CHUNK-FN, since it may accumulated chunks until it can parse a full object anyway, and it only is a (delete-region (point-min) (point-max)) away?

Apart from the last question I think I now have a plan.

Thanks

r0man avatar Feb 25 '24 09:02 r0man

Hi @r0man,

My apologies, I forgot that the process filter is already called with the data as a string, and that it's responsible for inserting the chunks into the process buffer. Also, you raise a good point about the filter's needing to insert the response headers. (It's getting late here, so I should probably let this be my last response for the day.)

So, thinking some more, what I'd suggest is that we use a filter function that works basically like this:

  1. Insert the headers until the end of headers is detected.
  2. Make a response object for those headers and call the THEN function.
  3. Assuming that the THEN function returns without deleting the process or signaling an error, set the process's filter function to the user-supplied filter function, and call it with any existing response body data.
  4. Emacs should then continue calling the user-supplied filter function for the rest of the response data as it arrives.

Let me know what you think. Thanks.

alphapapa avatar Feb 25 '24 09:02 alphapapa

Hmm,

so this is what the Emacs manual says about filter functions [1]. At a minimum it should do what the following ordinary-insertion-filter function does:

(defun ordinary-insertion-filter (proc string)
  (when (buffer-live-p (process-buffer proc))
    (with-current-buffer (process-buffer proc)
      (let ((moving (= (point) (process-mark proc))))
        (save-excursion
          ;; Insert the text, advancing the process marker.
          (goto-char (process-mark proc))
          (insert string)
          (set-marker (process-mark proc) (point)))
        (if moving (goto-char (process-mark proc)))))))

If we start with an initial filter function that inserts the status and headers, and then set the process filter to the CHUNK-FN function passed via the :through argument, we also need to change it's signature to take the process and string parameters. Not a big deal, but this function then also needs to handle the machinery of what ordinary-insertion-filter does (in case you want something to be inserted in the buffer). I think the function becomes more complicated fort the end user, given that ordinary-insertion-filter is not even defined.

I believe our example would then change to:

(plz 'get (url "/stream/100000")
  :as `(stream
        :through ,(lambda (process string)
                    ;; Do what a process filter is expected to do.
                    (ordinary-insertion-filter process string)
                    ;; Handle the chunk:
                    ;; - Option #1: Handle the chunk (passed as string to the function.
                    (message "Chunk received:%s" string)
                    ;; - Option #2: Switch to the process buffer and grab the string.
                    (with-current-buffer (process-buffer proc)
                      (message "Chunk received:%s" (buffer-string)))))
  :then (lambda (response)
          ;; TODO: Add `process' slot to `response' struct.
          (pcase-let (((plz-response status headers process) response))
            (pcase status
              (200 nil)
              (_ ;; Cancel the request for some reason (which will
                 ;; trigger the ELSE function).
               (delete-process process)))))
  :else (lambda (error)
          (message "Request failed: %s" error))
  :finally (lambda ()
             (message "Request complete")))

I think the user provided CHUNK-FN having access to the process is actually good, because it can then also cancel the request whenever a chunk gets received, and not only via the THEN function (which is called only once after the headers arrived). I'm a bit worried about the additional complexity of the CHUNK-FN function being a full process filter function. I'm not entirely sure that the user provided function needs to do the everything the ordinary-insertion-filter function does, so maybe it can be simplified into checking if the buffer is alive, switching to it and just inserting the string. Another worry I have is if changing the process filter inbetween has any consequences.

I'm going to experiment with this in another branch, then we can see how it goes, and talk about it tomorrow.

Thanks again for your help!

[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Filter-Functions.html

r0man avatar Feb 25 '24 10:02 r0man

Quickly, I don't think it's necessary to do what "ordinary-insertion-filter" does. The manual says, as you linked it, "A filter function must accept two arguments: the associated process and a string, which is output just received from it. The function is then free to do whatever it chooses with the output." Presumably, the user would want to do something else with the chunks, not insert them into the process's buffer as the default filter does, so we should not force that to happen ourselves. If the user wants to, e.g. inspect the chunk and conditionally insert it into the process's buffer, then the user is responsible for reading that page in the manual and writing the bespoke filter function according to that example.

Ok, that's really all from me for tonight. :)

alphapapa avatar Feb 25 '24 10:02 alphapapa

Hi @alphapapa, I pushed a v2 with your suggestions. Please let me know what you think!

r0man avatar Feb 25 '24 16:02 r0man

Hi @r0man , thank you for this change! I tried this out today, and is definitely an improvement to llm's old code using the url library.

I think the design is a bit confusing, though - the difference between the function in the :as (stream (lambda ...)) and :then seems strange to me. Which is called first? Why two functions when it seems like one would do? Plus, for me, I need to give the client a final callback, and for that I need the process buffer. However, the :finally gets a function that takes no arguments; I'd instead like a way to get the process, because I need to call those callbacks in the process buffer. I'm not sure how to do that yet.

ahyatt avatar Feb 25 '24 22:02 ahyatt

Hi Andrew,

I think the design is a bit confusing, though - the difference between the function in the :as (stream (lambda ...)) and :then seems strange to me. Which is called first? Why two functions when it seems like one would do? Plus, for me, I need to give the client a final callback, and for that I need the process buffer. However, the :finally gets a function that takes no arguments; I'd instead like a way to get the process, because I need to call those callbacks in the process buffer. I'm not sure how to do that yet.

The design is still up for review, so thanks for your comments on it. Right now, the THEN function is called with the initial response object so the status code and headers can be inspected by you before processing the streaming response with the filter (THROUGH) function. The response object now has a process slot, from which you can get the buffer.

Can you explain more about your use case? Why do you need to stream the data through a filter and also have access to the process buffer when the streaming is done?

alphapapa avatar Feb 26 '24 02:02 alphapapa

@alphapapa thanks for explaining - it clears things up, but it still seems like the relationship of these two parts is confusing. AS is supposed to be how you call THEN, so it'd be clearer if THEN were called multiple times for streaming responses and we could then say :as streaming.

As to why I want access to the process buffer when the streaming is done, in the llm library we have a streaming callback and a final callback. The final callback in many cases needs to parse out the response, and I use buffer local variables to hold where we last parsed to and the previous contents, so I don't have to re-parse things on every part of the streaming response. So when the request is done in plz, finishing the processing with the buffer available would be ideal. Otherwise I have to re-parse everything (which is possible, but I'd like to avoid if I can).

ahyatt avatar Feb 26 '24 04:02 ahyatt

@ahyatt

thanks for explaining - it clears things up, but it still seems like the relationship of these two parts is confusing. AS is supposed to be how you call THEN, so it'd be clearer if THEN were called multiple times for streaming responses and we could then say :as streaming.

If that were so, then:

  1. How would we specify a function to be called after the requests completes successfully? The THEN function serves that purpose now.
  2. How would the user inspect the response headers at the beginning of the response?
  3. What would be the arguments to the THEN function? Would it also be the process filter?

alphapapa avatar Feb 26 '24 06:02 alphapapa

  1. How would we specify a function to be called after the requests completes successfully? The THEN function serves that purpose now.

That's the issue, with the current implementation the THEN is called as the first thing that happens, not the last. One solution is to have THEN be called repeatedly when receiving streaming information, with the response code. The response body should have the complete body but the delta can also be passed in.

  1. How would the user inspect the response headers at the beginning of the response?

In my proposal, they would always just have code that looks at the response header and processes it if it is in the 200 range, etc. The check would just be doing the same thing every single time, which is a disadvantage, but not a huge one IMHO.

  1. What would be the arguments to the THEN function? Would it also be the process filter?

Yeah, that's what I'm thinking. So (response new-content) would work, although it'd also be OK to have plz-response to have a slot new-body-content or something like that, so you wouldn't have to have an extra argument for the streaming case.

ahyatt avatar Feb 26 '24 15:02 ahyatt

Hi @ahyatt and @alphapapa,

@ahyatt, have you seen my initial proposal from this week where I used the :stream t flag, and controlled what the THEN function receives via :as ? You can still find it here:

https://github.com/alphapapa/plz.el/pull/43/commits/a17074f2ca6d891fb744f2aa103efd109a1adde4

I believe this is closer to what you are asking for, but it still has some of the flaws we see here.

Regarding @alphapapa's question, I would answer them for my initial proposal with:

  1. How would we specify a function to be called after the requests complete successfully? The THEN function serves that purpose now.

With my initial proposal, where the THEN callback is invoked multiple times, you don't know when it is the first/last time. The same applies to the current implementation using :as (stream :through PROCESS-FILTER) for the PROCESS-FILTER function. A function that is called mutliple times without an indicator if it is the first/last time is not a good fit for flow control, like calling a function to continue on success, or another on failure. And I think flow control is what we are after with this question.

Since the function that is called multiple times is not suitable for flow control, we are left with ELSE and FINALLY. ELSE already works as expected for the error case I think. In both implementations and we can't use ELSE to answer @alphapapa's question here.

Leaves us with FINALLY. In an ideal (or rather my) worldm I would prefer FINALLY to receive one argument, either the response object or a plz-error struct. If you "finally" want to decide what to do next, having a response struct or an error object at hand would be a really good thing to have to make that decision. I think this would solve flaws in both implementations at hand. So:

  • Could we come to that (my) ideal world?
  • Can we get there without breaking existing code?
  • Can we check in Emacs Lisp the arity of a function before calling it? That way we might be able to introduce it?
  • Alternativly, can we add a new callback handler that does take one argument, response or error, and is called exactly once when the request is complete?

@alphapapa do you know why FINALLY does not take such an argument? I guess it wasn't needed back then.

  1. How would the user inspect the response headers at the beginning of the response?

In my initial implementation you could do this when the THEN callback is invoked, but you don't know if it is the first/last time. I have the impression this use case is also not as common as the "Did the request complete successfully or not?" and could be worked around somehow.

One of the reasons to look at the response headers is to decode the HTTP body based on them. Because of that it would be nice to have them close to where you process BODY. We don't have this in the current implementation. The HTTP headersare available in THEN, the body chunks you process in the PROCESS-FILTER, but the filter does not have access to the headers. With the initial implementation you have both, headers and a chunk of the body in THEN (if you set the :as to 'response) and could make decisions based on it.

  1. What would be the arguments to the THEN function? Would it also be the process filter?

In my initial implmentation this depends on the :as argument, it could be the current chunk as a string, or the current chunk in the body slot of a response struct (if :as 'response is used). Is isn't a process filter, and the user does not have to deal with them.

My proposal would be:

  • Add argument to FINALLY for the response/error, or if this can't be done in a backwards compatible way, add new callback. This would solve the problem of flow control.
  • Use THEN as the chunk function (called multiple times) and make it possible to get at the response (status, headers, current chunk).
  • Decide if how we trigger streaming. I don't really have a preference for :as or :stream.

Wdyt?

r0man avatar Feb 26 '24 19:02 r0man

Hi @r0man,

  • Add argument to FINALLY for the response/error, or if this can't be done in a backwards compatible way, add new callback. This would solve the problem of flow control.

ISTM that flow control is already solved: THEN is called upon success, and ELSE upon failure. FINALLY is intended as a convenience allowing the user to easily do something that's needed regardless of success or failure, which is why it takes no argument: it's called regardless. If the user wants to use FINALLY to do something more sophisticated, that should be handled easily enough by making it a closure and setting a variable in it, e.g. from THEN or ELSE (i.e. making them closures over the same variable). For example:

(let* ((var)
       (then (lambda ()
               (message "THEN: %S" var)))
       (else (lambda ()
               (message "ELSE: %S" var)
               (setf var 'ELSE)
               (message "ELSE: %S" var)))
       (finally (lambda ()
                  (message "FINALLY: %S" var))))
  (funcall then)
  (funcall else)
  (funcall finally))
THEN: nil
ELSE: nil
ELSE: ELSE
FINALLY: ELSE
  • Use THEN as the chunk function (called multiple times) and make it possible to get at the response (status, headers, current chunk).

I don't want to do this because 1) it would change the semantics of THEN, to call it multiple times; 2) it would force parsing of the response, which may be unnecessary; 3) it would force us to either re-parse the response for every chunk, which would be very wasteful (especially if trying to keep up with a fast stream of data), or to cache it even if it's not needed; 4) it would complicate the signature of the THEN function.

  • Decide if how we trigger streaming. I don't really have a preference for :as or :stream.

I'm generally in favor of avoiding additional arguments when existing ones can be repurposed. For example, :as '(file FILENAME) rather than something like :as 'file :filename FILENAME. You can see this same pattern in call-process:

If DESTINATION is ‘(:file FILE)’, where FILE is a file name string,
 it means that output should be written to that file (if the file
 already exists it is overwritten).

I think it's important here to not conflate the THEN callback and the process filter for a streaming download. They are two different things, and we shouldn't try to make one the other. That will be bound to cause confusion for users.

I think the best solution is to leave THEN's semantics as-is. To get a streaming download, pass :as (stream :through FILTER).

If the response headers need to be examined before processing the first chunk, maybe we should handle that differently, with a separate function (name TBD). The API could be something like:

:as `((response :to ONE-TIME-PRE-FILTER-FUNCTION) (stream :through FILTER))

Which would mean, "Pass the response struct to ONE-TIME-PRE-FILTER-FUNCTION, which is called in the process buffer (which can inspect the response, set variables, delete the process to cancel the request, etc). After it returns, if the process hasn't been deleted (warning: potential race condition), begin calling the process filter for the streaming chunks of data."

Doing that would mean having to change our argument handling for AS a bit, but it would leave the signature the same and keep the same semantics for THEN/ELSE/FINALLY across streaming and non-streaming requests.

What do you think? Thanks.

alphapapa avatar Feb 27 '24 07:02 alphapapa

Hi @alphapapa,

thanks for you answer and the code snippet with the var shared across the handlers. That's what I also had in mind, as the solution to get access to the headers in the callback that handles the chunk. I think it's a bit tedious though. To your other points about the THEN callbacks:

  1. I think we already changed the semantics of THEN to something unexpected, @ahyatt "complained" it was called BEFORE the chunk function and not AFTER it.

  2. Why would it force parsing the response or which part of it? I think we have to parse status and headers always, right? Or at least the status. The installed process filter gets called with the PROCESS and a STRING (a chunk). The first (few) chunk(s) will contain status and header and mabye a bit of the body. When called the procces filter inserts those chunks into the process buffer. But only until we received the status and header, and for this we have to parse the status and the headers, and maybe a small part of the body. We could construct a plz-response struct, set the status and headers, and set the body to the first chunk of the body, store this as a process property and call the CHUNK-FN. From then on we could just set the body of the stored response struct to the new chunk and call the user provided CHUNK-FN. And this should be fast, get a process property, set a slot, call a handler. The process filter does not insert anyhing from then on, nor does it parse anything. This is what the process function in the current implementation does, more or less. So are you concerned that we have to parse a small portion of the body, in the case it arrives with the status and the headers?

  3. I think I don't follow here. With what I oulined above in point 2, we don't really parse anything when a new chunk arrives. The process filter takes the STRING chunk it received as the 2nd argument, gets the response struct from the process properties, sets the body slot to the new chunk and calls the user provided CHUNK-FN. Is this response struct stored in the process property the cache you are concerned about?

  4. Why would it complicate the siganture? It would be as complicated as it already is (it depends on the :as parameter). The argument to it can be a string, a buffer, a response struct.

This is how I think about it, and as a plz.el user, the behaviour I outlined would feel more natural/practical to me. I don't want to push this though, people have different expectations.

So if @ahyatt thinks it is useful as well, my next step would be to change the current implementation to:

  • call a ONE-TIME-PRE-FILTER-FUNCTION when provided, via :as
  • change the THEN callback to be called AFTER the request is done successfully, with a response object that contains only the status and header, since we don't want to accumulate a possibly large body.

Wdyt?

r0man avatar Feb 27 '24 10:02 r0man

Hi @r0man and @ahyatt,

One of the principles I've tried to follow with plz is to not do unnecessary work. So, e.g. we don't parse and make a plz-response struct unless requested or an error occurs. I can imagine that there would be cases where the user wants a streaming response and doesn't care about any of the headers, so for us to parse them and make a response struct would be wasteful (not extremely so, and maybe it wouldn't matter, but I'm trying to follow the principle).

Another principle is, roughly, to make common things simple and less common things possible (or something like that; I've seen it worded in different ways). So the plz API doesn't try to provide built-in ways to do everything; it tries to make the common cases very simple and natural to express, while providing enough flexibility for users with more specific needs to do so using the same framework. So while it might seem "tedious" to bind a variable and close over it in your THEN/ELSE/FINALLY callbacks, it's really a simple, idiomatic way to do it in Lisp. And it saves plz from having to implement features that aren't always needed by users, which keeps our code simpler.

These principles are important to me in plz's development. It's not intended to be made for specific downstream needs, but to be simple, efficient, and flexible. Streaming responses are general enough to be worth supporting, and specific enough to probably merit implementation in plz itself. However, as I think our discussion has shown, they could already be supported on top of plz's existing API (or, at least, with a small change to allow setting the process filter--but even now it could probably be hacked to do that after plz returns the process object); our friendly haggling is about how to make the API seem nice.

I'm glad that both of you are involved in this discussion; it's important that users be involved in the development of features like this. But it seems like we each have our own ideas about what we'd like the API to be like, because we're each approaching the issue from different perspectives.

So at this point I think I will fall back on my earlier suggestion: rather than integrate streaming into the plz function's API now, and spend time trying to integrate our perspectives into the existing API, it would be easier to make a plz-stream function that builds on top of it and allows experimenting with various APIs. Once we've arrived at what seems like a good solution, it would then probably be easier to decide how to integrate it into the plz function (or perhaps even defer that decision; maybe it's specific enough of a need that a plz-stream function would be okay to have).

Does this make sense to you? Thanks for your work and your feedback.

alphapapa avatar Feb 27 '24 12:02 alphapapa