langchain
safely interrupt streaming requests
Enables raising an error from a callback handler to interrupt streaming requests.
Handles cleanup of the underlying HTTP request.
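As a rough illustration of the idea, and not the actual code in this change, the sketch below shows how an exception raised inside a per-token callback can stop a streaming loop while a `finally` block still releases the underlying response. `FakeResponse`, `stream_tokens`, and `stop_after_two` are hypothetical names invented for this example; a real HTTP client would expose its own streaming and close methods.

```python
from typing import Callable, Iterator, List


class FakeResponse:
    """Hypothetical stand-in for a streaming HTTP response."""

    def __init__(self, tokens: List[str]) -> None:
        self._tokens = tokens
        self.closed = False

    def __iter__(self) -> Iterator[str]:
        return iter(self._tokens)

    def close(self) -> None:
        # A real client would release the connection here.
        self.closed = True


def stream_tokens(response: FakeResponse, on_new_token: Callable[[str], None]) -> None:
    """Pass each token to the callback; close the response however the loop ends."""
    try:
        for token in response:
            on_new_token(token)  # may raise to interrupt the stream
    finally:
        response.close()


received: List[str] = []


def stop_after_two(token: str) -> None:
    # Hypothetical callback: accept two tokens, then ask to stop.
    if len(received) >= 2:
        raise EOFError("Request to interrupt streaming")
    received.append(token)


response = FakeResponse(["a", "b", "c", "d"])
try:
    stream_tokens(response, stop_after_two)
except EOFError:
    pass  # the interruption propagates to the caller, as intended
```

The exception still reaches the caller, so the caller decides what to do with any partial output; the `finally` block only guarantees that the response is closed first.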
Here's an example usage of a callback handler that I've tested as working:
    from typing import Any

    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


    class VerboseStreamingStdOutCallbackHandler(StreamingStdOutCallbackHandler):
        @property
        def always_verbose(self) -> bool:
            """Whether to call verbose callbacks even if verbose is False."""
            return True


    def make_interrupt_streaming_callback_handler(backend):
        # `backend` must expose a boolean `streaming` flag and a `log` logger.
        class InterruptStreamingCallbackHandler(VerboseStreamingStdOutCallbackHandler):
            def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
                # Raising here aborts the in-flight streaming request.
                if not backend.streaming:
                    message = "Request to interrupt streaming"
                    backend.log.info(message)
                    raise EOFError(message)

        return InterruptStreamingCallbackHandler()
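For reviewers who want to see the caller's side, here is a minimal sketch of what the `backend` object might look like and how the interruption surfaces. It assumes only what the handler above reads (`backend.streaming` and `backend.log`); the `Backend` class, `stop_streaming`, and `check_interrupt` are hypothetical and stand in for the real application and for the body of `on_llm_new_token`, so the snippet runs without langchain installed.

```python
import logging
from dataclasses import dataclass, field
from typing import List


@dataclass
class Backend:
    """Hypothetical backend exposing only what the handler above reads."""

    streaming: bool = True
    log: logging.Logger = field(default_factory=lambda: logging.getLogger("backend"))

    def stop_streaming(self) -> None:
        # Another part of the application (for example a UI action) would call this.
        self.streaming = False


def check_interrupt(backend: Backend) -> None:
    """Stand-in for the body of on_llm_new_token in the handler above."""
    if not backend.streaming:
        message = "Request to interrupt streaming"
        backend.log.info(message)
        raise EOFError(message)


backend = Backend()
partial: List[str] = []
interrupted = False
try:
    for i, token in enumerate(["Hello", ",", " world", "!"]):
        if i == 2:
            backend.stop_streaming()  # simulate a stop request mid-stream
        check_interrupt(backend)
        partial.append(token)
except EOFError:
    interrupted = True  # the caller keeps whatever was streamed so far
```

Catching `EOFError` at the call site keeps the partial output available, which is one reason to signal the interruption with an exception instead of silently dropping tokens.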
I have not yet investigated the _agenerate method or any other LLMs; I would like some feedback on, or confirmation of, the approach first.