
add streaming capability to FakeMessagesListChatModel

[Open] thehunmonkgroup opened this issue 2 years ago • 6 comments

Attempt to add streaming capability to FakeMessagesListChatModel.

This is an extension of the work from #10152 and #10116

Not sure I've implemented this correctly, but it is working in my testing.

@hwchase17 happy to make any adjustments if the PR needs improvement.

thehunmonkgroup avatar Sep 04 '23 23:09 thehunmonkgroup


@hwchase17 any feedback on this? I'm using this new feature in all my testing and it's working very well.

thehunmonkgroup avatar Sep 10 '23 16:09 thehunmonkgroup

Also sorry for the delay on this one!

efriis avatar Nov 29 '23 02:11 efriis

This is cool! It needs to tolerate both types though.

My suggestion would be:

if response is a list:

  • concat them together into a message for generate
  • stream each chunk for stream

if response is a message:

  • return the message for generate
  • stream a single big chunk message for stream, like some of the other runnables do
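The suggested dispatch can be sketched in plain Python. This is a hypothetical, dependency-free illustration, not the real `FakeMessagesListChatModel` code; messages are modeled as strings, and `generate`/`stream` stand in for the model's internal methods.

```python
# Hypothetical sketch of the suggested behavior: both a single message
# and a list of chunks should work for generate and for stream.
# (Strings stand in for message objects here.)

def generate(response):
    """Non-streaming: a list of chunks is concatenated into one message."""
    if isinstance(response, list):
        return "".join(response)
    return response  # already a full message


def stream(response):
    """Streaming: a list yields each chunk; a single message is
    streamed as one big chunk, like some of the other runnables do."""
    if isinstance(response, list):
        yield from response
    else:
        yield response
```

With this shape, `generate(["Hel", "lo"])` and `generate("Hello")` both produce a full message, while `stream` yields two chunks for the list and one big chunk for the plain message.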

I'm not sure I understand the limitation you're pointing to. The examples I offered in other comments work well in my testing; if that's not what you mean, can you offer a specific example of a case my code doesn't handle?

thehunmonkgroup avatar Nov 29 '23 15:11 thehunmonkgroup

@efriis bump -- this was one of the PRs that accidentally got closed.

I handled the merge conflicts due to the relocation of the class into langchain_community and re-tested it -- still works flawlessly on my rather complicated test cases.

I believe I've addressed your concerns above, but if not, please do clarify -- would love to finally get this merged!

thehunmonkgroup avatar Feb 08 '24 00:02 thehunmonkgroup

@efriis bump again. Fixed the type-checking and linting errors, and ran this through my existing test suite, as far as I can tell all cases are now handled. What else do we need to get this in?

thehunmonkgroup avatar May 15 '24 19:05 thehunmonkgroup

@thehunmonkgroup thanks for this, could you add some tests for the new functionality? Namely calling both invoke and stream on this fake chat model with all possible types of responses:

  • list of message
  • list of str
  • list of list of str
  • list of list of message
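A test along the requested lines might look like the following. This is a hypothetical stand-in, not the real `langchain_community` API: `FakeModel`, its `responses` attribute, and its `invoke`/`stream` methods are illustrative names, and messages are modeled as plain strings.

```python
# Hypothetical, dependency-free stand-in for the fake chat model.
# Every response shape should work for both invoke and stream.
class FakeModel:
    def __init__(self, responses):
        self.responses = responses  # cycled through, one per call
        self.i = 0

    def _next(self):
        response = self.responses[self.i % len(self.responses)]
        self.i += 1
        return response

    def invoke(self):
        # a list of chunks is joined into one full response
        response = self._next()
        return "".join(response) if isinstance(response, list) else response

    def stream(self):
        # a plain response is streamed as a single chunk
        response = self._next()
        yield from (response if isinstance(response, list) else [response])


# list of str: works for invoke and for stream
flat = FakeModel(["hello", "world"])
assert flat.invoke() == "hello"
assert list(flat.stream()) == ["world"]

# list of list of str: works for invoke and for stream
nested = FakeModel([["he", "llo"], ["wor", "ld"]])
assert nested.invoke() == "hello"
assert list(nested.stream()) == ["wor", "ld"]
```

The message variants would follow the same pattern with `AIMessage` objects in place of the strings.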

nfcampos avatar Jun 18 '24 15:06 nfcampos

@nfcampos requested tests added.

Please, for the love of all things holy, land this PR ;) This has been a real exercise in stamina...

thehunmonkgroup avatar Jun 22 '24 23:06 thehunmonkgroup

@thehunmonkgroup hi, it seems your code changes mean some response types only work for streaming, and some types only work for invoke. We can't have that; all types need to work for all methods. I've added a failing unit test that shows this.

nfcampos avatar Jun 24 '24 14:06 nfcampos

@nfcampos if you really want to do that, then we need to make some decisions about how. I've spent a LOT of time on this PR, and I don't want to do any more work unless I have a clear picture of what is needed to get this committed.

The entire purpose of having the two different types (lists, and lists of lists) is to support multiple responses for both streaming chunks and non-streaming full responses.

In a testing environment, it doesn't really make sense to test non-streaming with lists of lists, and streaming with lists. But if you really want to support all types for both streaming and non-streaming, then we need to decide how we'd transform lists of lists if they were used for non-streaming, and lists if they were used for streaming.

I would propose this:

  • For non-streaming requests, if a response is a list of lists, then the inner list would have all elements concatenated into a single response string.
  • For streaming requests, if a response is a list, then each list element would be wrapped inside a list.

This would enable a regular list to be passed for streaming, where each response would stream as a single chunk, and a list of lists to be passed for non-streaming, where all chunks would be concatenated into a single response.
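The proposed normalization can be sketched as follows. This is a hypothetical illustration (the function and variable names are mine, not the PR's); each response is either a plain string or a list of chunk strings, and strings stand in for messages.

```python
# Hypothetical sketch of the proposed normalization. A "response" is
# either a plain string or a list of chunk strings.
def normalize(responses, streaming):
    normalized = []
    for response in responses:
        if streaming:
            # a plain response becomes a single big chunk
            normalized.append(response if isinstance(response, list) else [response])
        else:
            # a chunk list is concatenated into one full response
            normalized.append("".join(response) if isinstance(response, list) else response)
    return normalized


print(normalize(["Hi", ["Hel", "lo"]], streaming=False))  # ['Hi', 'Hello']
print(normalize(["Hi", ["Hel", "lo"]], streaming=True))   # [['Hi'], ['Hel', 'lo']]
```

Either response shape is then usable by both code paths: non-streaming sees full strings, streaming sees lists of chunks.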

Will this approach be sufficient to meet your requirements?

thehunmonkgroup avatar Jun 24 '24 21:06 thehunmonkgroup

After playing with this a bit, it actually has some pretty big outstanding issues around what happens when a "streaming expected" response gets invoked, or a "non-streaming expected" response gets streamed.

Closing for now. The tests and new types from code review should be kept as they are now if someone reopens this.

efriis avatar Aug 29 '24 20:08 efriis