langchain

Fix SQLAlchemy truncating text when it is too big

Open wsantos opened this issue 2 years ago • 3 comments

Fixes SQLAlchemy truncating the result when a query returns a large text column with many characters.

SQLAlchemy truncates long column values when a Row or Sequence is converted to a string directly.

For comparison:

  • Before: [('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]

  • After: [('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]
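A minimal sketch of the effect described above, using only the standard library. `TruncatingRow` is a hypothetical stand-in for a row object whose `repr` shortens long strings; it is not SQLAlchemy's class, and the 20-character limit is an assumption chosen to keep the demo small. It only illustrates why converting each row to a plain tuple before calling `str()` keeps the full text.

```python
class TruncatingRow(tuple):
    """Hypothetical stand-in for a database row whose repr shortens long strings."""

    MAX_CHARS = 20  # assumed limit, deliberately small for the demo

    def __repr__(self):
        parts = []
        for value in self:
            rep = repr(value)
            if len(rep) > self.MAX_CHARS:
                half = self.MAX_CHARS // 2
                cut = len(rep) - self.MAX_CHARS
                # Keep the head and tail, drop the middle
                rep = f"{rep[:half]} ... ({cut} characters truncated) ... {rep[-half:]}"
            parts.append(rep)
        return "(" + ", ".join(parts) + ")"


rows = [TruncatingRow(("Harrison", "That is my Bio " * 4))]

# str() on a list calls repr() on each element, so the custom repr is used
truncated = str(rows)

# Plain tuples use the built-in repr, which keeps every character
full = str([tuple(row) for row in rows])
```

Here `truncated` contains the "characters truncated" marker, while `full` contains the complete bio text.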

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

I'm not sure who to tag for chains, maybe @vowelparrot ?

wsantos avatar May 24 '23 18:05 wsantos

On one hand, this seems reasonable.

On the other hand, the truncation may, weirdly, be desirable: LLMs have a limited context length, so long strings are tough to make work, and this could actually have been saving some context.

Sure it could be saving usage, but for me, it's sometimes hallucinating and sometimes it returns the action JSON, like here:

image

Maybe it's because the truncation happens in the middle of the text and cuts words. For my scenario I do need those values; I know they are big, but it's a requirement I have. Do you think we should add an option like max_char that is applied to every column, where the truncation happens at the end, without cutting a word, and adds "..."? That way it's a conscious decision and not a hidden one.
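A rough sketch of what such an option could look like, not the code from this PR. The function name `truncate_text`, its parameters, and the default of 300 characters are all assumptions made for illustration. It truncates at the end, backs up to a word boundary, and appends "...".

```python
def truncate_text(text: str, max_chars: int = 300, suffix: str = "...") -> str:
    """Shorten text to at most max_chars, cutting at the end on a word boundary.

    Hypothetical helper: the name, signature, and default are assumptions.
    """
    # In this sketch a non-positive limit disables truncation
    if max_chars <= 0 or len(text) <= max_chars:
        return text

    budget = max_chars - len(suffix)  # room left for real text
    if budget <= 0:
        return suffix[:max_chars]

    head = text[:budget]
    # If the cut lands inside a word, back up to the previous space
    if not text[budget].isspace() and " " in head:
        head = head[: head.rfind(" ")]
    return head.rstrip() + suffix


bio = "That is my Bio " * 20
short_bio = truncate_text(bio, max_chars=30)
unchanged = truncate_text("short value")
```

With these inputs, `short_bio` ends in a whole word followed by "...", and `unchanged` is returned as-is because it fits within the limit.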

wsantos avatar May 25 '23 13:05 wsantos

@hwchase17 I'm going to add this truncate feature and make it default to the same limit as in SQLAlchemy, so we keep the current behavior but the user can decide whether to change it. One question: is there a good place to put utility functions? I'd like to put the string truncate function somewhere it can be reused, utils.py perhaps?

wsantos avatar May 25 '23 15:05 wsantos

@hwchase17 I've added options to change the "truncation" size and added defaults, so the behavior will be almost the same as before, but cleaner and without truncating the chain. Let me know if you want any other changes.

wsantos avatar May 26 '23 01:05 wsantos

@eyurtsev I think I've covered it all, let me know if we need more changes

wsantos avatar May 28 '23 18:05 wsantos

Hey @eyurtsev, do you want any other fixes in this PR? Thanks in advance.

wsantos avatar Jun 01 '23 23:06 wsantos

Thanks for pinging me! @wsantos

eyurtsev avatar Jun 02 '23 01:06 eyurtsev