langchain
Fix SQLAlchemy truncating text when it is too big
Fixes SQLAlchemy truncating the result when a large text column contains many characters.
SQLAlchemy truncates long column values if you convert a Row or Sequence to a string directly.
For comparison:
- Before:
[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]
- After:
[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]
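One way to get the "After" output is to convert each row to a plain tuple before stringifying, so the row's own truncating repr is never used. The following is only a minimal sketch, not necessarily the code in this PR; `rows_to_str` is a hypothetical helper name, and it assumes the rows come from something like `result.fetchall()` and are tuple-like.

```python
from typing import Any, Iterable, Sequence


def rows_to_str(rows: Iterable[Sequence[Any]]) -> str:
    """Stringify query rows without relying on the row object's own repr.

    Hypothetical helper: each row is copied into a plain tuple first, so the
    standard tuple/str repr is used and long values are printed in full.
    """
    return str([tuple(row) for row in rows])


# Plain tuples stand in for database rows here; no database is needed.
long_bio = "That is my Bio " * 20
text = rows_to_str([("Harrison", long_bio)])
print(len(text))
```

The trade-off is the one raised in the discussion below: the full value now reaches the prompt, so very large columns consume more of the model's context.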
Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
I'm not sure who to tag for chains, maybe @vowelparrot ?
So on one hand this seems reasonable.
On the other hand, the truncation may, weirdly, actually be desirable: LLMs have a limited context length, so long strings are tough to make work, and this could actually have been saving some stuff...
Sure it could be saving usage, but for me, it's sometimes hallucinating and sometimes it returns the action JSON, like here:
Maybe it's because the truncation happens in the middle of the text and cuts words. For my scenario I do need those values; I know they are big, but it's a requirement I have. Do you think we should add an option like max_char that is applied to every column, where the truncation happens at the end, without cutting a word, and adds "..."? That way it's a conscious decision and not a hidden one.
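A minimal sketch of the proposal above, assuming a per-value character limit and a "..." suffix. The names `truncate_word`, `length`, and `suffix` are illustrative assumptions, not confirmed names from the merged code.

```python
from typing import Any


def truncate_word(content: Any, *, length: int, suffix: str = "...") -> Any:
    """Truncate a string at the end, on a word boundary, and append a suffix.

    Hypothetical helper: non-string values and non-positive limits are
    returned unchanged. For limits at least as long as the suffix, the
    result (suffix included) is no longer than `length`.
    """
    if not isinstance(content, str) or length <= 0:
        return content
    if len(content) <= length:
        return content
    # Reserve room for the suffix, then drop the trailing partial word.
    cut = max(length - len(suffix), 0)
    return content[:cut].rsplit(" ", 1)[0] + suffix


print(truncate_word("That is my Bio That is my Bio", length=12))  # That is...
```

Applying it to every column of every row keeps the cut at the end of each value instead of the middle, and makes the limit an explicit setting.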
@hwchase17 I'm going to add this truncate feature and make it default to the same limit as SQLAlchemy, so we keep the current behavior but the user can decide whether to change it. One question: is there a good place to put utility functions? I'd like to put the string truncate function somewhere it can be reused, utils.py perhaps?
@hwchase17 I've added options to change the "truncation" size and added defaults, so the behavior will be almost the same as before, but cleaner and without truncating the chain. Let me know if you want any other changes.
@eyurtsev I think I've covered it all, let me know if we need more changes
Hey @eyurtsev, do you want any other fixes in this PR? Thanks in advance.
Thanks for pinging me! @wsantos