Ben Cox
It would be great to be able to mix the benefit of `RecursiveCharacterTextSplitter` (i.e. splitting by useful things like sentences and paragraphs) with custom tokenisers (i.e. the knowledge that a...
Thanks a lot @bhperry, it wasn't clear to me that it worked for the subclasses like RCTS so I'm very happy to have that confirmed.
@bhperry Are there recommended settings for the splitter or tokenizer when using `RecursiveCharacterTextSplitter` with a custom length function? The calculation isn't coming out...
Ah, nice! Little tweak, just need to count the list length:

```python
def length_function(text):
    return len(tokenizer.encode(text, add_special_tokens=False))
```

Thank you!
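For anyone landing here later, here is a rough, self-contained sketch of why the length function has to return a token count and why `add_special_tokens=False` matters. Everything apart from `length_function` is an assumption made for illustration: `ToyTokenizer` is a hypothetical stand-in for a real tokenizer (one token per whitespace-separated word), and `split_recursive` is a simplified toy splitter, not LangChain's actual implementation.

```python
from typing import Callable, List


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer: one token per word."""

    def encode(self, text: str, add_special_tokens: bool = True) -> List[int]:
        ids = [len(word) for word in text.split()]  # fake token ids
        if add_special_tokens:
            ids = [0] + ids + [1]  # fake start/end markers inflate the count
        return ids


tokenizer = ToyTokenizer()


def length_function(text: str) -> int:
    # Count tokens, not characters, and leave out the special tokens.
    return len(tokenizer.encode(text, add_special_tokens=False))


def split_recursive(
    text: str,
    separators: List[str],
    chunk_size: int,
    length_fn: Callable[[str], int],
) -> List[str]:
    """Toy splitter: try the coarsest separator first, then finer ones.

    Separators are dropped at the points where the text is actually split.
    """
    if length_fn(text) <= chunk_size or not separators:
        return [text]
    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if length_fn(candidate) <= chunk_size:
            current = candidate  # piece still fits in the running chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if length_fn(piece) <= chunk_size:
            current = piece
        else:
            # Piece alone is too long: retry with the finer separators.
            chunks.extend(split_recursive(piece, finer, chunk_size, length_fn))
    if current:
        chunks.append(current)
    return chunks


text = "one two three four.\n\nfive six. seven eight nine ten eleven."
chunks = split_recursive(text, ["\n\n", ". ", " "], 4, length_function)
for chunk in chunks:
    print(length_function(chunk), repr(chunk))
```

With the toy tokenizer, `encode("one two")` returns four ids when special tokens are included but only two without them, which is the kind of mismatch that makes chunk sizes come out wrong. With the real library, the same `length_function` would be passed to the splitter through its `length_function` argument, with `chunk_size` then measured in tokens.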